Can quantum probability provide a new direction for cognitive modeling?

Emmanuel M. Pothos; Jerome R. Busemeyer

doi:10.1017/S0140525X12001525

Published online by Cambridge University Press: 14 May 2013

Emmanuel M. Pothos: Affiliation:
Department of Psychology, City University London, London EC1V 0HB, United Kingdom. emmanuel.pothos.1@city.ac.uk http://www.staff.city.ac.uk/~sbbh932/
Jerome R. Busemeyer: Affiliation:
Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47405. jbusemey@indiana.edu http://mypage.iu.edu/~jbusemey/home.html

Abstract

Classical (Bayesian) probability (CP) theory has led to an influential research tradition for modeling cognitive processes. Cognitive scientists have been trained to work with CP principles for so long that it is hard even to imagine alternative ways to formalize probabilities. However, in physics, quantum probability (QP) theory has been the dominant probabilistic approach for nearly 100 years. Could QP theory provide us with any advantages in cognitive modeling as well? Note first that both CP and QP theory share the fundamental assumption that it is possible to model cognition on the basis of formal, probabilistic principles. But why consider a QP approach? The answers are that (1) there are many well-established empirical findings (e.g., from the influential Tversky, Kahneman research tradition) that are hard to reconcile with CP principles; and (2) these same findings have natural and straightforward explanations with quantum principles. In QP theory, probabilistic assessment is often strongly context- and order-dependent, individual states can be superposition states (that are impossible to associate with specific values), and composite systems can be entangled (they cannot be decomposed into their subsystems). All these characteristics appear perplexing from a classical perspective. However, our thesis is that they provide a more accurate and powerful account of certain cognitive processes. We first introduce QP theory and illustrate its application with psychological examples. We then review empirical findings that motivate the use of quantum theory in cognitive theory, but also discuss ways in which QP and CP theories converge. Finally, we consider the implications of a QP theory approach to cognition for human rationality.

Keywords

category membership; classical probability theory; conjunction effect; decision making; disjunction effect; interference effects; judgment; quantum probability theory; rationality; similarity ratings

Type: Target Article
Information: Behavioral and Brain Sciences, Volume 36, Issue 3, June 2013, pp. 255-274

DOI: https://doi.org/10.1017/S0140525X12001525
Copyright: Copyright © Cambridge University Press 2013

1. Preliminary issues

1.1. Why move toward quantum probability theory?

In this article we evaluate the potential of quantum probability (QP) theory for modeling cognitive processes. What is the motivation for employing QP theory in cognitive modeling? Does the use of QP theory offer the promise of any unique insights or predictions regarding cognition? Also, what do quantum models imply regarding the nature of human rationality? In other words, is there anything to be gained by seeking to develop cognitive models based on QP theory? Especially over the last decade, there has been growing interest in such models, encompassing publications in major journals, special issues, dedicated workshops, and a comprehensive book (Busemeyer & Bruza 2012). Our strategy in this article is to briefly introduce QP theory, summarize progress with selected QP models, and motivate answers to the abovementioned questions. We note that this article is not about the application of quantum physics to brain physiology. This is a controversial issue (Hameroff 2007; Litt et al. 2006) about which we are agnostic. Rather, we are interested in QP theory as a mathematical framework for cognitive modeling. QP theory is potentially relevant in any behavioral situation that involves uncertainty. For example, Moore (2002) reported that the likelihood of a “yes” response to the questions “Is Gore honest?” and “Is Clinton honest?” depends on the relative order of the questions. We will subsequently discuss how QP principles can provide a simple and intuitive account for this and a range of other findings.

QP theory is a formal framework for assigning probabilities to events (Hughes 1989; Isham 1989). QP theory can be distinguished from quantum mechanics, the latter being a theory of physical phenomena. For the present purposes, it is sufficient to consider QP theory as the abstract foundation of quantum mechanics not specifically tied to physics (for more refined characterizations see, e.g., Aerts & Gabora 2005b; Atmanspacher et al. 2002; Khrennikov 2010; Redei & Summers 2007). The development of quantum theory has been the result of intense effort from some of the greatest scientists of all time, over a period of more than 30 years. The idea of “quantum” was first proposed by Planck in the early 1900s and advanced by Einstein. Contributions from Bohr, Born, Heisenberg, and Schrödinger all led to the eventual formalization of QP theory by von Neumann and Dirac in the 1930s. Part of the appeal of using QP theory in cognition relates to confidence in the robustness of its mathematics. Few other theoretical frameworks in any science have been scrutinized so intensely, led to such surprising predictions, and, also, changed human existence as much as QP theory (when applied to the physical world; quantum mechanics has enabled the development of, e.g., the transistor, and, therefore, the microchip and the laser).

QP theory is, in principle, applicable not just in physics, but in any science in which there is a need to formalize uncertainty. For example, researchers have been pursuing applications in areas as diverse as economics (Baaquie 2004) and information theory (e.g., Grover 1997; Nielsen & Chuang 2000). The idea of using quantum theory in psychology has existed for nearly 100 years: Bohr, one of the founding fathers of quantum theory, was known to believe that aspects of quantum theory could provide insight about cognitive process (Wang et al., in press). However, Bohr never made any attempt to provide a formal cognitive model based on QP theory, and such models have started appearing only fairly recently (Aerts & Aerts 1995; Aerts & Gabora 2005b; Atmanspacher et al. 2004; Blutner 2009; Bordley 1998; Bruza et al. 2009; Busemeyer et al. 2006b; Busemeyer et al. 2011; Conte et al. 2009; Khrennikov 2010; Lambert-Mogiliansky et al. 2009; Pothos & Busemeyer 2009; Yukalov & Sornette 2010). But what are the features of quantum theory that make it a promising framework for understanding cognition? It seems essential to address this question before expecting readers to invest the time for understanding the (relatively) new mathematics of QP theory.

Superposition, entanglement, incompatibility, and interference are all related aspects of QP theory, which endow it with a unique character. Consider a cognitive system, which concerns the cognitive representation of some information about the world (e.g., the story about the hypothetical Linda, used in Tversky and Kahneman's [1983] famous experiment; sect. 3.1 in this article). Questions posed to such systems (“Is Linda feminist?”) can have different outcomes (e.g., “Yes, Linda is feminist”). Superposition has to do with the nature of uncertainty about question outcomes. The classical notion of uncertainty concerns our lack of knowledge about the state of the system that determines question outcomes. In QP theory, there is a deeper notion of uncertainty that arises when a cognitive system is in a superposition among different possible outcomes. Such a state is not consistent with any single possible outcome (that this is the case is not obvious; this remarkable property follows from the Kochen–Specker theorem). Rather, there is a potentiality (Isham 1989, p. 153) for different possible outcomes, and if the cognitive system evolves in time, so does the potentiality for each possibility. In quantum physics, superposition appears puzzling: what does it mean for a particle to have a potentiality for different positions, without it actually existing at any particular position? By contrast, in psychology, superposition appears an intuitive way to characterize the fuzziness (the conflict, ambiguity, and ambivalence) of everyday thought.

Entanglement concerns the compositionality of complex cognitive systems. QP theory allows the specification of entangled systems for which it is not possible to specify a joint probability distribution from the probability distributions of the constituent parts. In other words, in entangled composite systems, a change in one constituent part of the system necessitates changes in another part. This can lead to interdependencies among the constituent parts not possible in classical theory, and surprising predictions, especially when the parts are spatially or temporally separated.

In quantum theory, there is a fundamental distinction between compatible and incompatible questions for a cognitive system. Note that the terms compatible and incompatible have a specific, technical meaning in QP theory, which should not be confused with their lay use in language. If two questions, A and B, about a system are compatible, it is always possible to define the conjunction between A and B. In classical systems, it is assumed by default that all questions are compatible. Therefore, for example, the conjunctive question “are A and B true” always has a yes or no answer and the order between questions A and B in the conjunction does not matter. By contrast, in QP theory, if two questions A and B are incompatible, it is impossible to define a single question regarding their conjunction. This is because an answer to question A implies a superposition state regarding question B (e.g., if A is true at a time point, then B can be neither true nor false at the same time point). Instead, QP defines conjunction between incompatible questions in a sequential way, such as “A and then B.” Crucially, the outcome of question A can affect the consideration of question B, so that interference and order effects can arise. This is a novel way to think of probability, and one that is key to some of the most puzzling predictions of quantum physics. For example, knowledge of the position of a particle imposes uncertainty on its momentum. However, incompatibility may make more sense when considering cognitive systems and, in fact, it was first introduced in psychology. The physicist Niels Bohr borrowed the notion of incompatibility from the work of William James. For example, answering one attitude question can interfere with answers to subsequent questions (if they are incompatible), so that their relative order becomes important. 
Human judgment and preference often display order and context effects, and we shall argue that in such cases quantum theory provides a natural explanation of cognitive process.

1.2. Why move away from existing formalisms?

By now, we hope we have convinced readers that QP theory has certain unique properties, whose potential for cognitive modeling appears, at the very least, intriguing. For many researchers, the inspiration for applying quantum theory in cognitive modeling has been the widespread interest in cognitive models based on CP theory (Anderson 1991; Griffiths et al. 2010; Oaksford & Chater 2007; Tenenbaum et al. 2011). Both CP and QP theories are formal probabilistic frameworks. They are founded on different axioms (the Kolmogorov and Dirac/von Neumann axioms, respectively) and, therefore, often produce divergent predictions regarding the assignment of probabilities to events. However, they share profound commonalities as well, such as the central objective of quantifying uncertainty, and similar mechanisms for manipulating probabilities. Regarding cognitive modeling, quantum and classical theorists share the fundamental assumption that human cognition is best understood within a formal probabilistic framework.

As Griffiths et al. (2010, p. 357) note, “probabilistic models of cognition pursue a top-down or ‘function-first’ strategy, beginning with abstract principles that allow agents to solve problems posed by the world … and then attempting to reduce these principles to psychological and neural processes.” That is, the application of CP theory to cognition requires a scientist to create hypotheses regarding cognitive representations and inductive biases and, therefore, elucidate the fundamental questions of how and why a cognitive problem is successfully addressed. In terms of Marr's (1982) analysis, CP models are typically aimed at the computational and algorithmic levels, although perhaps it is more accurate to characterize them as top down or function first (as Griffiths et al. 2010, p. 357).

We can recognize the advantage of CP cognitive models in at least two ways. First, in a CP cognitive model, the principles that are invoked (the axioms of CP theory) work as a logical “team” and always deductively constrain each other. By contrast, alternative cognitive modeling approaches (e.g., based on heuristics) work “alone” and therefore are more likely to fall foul of arbitrariness problems, whereby it is possible to manipulate each principle in the model independently of other principles. Second, neuroscience methods and computational bottom-up approaches are typically unable to provide much insight into the fundamental why and how questions of cognitive process (Griffiths et al. 2010). Overall, there are compelling reasons for seeking to understand the mind with CP theory. The intention of QP cognitive models is aligned with that of CP models. Therefore, it makes sense to present QP theory side by side with CP theory, so that readers can appreciate their commonalities and differences.

A related key issue is this: if CP theory is so successful and elegant (at least, in cognitive applications), why seek an alternative? Moreover, part of the motivation for using CP theory in cognitive modeling is the strong intuition supporting many CP principles. For example, the probability of A and B is the same as the probability of B and A (Prob(A&B)=Prob(B&A)). How can it be possible that the probability of a conjunction depends upon the order of the constituents? Indeed, as Laplace (1816, cited in Perfors et al. 2011) said, “probability theory is nothing but common sense reduced to calculation.” By contrast, QP theory is a paradigm notorious for its conceptual difficulties (in the 1960s, Feynman famously said “I think I can safely say that nobody understands quantum mechanics”). A classical theorist might argue that, when it comes to modeling psychological intuition, we should seek to apply a computational framework that is as intuitive as possible (CP theory) and avoid the one that can lead to puzzling and, superficially at least, counterintuitive predictions (QP theory).

Human judgment, however, often goes directly against CP principles. A large body of evidence has accumulated to this effect, mostly associated with the influential research program of Tversky and Kahneman (Kahneman et al. 1982; Tversky & Kahneman 1973; 1974; Tversky & Shafir 1992). Many of these findings relate to order/context effects, violations of the law of total probability (which is fundamental to Bayesian modeling), and failures of compositionality. Therefore, if we are to understand the intuition behind human judgment in such situations, we have to look for an alternative probabilistic framework. Quantum theory was originally developed so as to model analogous effects in the physical world and therefore, perhaps, it can offer insight into those aspects of human judgment that seem paradoxical from a classical perspective. This situation is entirely analogous to that faced by physicists early in the last century. On the one hand, there was the strong intuition from classical models (e.g., Newtonian physics, classical electromagnetism). On the other hand, there were compelling empirical findings that were resisting explanation on the basis of classical formalisms. Therefore, physicists had to turn to quantum theory, and so paved the way for some of the most impressive scientific achievements.

It is important to note that other cognitive theories embody order/context effects or interference effects or other quantum-like components. For example, a central aspect of the gestalt theory of perception concerns how the dynamic relationships among the parts of a distal layout together determine the conscious experience corresponding to the image. Query theory (Johnson et al. 2007) is a proposal for how value is constructed through a series of (internal) queries, and has been used to explain the endowment effect in economic choice. In query theory, value is constructed, rather than read off, and also different queries can interfere with each other, so that query order matters. In configural weight models (e.g., Birnbaum 2008) we also encounter the idea that, in evaluating gambles, the context of a particular probability-consequence branch (e.g., its rank order) will affect its weight. The theory also allows weight changes depending upon the observer perspective (e.g., buyer vs. seller). Anderson's (1971) integration theory is a family of models for how a person integrates information from several sources, and also incorporates a dependence on order. Fuzzy trace theory (Reyna 2008; Reyna & Brainerd 1995) is based on a distinction between verbatim and gist information, the latter corresponding to the general semantic qualities of an event. Gist information can be strongly context and observer dependent and this has led fuzzy trace theory to some surprising predictions (e.g., Brainerd et al. 2008).

This brief overview shows that there is a diverse range of cognitive models that include a role for context or order, and a comprehensive comparison is not practical here. However, when comparisons have been made, the results favored quantum theory (e.g., averaging theory was shown to be inferior to a matched quantum model, Trueblood & Busemeyer 2011). In some other cases, we can view QP theory as a way to formalize previously informal conceptualizations (e.g., for query theory and the fuzzy trace theory).

Overall, there is a fair degree of flexibility in the particular specification of computational frameworks in cognitive modeling. In the case of CP and QP models, this flexibility is tempered by the requirement of adherence to the axioms in each theory: all specific models have to be consistent with these axioms. This is exactly what makes CP (and QP) models appealing to many theorists and why, as noted, in seeking to understand the unique features of QP theory, it is most natural to compare it with CP theory.

In sum, a central aspect of this article is the debate about whether psychologists should explore the utility of quantum theory in cognitive theory, or whether the existing formalisms are (mostly) adequate and a different paradigm is not necessary. Note that we do not develop an argument that CP theory is unsuitable for cognitive modeling; it clearly is suitable, in many cases. Moreover, as will be discussed, CP and QP processes sometimes converge in their predictions. Rather, what is at stake is whether there are situations in which the distinctive features of QP theory provide a more accurate and elegant explanation for empirical data. In the next section we provide a brief consideration of the basic mechanisms in QP theory. Perhaps contrary to common expectation, the relevant mathematics is simple and mostly based on geometry and linear algebra. We next consider empirical results that appear puzzling from the perspective of CP theory, but can naturally be accommodated within QP models. Finally, we discuss the implications of QP theory for understanding rationality.

2. Basic assumptions in QP theory and psychological motivation

2.1. The outcome space

CP theory is a set-theoretic way to assign probabilities to the possible outcomes of a question. First, a sample space is defined, in which specific outcomes about a question are subsets of this sample space. Then, a probability measure is postulated, which assigns probabilities to disjoint outcomes in an additive manner (Kolmogorov 1933/1950). The formulation is different in QP theory, which is a geometric theory of assigning probabilities to outcomes (Isham 1989). A vector space (called a Hilbert space) is defined, in which possible outcomes are represented as subspaces of this vector space. Note that our use of the terms questions and outcomes is meant to imply the technical QP terms observables and propositions.

A vector space represents all possible outcomes for questions we could ask about a system of interest. For example, consider a hypothetical person and the general question of that person's emotional state. Then, one-dimensional subspaces (called rays) in the vector space would correspond to the most elementary emotions possible. The number of unique elementary emotions and their relation to each other determine the overall dimensionality of the vector space. Also, more general emotions, such as happiness, would be represented by subspaces of higher dimensionality. In Figure 1a, we consider the question of whether a hypothetical person is happy or not. However, because it is hard to picture high-dimensional subspaces, for practical reasons we assume that the outcomes of the happiness question are one-dimensional subspaces. Therefore, one ray corresponds to the person definitely being happy and another one to that person definitely being unhappy.

Figure 1. An illustration of basic processes in QP theory. In Figure 1b, all vectors are co-planar, and the figure is a two-dimensional one. In Figure 1c, the three vectors “Happy, employed,” “Happy, unemployed,” and “Unhappy, employed” are all orthogonal to each other, so that the figure is a three-dimensional one. (The fourth dimension, “Unhappy, unemployed,” is not shown.)

Our initial knowledge of the hypothetical person is indicated by the state vector, a unit length vector, denoted as |Ψ〉 (the bracket notation for a vector is called the Dirac notation). In psychological applications, it often refers to the state of mind, perhaps after reading some instructions for a psychological task. More formally, the state vector embodies all our current knowledge of the cognitive system under consideration. Using the simple vector space in Figure 1a, we can write |Ψ〉 = a|happy〉 + b|unhappy〉. Any vector |Ψ〉 can be expressed as a linear combination of the |happy〉 and |unhappy〉 vectors, so that these two vectors form a basis for the two-dimensional space we have employed. The a and b constants are called amplitudes and they reflect the components of the state vector along the different basis vectors.

To determine the probability of the answer happy, we need to project the state represented by |Ψ〉 onto the subspace for “happy” spanned by the vector |happy〉. This is done using what is called a projector, which takes the vector |Ψ〉 and lays it down on the subspace spanned by |happy〉; this projector can be denoted as P_happy. The projection to the |happy〉 subspace is denoted by P_happy|Ψ〉 = a|happy〉. (Here and elsewhere we will slightly elaborate on some of the basic definitions in the Appendix.) Then, the probability that the person is happy is equal to the squared length of the projection, ||P_happy|Ψ〉||². That is, the probability that the person has a particular property depends upon the projection of |Ψ〉 onto the subspace corresponding to the property. In our simple example, this probability reduces to ||P_happy|Ψ〉||² = |a|², which is the squared magnitude of the amplitude of the state vector along the |happy〉 basis vector. The idea that projection can be employed in psychology to model the match between representations has been explored before (Sloman 1993), and the QP cognitive program can be seen as a way to generalize these early ideas. Also, note that a remarkable mathematical result, Gleason's theorem, shows that the QP way for assigning probabilities to subspaces is unique (e.g., Isham 1989, p. 210). It is not possible to devise another scheme for assigning numbers to subspaces that satisfies the basic requirements for an additive probability measure (i.e., that the probabilities assigned to a set of mutually exclusive and exhaustive outcomes are individually between 0 and 1, and sum to 1).
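As a minimal numerical sketch of the projection rule described above, the following Python snippet (using NumPy) builds the two-dimensional happiness space of Figure 1a and computes the squared length of each projection. The amplitudes a = 0.8 and b = 0.6 and the helper names `projector` and `squared_length` are illustrative assumptions, not values or names taken from the article.

```python
import numpy as np

# Orthonormal basis for the two-dimensional happiness space (Figure 1a).
happy = np.array([1.0, 0.0])
unhappy = np.array([0.0, 1.0])

# Hypothetical amplitudes (an assumption), chosen so that |a|^2 + |b|^2 = 1.
a, b = 0.8, 0.6
psi = a * happy + b * unhappy  # state vector |Psi> = a|happy> + b|unhappy>


def projector(v):
    """Return the projector onto the ray spanned by the vector v."""
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())


def squared_length(v):
    """Return the squared length of a vector."""
    return float(np.linalg.norm(v) ** 2)


P_happy = projector(happy)
P_unhappy = projector(unhappy)

# Probability of each answer is the squared length of the projected state.
p_happy = squared_length(P_happy @ psi)      # reduces to |a|^2
p_unhappy = squared_length(P_unhappy @ psi)  # reduces to |b|^2

print(p_happy, p_unhappy, p_happy + p_unhappy)
```

Under these assumed amplitudes the two probabilities are |a|² and |b|² and they sum to 1, as required for mutually exclusive and exhaustive outcomes.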

An important feature of QP theory is the distinction between superposition and basis states. In the abovementioned example, after the person has decided that she is happy, then the state vector is |Ψ〉 = |happy〉; alternatively if she decides that she is unhappy, then |Ψ〉 = |unhappy〉. These are called basis states, with respect to the question about happiness, because the answer is certain when the state vector |Ψ〉 exactly coincides with one basis vector. Note that this explains why the subspaces corresponding to mutually exclusive outcomes (such as being happy and being unhappy) are at right angles to each other. If a person is definitely happy, i.e., |Ψ〉 = |happy〉, then we want a zero probability that the person is unhappy, which means a zero projection to the subspace for unhappy. This will only be the case if the happy, unhappy subspaces are orthogonal.

Before the decision, the state vector is a superposition of the two possibilities of happiness or unhappiness, so that |Ψ〉 = a|happy〉 + b|unhappy〉. The concept of superposition differs from the CP concept of a mixed state. According to the latter, the person is either exactly happy or exactly unhappy, but we don't know which, and so we assign some probability to each possibility. However, in QP theory, when a state vector is expressed as |Ψ〉 = a|happy〉 + b|unhappy〉 the person is neither happy nor unhappy. She is in an indefinite state regarding happiness, simultaneously entertaining both possibilities, but being uncommitted to either. In a superposition state, all we can talk about is the potential or tendency that the person will decide that she is happy or unhappy. Therefore, a decision, which causes a person to resolve the indefinite state regarding a question into a definite (basis) state, is not a simple read-out from a pre-existing definite state; instead, it is constructed from the current context and question (Aerts & Aerts 1995). Note that other researchers have suggested that the way of exploring the available premises can affect the eventual judgment, as much as the premises themselves, so that judgment is a constructive process (e.g., Johnson et al. 2007; Shafer & Tversky 1985). The interesting aspect of QP theory is that it fundamentally requires a constructive role for the process of disambiguating a superposition state (this relates to the Kochen–Specker theorem).

2.2. Compatibility

Suppose that we are interested in two questions, whether the person is happy or not, and also whether the person is employed or not. In this example, there are two outcomes with respect to the question about happiness, and two outcomes regarding employment. In CP theory, it is always possible to specify a single joint probability distribution over all four possible conjunctions of outcomes for happiness and employment, in a particular situation. (Griffiths [2003] calls this the unicity principle, and it is fundamental in CP theory.) By contrast, in QP theory, there is a key distinction between compatible and incompatible questions. For compatible questions, one can specify a joint probability function for all outcome combinations and in such cases the predictions of CP and QP theories converge (ignoring dynamics). For incompatible questions, it is impossible to determine the outcomes of all questions concurrently. Being certain about the outcome of one question induces an indefinite state regarding the outcomes of other, incompatible questions.

This absolutely crucial property of incompatibility is one of the characteristics of QP theory that differentiates it from CP theory. Psychologically, incompatibility between questions means that a cognitive agent cannot formulate a single thought for combinations of the corresponding outcomes. This is perhaps because that agent is not used to thinking about these outcomes together, for example, as in the case of asking whether Linda (Tversky & Kahneman 1983) can be both a bank teller and a feminist. Incompatible questions need to be assessed one after the other. A heuristic guide to whether some questions should be considered incompatible is whether clarifying one is expected to interfere with the evaluation of the other. Psychologically, the intuition is that considering one question alters our state of mind (the context), which in turn affects consideration of the second question. Therefore, probability assessment in QP theory can be (when we have incompatible questions) order and context dependent, which contrasts sharply with CP theory.

Whether some questions are considered compatible or incompatible is part of the analysis that specifies the corresponding cognitive model. Regarding the questions for happiness and employment for the hypothetical person, the modeler would need to commit a priori as to whether these are compatible or incompatible. We consider in turn the implications of each approach.

2.2.1. Incompatible questions

For outcomes corresponding to one-dimensional subspaces, incompatibility means that subspaces exist at nonorthogonal angles to each other, as is the case, for example, for the happy and employed subspaces in Figure 1b. Because of the simple relation we assume to exist between happiness and employment, all subspaces can be coplanar, so that the overall vector space is only two dimensional. Also, recall that certainty about a possible outcome in QP theory means that the state vector is contained within the subspace for the outcome. For example, if we are certain that the person is happy, then the state vector is aligned with the happy subspace. However, if this is the case, we can immediately see that we have to be somewhat uncertain about the person's employment (perhaps thinking about being happy makes the person a bit anxious about her job). Conversely, certainty about employment aligns the state vector with the subspace for employed, which makes the person somewhat uncertain about her happiness (perhaps her job is sometimes stressful). This is a manifestation of the famous Heisenberg uncertainty principle: Being clear on one question forces one to be unclear on another incompatible question.
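The trade-off between certainty on one question and uncertainty on an incompatible one can be illustrated with a small sketch in the spirit of Figure 1b. The angle `theta` between the happy and employed rays is a hypothetical value chosen for illustration (the article gives no number), and `prob` is an assumed helper name.

```python
import numpy as np

# Hypothetical angle between the happy and employed rays (an assumption).
theta = np.pi / 6

happy = np.array([1.0, 0.0])
employed = np.array([np.cos(theta), np.sin(theta)])  # nonorthogonal to happy


def prob(ray, state):
    """Probability of the outcome represented by a one-dimensional ray.

    For a unit vector `ray`, the squared length of the projection of
    `state` onto the ray equals the squared magnitude of their inner product.
    """
    return float(abs(np.vdot(ray, state)) ** 2)


# Certainty about happiness: the state vector lies on the happy ray.
p_happy_when_happy = prob(happy, happy)
p_employed_when_happy = prob(employed, happy)

# Certainty about employment: the state vector lies on the employed ray.
p_employed_when_employed = prob(employed, employed)
p_happy_when_employed = prob(happy, employed)

print(p_happy_when_happy, p_employed_when_happy)
print(p_employed_when_employed, p_happy_when_employed)
```

In both cases the question whose ray contains the state vector receives probability 1, while the other question receives cos²(theta), which lies strictly between 0 and 1 whenever the rays are neither identical nor orthogonal.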

Because it is impossible to evaluate incompatible questions concurrently, quantum conjunction has to be defined in a sequential way, and so order effects may arise in the overall judgment. For example, suppose that the person is asked first whether she is employed, and then whether she is happy, that is, we have

$$\eqalign{Prob\lpar employed \wedge then\ happy\rpar &= Prob\lpar employed\rpar \cr &\quad \cdot Prob\lpar happy \vert employed\rpar }$$

whereby the first term is

$$Prob\lpar employed\rpar = \Vert P_{employed} \vert \psi \rangle \Vert^{2}$$

The second term is the probability that the person is happy, given that the person is employed. Certainty that the person is employed means that the state vector is

$$\vert \psi_{employed} \rangle = {P_{employed} \vert \psi \rangle \over \Vert P_{employed} \vert \psi \rangle \Vert}$$

Therefore

$$Prob\lpar happy \vert employed\rpar = \Vert P_{happy} \vert \psi_{employed} \rangle \Vert^{2}$$

which leads to

$$Prob\lpar employed \wedge then\ happy\rpar = \Vert P_{happy}P_{employed} \vert \psi \rangle \Vert^{2}$$

Therefore, in QP theory, a conjunction of incompatible questions involves projecting first to a subspace corresponding to an outcome for the first question and, second, to a subspace for the second question (Busemeyer et al. 2011). This discussion also illustrates the QP definition for conditional probability, which is in general

$$\eqalign{Prob\lpar A\vert B\rpar &= {\Vert P_A P_B \vert \psi \rangle \Vert ^2 \over \Vert P_B \vert \psi \rangle \Vert^2} = {Prob\lpar B \wedge then\ A\rpar \over Prob\lpar B\rpar } \cr &\quad \lpar \hbox{this is called L}\ddot{\hbox{u}}\hbox{ders}'\hbox{ law}\rpar.}$$

It is clear that the definition of conditional probability in QP theory is analogous to that in CP theory, but for potential order effects in the sequential projection P_A P_B, when A and B are incompatible.
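The chain from Prob(employed) through the normalized post-projection state to the sequential conjunction can be checked numerically. The following is a minimal sketch in plain Python, assuming a two-dimensional real space in which each outcome is a ray; the angles and the helper names (`ray`, `project`, `sq_norm`) are hypothetical choices made for illustration and are not taken from Figure 1b.

```python
import math

def ray(angle_deg):
    """Unit vector spanning a one-dimensional subspace (a ray) in the plane."""
    a = math.radians(angle_deg)
    return (math.cos(a), math.sin(a))

def project(vec, unit):
    """Apply the projector onto the ray spanned by `unit` to `vec`."""
    overlap = vec[0] * unit[0] + vec[1] * unit[1]
    return (overlap * unit[0], overlap * unit[1])

def sq_norm(vec):
    """Squared length, which converts an amplitude into a probability."""
    return vec[0] ** 2 + vec[1] ** 2

# Hypothetical angles in degrees (assumptions, not values from the article).
employed = ray(0.0)
happy = ray(45.0)
psi = ray(-40.0)

# Prob(employed) = ||P_employed psi||^2
p_employed = sq_norm(project(psi, employed))

# State after accepting "employed": the normalized projection.
proj = project(psi, employed)
length = math.sqrt(sq_norm(proj))
psi_employed = (proj[0] / length, proj[1] / length)

# Prob(happy | employed) = ||P_happy psi_employed||^2
p_happy_given_employed = sq_norm(project(psi_employed, happy))

# Prob(employed and then happy) = ||P_happy P_employed psi||^2
p_employed_then_happy = sq_norm(project(project(psi, employed), happy))

# The product of the two factors should match the sequential projection.
print(p_employed * p_happy_given_employed, p_employed_then_happy)
```

Under these assumptions the two printed numbers agree up to floating-point error, which is the content of the conditional-probability rule stated above.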

The magnitude of a projection depends upon the angle between the corresponding subspaces. For example, when the angle is large, a lot of amplitude is lost between successive projections. As can be seen in Figure 1b,

$$\Vert P_{happy} \vert \psi \rangle \Vert^{2} \lt \Vert P_{happy} P_{employed} \vert \psi \rangle \Vert^{2}$$

that is, the direct projection to the happy subspace (green line) is less than the projection to the happy subspace via the employed one (light blue line). (Color versions of the figures in this article are available at http://dx.doi.org/10.1017/S0140525X12001525.) The psychological intuition would be that if the person is asked whether she is employed or not, and concludes that she is, perhaps this makes her feel particularly good about herself, which makes it more likely that she will say she is happy. In classical terms, here we have a situation whereby

$$Prob\lpar happy\rpar \lt Prob\lpar happy \wedge employed\rpar$$

which is impossible in CP theory. Moreover, consider the comparison between first asking “are you employed” and then “are you happy” versus first asking “are you happy” and then “are you employed.” In CP theory, this corresponds to

$$Prob\lpar employed \wedge happy\rpar = Prob\lpar happy \wedge employed\rpar .$$

However, in QP theory conjunction of incompatible questions fails commutativity. We have seen that

$$Prob\lpar employed \wedge then\ happy\rpar = \Vert P_{happy} P_{employed} \vert \psi \rangle \Vert^{2}$$

is large. By contrast,

$$Prob\lpar happy \wedge then\ employed\rpar = \Vert P_{employed} P_{happy} \vert \psi \rangle \Vert^{2}$$

is less large, because in this case we project from |Ψ〉 to |happy〉, whereby we lose quite a bit of amplitude (their relative angle is large) and then from |happy〉 to |employed〉 (we lose more amplitude).
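For rays in a plane, each projection keeps a fraction of probability equal to the squared cosine of the angle between the rays, so the two orders can be compared directly. The sketch below assumes hypothetical angles (state near the employed ray, happy ray moderately close to it); the function name `overlap` and the constants are illustrative only.

```python
import math

def overlap(angle_a_deg, angle_b_deg):
    """Squared cosine of the angle between two rays: the share of
    probability that survives one projection from one ray to the other."""
    return math.cos(math.radians(angle_a_deg - angle_b_deg)) ** 2

# Hypothetical placements in degrees (assumptions, not read off Figure 1b).
PSI, EMPLOYED, HAPPY = -35.0, 0.0, 40.0

# Direct question: "are you happy?"
p_happy = overlap(PSI, HAPPY)

# "employed?" first, then "happy?"
p_employed_then_happy = overlap(PSI, EMPLOYED) * overlap(EMPLOYED, HAPPY)

# "happy?" first, then "employed?"
p_happy_then_employed = overlap(PSI, HAPPY) * overlap(HAPPY, EMPLOYED)

print("direct happy:        ", round(p_happy, 4))
print("employed then happy: ", round(p_employed_then_happy, 4))
print("happy then employed: ", round(p_happy_then_employed, 4))
```

With these angles the sequential route through the employed ray exceeds the direct projection onto happy, and the two orders of questioning give different values, which is the failure of commutativity described in the text.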

In general, the smaller the angle between the subspaces for two incompatible outcomes, the greater the relation between the outcomes. A small angle is analogous to a high correlation in a classical framework. When there is a small angle, a sequential projection of the state vector from one subspace to the other loses little amplitude. Accordingly, accepting one outcome makes the other outcome very likely as well. The size of such angles and the relative dimensionality of the subspaces are the cornerstones of QP cognitive models and are determined by the known psychology of the problem. These angles (and the initial state vector) have a role in QP theory analogous to that of prior and conditional distributions in Bayesian modeling. In the toy illustration of Figure 1b, the only guidance in placing the subspaces is that the employed and happy subspaces should be near each other, to reflect the expectation that employment tends to relate to happiness. The state vector was placed near the employed subspace, assuming the person is confident in her employment.

Note that the above discussion does not concern probabilistic assessments indexed by time. That is, we are not comparing

$$Prob\lpar employed\ on\ Monday \wedge happy\ on\ Tuesday\rpar$$

versus

$$Prob\lpar happy\ on\ Monday \wedge employed\ on\ Tuesday\rpar .$$

Both CP and QP theories predict these to be different, because the events are distinguished by time, so we no longer compare the same events (“employed on Monday” is not the same event as “employed on Tuesday”). Rather, here we are concerned with the order of assessing a combination of two events, when the two events are defined in exactly the same way. But could order dependence in quantum theory arise as probability dependence in classical theory? The answer is no because

$$\eqalign{Prob\lpar A \wedge B\rpar &= Prob\lpar A\rpar Prob\lpar B\vert A\rpar = Prob\lpar B\rpar Prob\lpar A\vert B\rpar \cr &= Prob\lpar B \wedge A\rpar .}$$

In quantum theory, the intermediate step is not possible whenever P_A P_B ≠ P_B P_A. Note that in an expression such as

$$Prob\lpar employed \wedge then\ happy\rpar = \Vert P_{happy} P_{employed} \vert \psi \rangle \Vert^{2}$$

there are two sources of uncertainty. There is the classical uncertainty about the various outcomes. There is a further uncertainty as to how the state will collapse after the first question (if the two questions are incompatible). This second source of uncertainty does not exist in a classical framework, as classically it is assumed that a measurement (or evaluation) simply reads off existing values. By contrast, in quantum theory a measurement can create a definite value for a system, which did not previously exist (if the state of the system was a superposition one).

We have seen how it is possible in QP theory to have definite knowledge of one outcome affect the likelihood of an alternative, incompatible outcome. Order and context dependence of probability assessments (and, relatedly, the failure of commutativity in conjunction) are some of the most distinctive and powerful features of QP theory. Moreover, the definitions for conjunction and conditional probability in QP theory are entirely analogous to those in CP theory, except for the potential of order effects for incompatible questions.

2.2.2. Compatible questions

Now assume that the happiness and employment questions are compatible, which means that considering one does not influence consideration of the other, and all four possible conjunctions of outcomes are defined. To accommodate these outcome combinations, we need a four-dimensional space, in which each basis vector corresponds to a particular combination of happiness and employment outcomes (Figure 1c is a three-dimensional simplification of this space, leaving out the fourth dimension). Then, the probability that the person is happy and employed is given by projecting the state vector onto the corresponding basis vector. Clearly,

$$\eqalign{Prob\lpar happy \wedge employed\rpar &= \Vert P_{happy\,\wedge\, employed} \vert \psi \rangle \Vert^{2} \cr &= Prob\lpar employed \wedge happy\rpar .}$$

Thus, for compatible questions, conjunction is commutative, as in CP theory.

The vector space for compatible outcomes is formed by an operation called a tensor product, which provides a way to construct a composite space out of simpler spaces. For example, regarding happiness we can write

$$\vert H \rangle = h\cdot \vert happy \rangle + h^{\prime} \cdot \vert\!\! \sim\! happy \rangle$$

and this state vector allows us to compute the probability that the person is happy or not. Likewise, regarding employment, we can write

$$\vert E \rangle = e \cdot \vert employed \rangle + e^{\prime} \cdot \vert\!\! \sim\! employed \rangle.$$

As long as happiness and employment are compatible, the tensor product between |H〉 and |E〉 is given by

$$\eqalign{&\vert product\ state\rangle = \vert H\rangle \otimes \vert E\rangle \cr &=h \cdot e \cdot \vert happy\rangle \otimes \vert employed\rangle + h \cdot e^{\prime} \cdot \vert happy\rangle \cr &\quad \otimes \vert\!\! \sim\! employed\rangle + h^{\prime} \cdot e \cdot \vert\!\! \sim\! happy\rangle \otimes \vert employed \rangle \cr & \quad+ h^{\prime} \cdot e^{\prime} \cdot \vert\!\! \sim\! happy\rangle \otimes \vert\!\! \sim\! employed\rangle.}$$

This four-dimensional product state is formed from the basis vectors representing all possible combinations of whether the person is employed or not and is happy or not. For example, $\vert happy\rangle \otimes \vert employed \rangle$, or for brevity |happy〉|employed〉, denotes a single basis vector that represents the occurrence of the conjunction “happy and employed” (Figure 1c). The joint probability that the person is employed and happy simply equals |h·e|². This probability agrees with the classical result for Prob(employed ∧ happy), in the sense that the QP conjunction is interpreted (and has the same properties) as conjunction in CP theory.
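The construction of the product state can be sketched in a few lines. The example below assumes real amplitudes with hypothetical values (Prob(happy) = 0.7, Prob(employed) = 0.6) and a hand-written `kron` helper; the ordering of the four basis vectors is also an assumption made for the illustration.

```python
import math

def kron(u, v):
    """Tensor (Kronecker) product of two amplitude vectors."""
    return [a * b for a in u for b in v]

# Hypothetical amplitudes (assumptions): |h|^2 = 0.7, |e|^2 = 0.6.
h, h_not = math.sqrt(0.7), math.sqrt(0.3)
e, e_not = math.sqrt(0.6), math.sqrt(0.4)

H = [h, h_not]   # amplitudes on |happy>, |~happy>
E = [e, e_not]   # amplitudes on |employed>, |~employed>

# Assumed basis order of the result:
# |happy>|employed>, |happy>|~employed>, |~happy>|employed>, |~happy>|~employed>
product_state = kron(H, E)

# Joint probability of "happy and employed" is the squared amplitude
# on the first basis vector, that is |h*e|^2.
p_happy_and_employed = abs(product_state[0]) ** 2

# Marginal for "happy": sum over both employment outcomes.
p_happy = abs(product_state[0]) ** 2 + abs(product_state[1]) ** 2

print(p_happy_and_employed, p_happy)
```

Because the joint probability is read off a single basis vector, asking "happy and employed" or "employed and happy" refers to the same projection, so the order of the two compatible questions does not matter in this sketch.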

What are the implications for psychological modeling? Tensor product representations provide a concrete and rigorous way of creating structured spatial representations in QP theory. Several researchers have pointed out that representations for even the most basic concepts must be structured, as information about the different elements of a concept is compared to like (alignable) elements in an alternative concept (Goldstone 1994; Hahn et al. 2003; Markman & Gentner 1993). Such intuitions can be readily realized in a QP framework through tensor product representations. Note that this idea is not new: others have sought to develop structured representations via tensor products (Smolensky 1990). The advantage of QP theory is that a tensor product representation is supported by a framework for assessing probabilities.

CP theory is also consistent with structured representations. However, in QP theory, because of the property of superposition, creating structured representations sometimes leads to a situation of entanglement. Entanglement relates to some of the most puzzling properties of QP theory. To explain it, we start from a state that is not entangled, the |product state〉 described earlier, and assume that the person is definitely employed (e=1), so that the state reduces to

$$\eqalign{\vert reduced\ state \rangle &= h\cdot \vert happy \rangle \vert employed \rangle \cr &\quad + h^{\prime}\cdot \vert {\sim}happy \rangle \vert employed \rangle.}$$

So far, we can see how the part for being happy is completely separate from the part for being employed. It should be clear that in such a simple case, the probability of being happy is independent of (can be decomposed from) the probability of being employed. As long as the state vector has a product form (e.g., as mentioned), the components for each subsystem can be separated out. This situation is entirely analogous to that in CP theory for independent events, whereby a composite system can always be decomposed into the product of its separate subsystems.

An entangled state is one for which it is not possible to write the state vector as a tensor product between two vectors. Suppose we have

$$\eqalign{\vert entangled\ state \rangle &= x\cdot \vert happy \rangle \vert employed \rangle \cr &\quad + w\cdot \vert {\sim} happy \rangle \vert {\sim} employed \rangle.}$$

This |entangled state〉 does not correspond to either a decision being made regarding being happy or a clarification regarding employment. Such states are called entangled states, because an operation that influences one part of the system (e.g., being happy), inexorably affects the other (clarifying employment). In other words, in such an entangled state, the possibilities of being happy and employed are strongly dependent upon each other. The significance of entanglement is that it can lead to an extreme form of dependency between the outcomes for a pair of questions, which goes beyond what is possible in CP theory. In classical theory, one can always construct a joint probability Prob(A,B,C) out of pairwise ones, and Prob(A,B), Prob(A,C), and Prob(B,C) are all constrained by this joint. However, in QP theory, for entangled systems, it is not possible to construct a complete joint, because the pairwise probabilities can be stronger than what is allowed classically (Fine 1982).
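Whether a state of two binary questions can be written as a tensor product has a simple algebraic test: arranging the four amplitudes in a 2×2 table, the state factorizes exactly when the determinant of that table is zero. The sketch below applies this standard test to a product state and to the entangled state above; the amplitude values and the function name `is_product_state` are assumptions for illustration.

```python
import math

def is_product_state(amps, tol=1e-12):
    """amps = (c_HE, c_HnE, c_nHE, c_nHnE), amplitudes on
    |happy>|employed>, |happy>|~employed>, |~happy>|employed>, |~happy>|~employed>.
    A two-question state factorizes iff c_HE * c_nHnE - c_HnE * c_nHE == 0."""
    c_he, c_hne, c_nhe, c_nhne = amps
    return abs(c_he * c_nhne - c_hne * c_nhe) < tol

# Product state built from hypothetical single-question amplitudes.
h, h_not = math.sqrt(0.7), math.sqrt(0.3)
e, e_not = math.sqrt(0.6), math.sqrt(0.4)
product = (h * e, h * e_not, h_not * e, h_not * e_not)

# Entangled state x|happy>|employed> + w|~happy>|~employed>, with x = w (assumed).
x = w = math.sqrt(0.5)
entangled = (x, 0.0, 0.0, w)

print(is_product_state(product))    # factorizes into |H> and |E>
print(is_product_state(entangled))  # cannot be factorized

# In the entangled state, learning "happy" fixes "employed":
p_happy = entangled[0] ** 2 + entangled[1] ** 2
p_happy_and_employed = entangled[0] ** 2
p_employed_given_happy = p_happy_and_employed / p_happy
print(p_employed_given_happy)
```

The last value illustrates the dependency described in the text: in this particular entangled state, resolving the happiness question also resolves the employment question.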

2.3. Time evolution

So far, we have seen static QP models, whereby we assess the probability for various outcomes for a state at a single point in time. We next examine how the state can change in time. Time evolution in QP theory involves a rotation (technically, a unitary) operator (the solution to Schrödinger's equation). This dynamic operator evolves the initial state vector, without changing its magnitude. It is important to recall that the state vector is a superposition of components along different basis vectors. Therefore, what evolves are the amplitudes along the different basis vectors. For example, a rotation operator might move the state |Ψ〉 away from the |happy〉 basis vector toward the |unhappy〉 one, if the modeled psychological process causes unhappiness with time. Analogously, time evolution in CP theory involves a transition matrix (the solution to Kolmogorov's forward equation). The classical initial state corresponds to a joint probability distribution over all combinations of outcomes. Time evolution involves a transformation of these probabilities, without violating the law of total probability.
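The contrast between the two kinds of evolution can be made concrete with two-outcome toy systems. The sketch below assumes a real rotation acting on amplitudes for the quantum case and a column-stochastic transition matrix acting on probabilities for the classical case; the rotation angle and the matrix entries are hypothetical.

```python
import math

def rotate(state, theta):
    """Rotate a two-component amplitude vector by theta radians
    (a real unitary operator)."""
    c, s = math.cos(theta), math.sin(theta)
    x, y = state
    return (c * x - s * y, s * x + c * y)

# Quantum-style evolution: amplitudes on (|happy>, |unhappy>).
state = (1.0, 0.0)                       # certain to be happy at time 0
evolved = rotate(state, math.radians(30.0))
quantum_probs = (evolved[0] ** 2, evolved[1] ** 2)
norm_after = evolved[0] ** 2 + evolved[1] ** 2   # length is preserved

# Classical-style evolution: probabilities on (happy, unhappy).
# T[i][j] = Prob(outcome i at time t | outcome j at time 0); columns sum to 1.
T = [[0.9, 0.2],
     [0.1, 0.8]]
p0 = [1.0, 0.0]
p1 = [sum(T[i][j] * p0[j] for j in range(2)) for i in range(2)]

print(quantum_probs, norm_after)
print(p1, sum(p1))
```

In the first case the linear operator acts on amplitudes and probabilities appear only after squaring; in the second case the linear operator acts on the probabilities themselves.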

In both CP and QP theories, time evolution corresponds to a linear transformation of the initial state. In CP theory, the time-evolved state directly gives the probabilities for the possible outcomes. Time evolution is a linear transformation that preserves the law of total probability. By contrast, in QP theory, whereas the state vector amplitudes are linearly transformed, probabilities are obtained by squaring the length of the state vector. This nonlinearity means that the probabilities obtained from the initial state vector may obey the law of total probability, but this does not have to be the case for the time-evolved ones. Therefore, in QP theory, time evolution can produce probabilities that violate the law of total probability. This is a critical difference between CP and QP theory and argues in favor of the latter, to the extent that there are cognitive violations of the law of total probability.

As an example, suppose the hypothetical person is due a major professional review and she is a bit anxious about continued employment (so that she is unsure about whether she is employed or not). Prior to the review, she contemplates whether she is happy to be employed or not. In this example, we assume that the employment and happiness questions are compatible (Figure 1c). In CP theory, the initial probabilities satisfy

$$\eqalign{Prob\lpar happy\comma \; unknown\ empl.\rpar &= Prob\lpar happy \wedge employed\rpar \cr &\quad + Prob\lpar happy \wedge not\ employed\rpar .}$$

Next, assume that the state vector evolves for time t. This process of evolution could correspond, for example, to the thought process of considering happiness, depending upon employment assumptions. It would lead to a final set of probabilities that satisfy

$$\eqalign{&Prob\lpar happy\comma \; unknown\ empl.\comma \; at\ t\rpar \cr &= Prob \lpar happy\ at\ t \wedge employed\rpar \cr &\quad + Prob\lpar happy\ at\ t \wedge not\ employed\rpar }$$

Although the final distribution differs from the initial distribution, they both obey the law of total probability. In QP theory, we can write the initial state vector as

$$\eqalign{State\lpar happy\comma \; unknown\ empl.\rpar &= State\lpar happy \wedge employed\rpar \cr &\quad + State\lpar happy \wedge not\ employed\rpar .}$$

After time evolution, we have

$$\eqalign{&State \lpar happy\comma \; unknown\ empl.\comma \; at\ t\rpar \cr &= State \lpar happy\ at\ t \wedge employed\rpar \cr &\quad +State\lpar happy\ at\ t \wedge not\ employed\rpar }$$

but

$$\eqalign{&Prob \lpar happy\comma \; unknown\ empl.\comma \; at\ t\rpar \cr &= Prob\lpar happy\ at\ t \wedge employed\rpar \cr &\quad + Prob\lpar happy\ at\ t \wedge not\ employed\rpar \cr &\quad + Interference \lpar crossproduct\rpar\ terms}$$

(see Appendix). One way in which interference effects can arise in QP theory is by starting with a state vector that is a superposition of orthogonal states. Then, time evolution can result in the state vector being a superposition of states, which are no longer orthogonal. As quantum probabilities are determined from the state vector by squaring its length, we have a situation analogous to |a + b|² = |a|² + |b|² + a*b + b*a. When the states corresponding to a, b are orthogonal, the interference terms a*b + b*a disappear and QP theory reduces to CP theory. Otherwise, QP theory can produce violations of the law of total probability.

Interference terms can be positive or negative and their particular form will depend upon the specifics of the corresponding model. In the previous example, negative interference terms could mean that the person may think she would be happy if it turns out she is employed (perhaps because of the extra money) or that she would be happy if she loses her job (perhaps she doesn't like the work). However, when she is unsure about her employment, she becomes unhappy. It is as if these two individually good reasons for being happy cancel each other out (Busemeyer & Bruza 2012, Ch. 9). That a preference that is dominant under any single definite condition can be reversed in an unknown condition is a remarkable feature of QP theory and one that (as will be discussed) corresponds well to intuition about psychological process (Tversky & Shafir 1992).

Suppose that the hypothetical person knows she will find out whether she will be employed or not, before having the inner reflection about happiness (perhaps she plans to think about her happiness after a professional review). The resolution regarding employment eliminates any possible interference effects from her judgment, and the quantum prediction converges to the classical one (Appendix). Therefore, in QP theory, there is a crucial difference between (just) uncertainty and superposition and it is only the latter that can lead to violations of the law of total probability. In quantum theory, just the knowledge that an uncertain situation has been resolved (without necessarily knowing the outcome of the resolution) can have a profound influence on predictions.
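The difference between an unresolved superposition and a resolved (but possibly unknown) employment status can be illustrated with a toy calculation. The sketch below is not the model from the Appendix: the four-dimensional basis order, the equal amplitudes, and in particular the unitary `U` (a rotation mixing two basis vectors) are arbitrary assumptions chosen so that an interference term appears.

```python
import math

# Assumed basis order: |H,E>, |H,~E>, |~H,E>, |~H,~E>  (H = happy, E = employed)

def apply(matrix, vec):
    """Matrix-vector product for plain Python lists."""
    return [sum(row[j] * vec[j] for j in range(len(vec))) for row in matrix]

def prob_happy(vec):
    """Squared length of the projection onto the two 'happy' basis vectors."""
    return vec[0] ** 2 + vec[1] ** 2

# Hypothetical unitary: rotation by 30 degrees in the plane of |H,~E> and |~H,E>.
theta = math.radians(30.0)
c, s = math.cos(theta), math.sin(theta)
U = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, c,   -s,  0.0],
    [0.0, s,   c,   0.0],
    [0.0, 0.0, 0.0, 1.0],
]

# Superposition over employment, with equal (assumed) amplitudes.
a = b = math.sqrt(0.5)
psi_employed = [0.0, 0.0, 1.0, 0.0]       # component in the employed subspace
psi_not_employed = [0.0, 1.0, 0.0, 0.0]   # component in the not-employed subspace
psi = [a * x + b * y for x, y in zip(psi_employed, psi_not_employed)]

# Employment unresolved: evolve the superposition, then ask about happiness.
p_unresolved = prob_happy(apply(U, psi))

# Employment resolved first: each branch evolves separately and the
# branch probabilities add, as in the law of total probability.
evolved_e = apply(U, psi_employed)
evolved_n = apply(U, psi_not_employed)
p_resolved = a ** 2 * prob_happy(evolved_e) + b ** 2 * prob_happy(evolved_n)

# Cross-product term between the 'happy' parts of the two evolved branches
# (amplitudes are real here, so no complex conjugation is needed).
interference = 2 * a * b * (evolved_e[0] * evolved_n[0] + evolved_e[1] * evolved_n[1])

print(p_unresolved, p_resolved, interference)
```

In this sketch the unresolved probability equals the resolved one plus the interference term, and the term is negative, so the person is less likely to be judged happy while employment remains in superposition.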

3. The empirical case for QP theory in psychology

In this section, we explore whether the main characteristics of QP theory (order/context effects, interference, superposition, entanglement) provide us with any advantage in understanding psychological processes. Many of these situations concern Kahneman and Tversky's hugely influential research program on heuristics and biases (Kahneman et al. 1982; Tversky & Kahneman 1973; 1974; 1983), one of the few psychology research programs to have been associated with a Nobel Prize (in economics, for Kahneman in 2002). This research program was built around compelling demonstrations that key aspects of CP theory are often violated in decision making and judgment. Therefore, this is a natural place to start looking for whether QP theory may have an advantage over CP theory.

Our strategy is to first discuss how the empirical finding in question is inconsistent with CP theory axioms. This is not to say that some model broadly based on classical principles cannot be formulated. Rather, the point is that the basic empirical finding is clearly inconsistent with classical principles and that a classical formalism, when it exists, may be contrived. We then present an illustration for how a QP approach can offer the required empirical coverage. Such illustrations will be simplifications of the corresponding quantum models.

3.1. Conjunction fallacy

In a famous demonstration, Tversky and Kahneman (1983) presented participants with a story about a hypothetical person, Linda, who sounded very much like a feminist. Participants were then asked to evaluate the probability of statements about Linda. The important comparison concerned the statements “Linda is a bank teller” (extremely unlikely given Linda's description) and “Linda is a bank teller and a feminist.” Most participants chose the second statement as more likely than the first, thus effectively judging that

$$Prob\lpar bank\ teller\rpar \lt Prob\lpar bank\ teller \wedge feminist\rpar .$$

This critical empirical finding is obtained with different kinds of stories or dependent measures (including betting procedures that do not rely on the concept of probability; Gavanski & Roskos-Ewoldsen 1991; Sides et al. 2002; Stolarz-Fantino et al. 2003; Tentori & Crupi 2012; Wedell & Moro 2008). However, according to CP theory this is impossible, because the conjunction of two statements can never be more probable than either statement individually (this finding is referred to as the conjunction fallacy). The CP intuition can be readily appreciated in frequentist terms: in a sample space of all possible Lindas, of the ones who are bank tellers, only a subset will be both bank tellers and feminists. Tversky and Kahneman's explanation was that (classical) probability theory is not appropriate for understanding such judgments. Rather, such processes are driven by a similarity mechanism, specifically a representativeness heuristic, according to which participants prefer the statement “Linda is a bank teller and a feminist” because Linda is more representative of a stereotypical feminist. A related explanation, based on the availability heuristic, is that the conjunctive statement activates memory instances similar to Linda (Tversky & Koehler 1994).

QP theory provides an alternative way to understand the conjunction fallacy. In Figure 2, we specify |Ψ〉, the initial state vector, to be very near the basis vector for |feminist〉 and nearly orthogonal to the basis vector for |bank teller〉. Also, the |feminist〉 basis vector is neither particularly close nor particularly far away from the |bank teller〉 one, because to be a bank teller is not perhaps the most likely profession for feminists, but it is not entirely unlikely either. These are our priors for the problem, that is, that the description of Linda makes it very likely that she is a feminist and very unlikely that she is a bank teller. Note the limited flexibility in the specification of these subspaces and the state vector. For example, the state vector could not be placed in between the bank teller and feminist subspaces, as this would mean that it has a high projection to both the bank teller and the feminist outcomes (only the latter is true). Likewise, it would make no sense to place the feminist subspace near the bank teller one, or to the not bank teller one, as feminism is a property that is largely uninformative as to whether a person is a bank teller or not.

Figure 2. An illustration of the QP explanation for the conjunction fallacy.

Consider the conjunctive statement “Linda is a bank teller and a feminist.” As we have seen, in QP theory, conjunctions are evaluated as sequences of projections. An additional assumption is made that in situations such as this, the more probable possible outcome is evaluated first (this is a reasonable assumption, as it implies that more probable outcomes are prioritized in the decision making process; cf. Gigerenzer & Todd 1999). Therefore, the conjunctive statement involves first projecting onto the feminist basis vector, and subsequently projecting on the bank teller one. It is immediately clear that this sequence of projections leads to a larger overall amplitude (green line), compared to the direct projection from |Ψ〉 onto the bank teller vector.

Psychologically, the QP model explains the conjunction fallacy in terms of the context dependence of probability assessment. Given the information participants receive about Linda, it is extremely unlikely that she is a bank teller. However, once participants think of Linda in more general terms as a feminist, they are more able to appreciate that feminists can have all sorts of professions, including being bank tellers. The projection acts as a kind of abstraction process, so that the projection onto the feminist subspace loses some of the details about Linda, which previously made it impossible to think of her as a bank teller. From the more abstract feminist point of view, it becomes a bit more likely that Linda could be a bank teller, so that whereas the probability of the conjunction remains low, it is still more likely than the probability for just the bank teller property. Of course, from a QP theory perspective, the conjunction fallacy is no longer a fallacy; it arises naturally from basic QP axioms.
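The geometric argument can be tried out with concrete numbers. The sketch below is a two-dimensional toy version, not the full model of Busemeyer et al.; the ray angles and the function names are hypothetical, and the second state vector is included only to show that the overestimation depends on where the state is placed.

```python
import math

# Hypothetical ray angles in degrees (assumptions, not read off Figure 2).
BANK_TELLER = 0.0
FEMINIST = 60.0

def cos2(angle_a_deg, angle_b_deg):
    """Squared cosine of the angle between two rays."""
    return math.cos(math.radians(angle_a_deg - angle_b_deg)) ** 2

def single_and_conjunction(psi_deg):
    """Return (Prob(bank teller), Prob(feminist and then bank teller))
    for a state vector at angle psi_deg."""
    p_bank_teller = cos2(psi_deg, BANK_TELLER)
    p_feminist_then_bank_teller = cos2(psi_deg, FEMINIST) * cos2(FEMINIST, BANK_TELLER)
    return p_bank_teller, p_feminist_then_bank_teller

# State close to the feminist ray and nearly orthogonal to the bank teller ray,
# as in the description of Linda.
linda_single, linda_conjunction = single_and_conjunction(80.0)

# A different (hypothetical) person whose state lies closer to the bank teller ray.
other_single, other_conjunction = single_and_conjunction(30.0)

print("Linda:", round(linda_single, 4), round(linda_conjunction, 4))
print("Other:", round(other_single, 4), round(other_conjunction, 4))
```

For the Linda-like state the sequential projection exceeds the direct one, whereas for the second state it does not, in line with the remark that QP theory does not always predict an overestimation of conjunction.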

Busemeyer et al. (2011) presented a quantum model based on this idea and examined in detail the requirements for the model to predict an overestimation of conjunction. In general, QP theory does not always predict an overestimation of conjunction. However, given the details of the Linda problem, an overestimation of conjunction necessarily follows. Moreover, the same model was able to account for several related empirical findings, such as the disjunction fallacy, event dependencies, order effects, and unpacking effects (e.g., Bar-Hillel & Neter 1993; Carlson & Yates 1989; Gavanski & Roskos-Ewoldsen 1991; Stolarz-Fantino et al. 2003). Also, the QP model is compatible with the representativeness and availability heuristics. The projection operations used to compute probabilities measure the degree of overlap between two vectors (or subspaces), and overlap is a measure of similarity (Sloman 1993). Thus, perceiving Linda as a feminist allows the cognitive system to establish similarities between the initial representation (the initial information about Linda) and the representation for bank tellers. If we consider representativeness to be a similarity process, as we can do with the QP model, it is not surprising that it is subject to chaining and context effects. Moreover, regarding the availability heuristic (Tversky & Koehler 1994), the perspective from the QP model is that considering Linda to be a feminist increases availability for other related information about feminism, such as possible professions.

3.2. Failures of commutativity in decision making

We next consider failures of commutativity in decision making, whereby asking the same two questions in different orders can lead to changes in response (Feldman & Lynch 1988; Schuman & Presser 1981; Tourangeau et al. 2000). Consider the questions “Is Clinton honest?” and “Is Gore honest?” and the same questions in a reverse order. When the first two questions were asked in a Gallup poll, the probabilities of answering yes for Clinton and Gore were 50% and 68%, respectively. The corresponding probabilities for asking the questions in the reverse order were, by contrast, 57% and 60% (Moore 2002). Such order effects are puzzling according to CP theory, because, as noted, the probability of saying yes to question A and then yes to question B equals

$$\eqalign{Prob\lpar A\rpar \cdot Prob\lpar B \vert A\rpar &= Prob\lpar A \wedge B\rpar = Prob\lpar B \wedge A\rpar \cr &= Prob \lpar B\rpar \cdot Prob \lpar A \vert B\rpar .}$$

Therefore, CP theory predicts that the order of asking two questions does not matter. By contrast, the explanation for order effects in social psychology is that the first question activates thoughts, which subsequently affect consideration of the second question (Schwarz 2007).

QP theory can accommodate order effects in Gallup polls, in a way analogous to how the conjunction fallacy is explained. In both cases, the idea is that the context for assessing the first question influences the assessment of any subsequent questions. Figure 3 is analogous to Figure 2. In Figure 3, there are two sets of basis vectors, one for evaluating whether Clinton is honest or not and another for evaluating whether Gore is honest or not. The two sets of basis vectors are not entirely orthogonal; we assume that if a person considers Clinton honest, then that person is a little more likely to consider Gore to be honest as well, and vice versa (as they ran for office together). The initial state vector is fairly close to the |Gore yes〉 vector, but less close to the |Clinton yes〉 basis vector, to reflect the information that Gore would be considered more honest than Clinton. The length of the projection onto the |Clinton yes〉 basis vector reflects the probability that Clinton is honest. It can be seen that the direct projection is less, compared to the projection via the |Gore yes〉 vector. In other words, deciding that Gore is honest increases the probability that Clinton is judged to be honest as well (and, conversely, deciding that Clinton is honest first, reduces the probability that Gore is judged as honest).
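A toy version of this geometry shows the direction of the order effect. The sketch below assumes a two-dimensional real space and hypothetical angles; it is not the model of Wang and Busemeyer and it is not fitted to the poll percentages. The probability of "yes" to the second question is taken as the total over both possible answers to the first question, which is one way (an assumption here) to relate the geometry to poll marginals.

```python
import math

def cos2(angle_a_deg, angle_b_deg):
    """Squared cosine of the angle between two rays."""
    return math.cos(math.radians(angle_a_deg - angle_b_deg)) ** 2

# Hypothetical angles in degrees (assumptions, not read off Figure 3).
PSI = 35.0
GORE_YES = 0.0
CLINTON_YES = -10.0
GORE_NO = GORE_YES + 90.0         # "no" ray is orthogonal to "yes"
CLINTON_NO = CLINTON_YES + 90.0

def p_yes_first(question_yes):
    """Probability of 'yes' when the question is asked first."""
    return cos2(PSI, question_yes)

def p_yes_second(first_yes, first_no, second_yes):
    """Probability of 'yes' to the second question, summed over both
    possible answers (yes or no) to the first question."""
    return (cos2(PSI, first_yes) * cos2(first_yes, second_yes)
            + cos2(PSI, first_no) * cos2(first_no, second_yes))

clinton_first = p_yes_first(CLINTON_YES)
gore_first = p_yes_first(GORE_YES)
clinton_second = p_yes_second(GORE_YES, GORE_NO, CLINTON_YES)
gore_second = p_yes_second(CLINTON_YES, CLINTON_NO, GORE_YES)

print("Clinton asked first / second:", round(clinton_first, 3), round(clinton_second, 3))
print("Gore asked first / second:   ", round(gore_first, 3), round(gore_second, 3))
```

With these placements, answering the Gore question first raises the probability of judging Clinton honest, and answering the Clinton question first lowers the probability of judging Gore honest, which is the qualitative pattern in the poll data.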

Figure 3. An illustration of order effects in Gallup polls.

The actual QP theory model developed for such failures in commutativity was based on the abovementioned idea, but was more general, so as to provide a parameter-free test of the relevant empirical data (e.g., there are various specific types of order effects; Wang & Busemeyer, in press).

A related failure of commutativity concerns the order of assessing different pieces of evidence for a particular hypothesis. According to CP theory, the order in which evidence A and B is considered, in relation to a hypothesis H, is irrelevant, as

$$Prob\lpar H \vert A \wedge B\rpar = Prob \lpar H \vert B \wedge A\rpar .$$

However, there have been demonstrations that, in fact,

$$Prob\lpar H \vert A \wedge B\rpar \ne Prob \lpar H \vert B\wedge A\rpar$$

(Hogarth & Einhorn 1992; Shanteau 1970; Walker et al. 1972). Trueblood and Busemeyer (2011) proposed a QP model for two such situations, a jury decision-making task (McKenzie et al. 2002) and a medical inference one (Bergus et al. 1998). For example, in the medical task participants (all medical practitioners) had to make a decision about a disease based on two types of clinical information. The order of presenting this information influenced the decision, with results suggesting that the information presented last was weighted more heavily (a recency effect). Trueblood and Busemeyer's (2011) model involved considering a tensor product space for the state vector, with one space corresponding to the presence or absence of the disease (this is the event we are ultimately interested in) and the other space to positive or negative evidence, evaluated with respect to the two different sources of information (one source of information implies positive evidence for the disease and the other negative evidence). Considering each source of clinical information involved a rotation of the state vector, in a way reflecting the impact of the information on the disease hypothesis. The exact degree of rotation was determined by free parameters. Using the same number of parameters, the QP theory model produced better fits to empirical results than the anchoring and adjustment model of Hogarth and Einhorn (1992) for the medical diagnosis problem and for the related jury decision one.

3.3. Violations of the sure thing principle

The model Trueblood and Busemeyer (2011) developed is an example of a dynamic QP model, whereby the inference process requires evolution of the state vector. This same kind of model has been employed by Pothos and Busemeyer (2009) and Busemeyer et al. (2009) to account for violations of the sure thing principle. The sure thing principle is the expectation that human behavior ought to conform to the law of total probability. For example, in a famous demonstration, Shafir and Tversky (1992) reported that participants violated the sure thing principle in a one-shot prisoner's dilemma task. This is a task whereby participants receive different payoffs depending upon whether they decide to cooperate or defect, relative to another (often hypothetical) opponent. Usually the player does not know the opponent's move, but in some conditions Shafir and Tversky told participants what the opponent had decided to do. When participants were told that the opponent was going to cooperate, they decided to defect; and when they were told that the opponent was defecting, they decided to defect as well. The payoffs were specified so that defection was the optimal strategy. The expectation from the sure thing principle is that, when no information was provided about the action of the opponent, participants should also decide to defect (it is a “sure thing” that defection is the best strategy, because it is the best strategy in all particular cases of opponent's actions). However, surprisingly, in the “no knowledge” case, many participants reversed their judgment and decided to cooperate (Busemeyer et al. 2006a; Croson 1999; Li & Taplin 2002).
Similar results have been reported for the two-stage gambling task (Tversky & Shafir 1992) and a novel categorization–decision-making paradigm (Busemeyer et al. 2009; Townsend et al. 2000). Therefore, violations of the sure thing principle in decision making, although relatively infrequent, are not exactly rare either. Note that this research has established violations of the sure thing principle using within-participants designs.

Shafir and Tversky (1992) suggested that participants perhaps adjust their beliefs for the other player's action, depending upon what they are intending to do (this principle was called wishful thinking and follows from cognitive dissonance theory and related hypotheses, e.g., Festinger 1957; Krueger et al. 2012). Therefore, if there is a slight bias for cooperative behavior, in the unknown condition participants might be deciding to cooperate because they imagine that the opponent would cooperate as well. Tversky and Shafir (1992) described such violations of the sure thing principle as failures of consequential reasoning. When participants are told that the opponent is going to defect, they have a good reason to defect as well, and, likewise, when they are told that the opponent is going to cooperate. However, in the unknown condition, it is as if these (separate) good reasons for defecting under each known condition cancel each other out (Busemeyer & Bruza 2011, Ch. 9).

This situation is similar to the generic example for violations of the law of total probability that we considered in Section 2. Pothos and Busemeyer (2009) developed a quantum model for the two-stage gambling task and prisoner's dilemma embodying these simple ideas. A state vector was defined in a tensor product space of two spaces, one corresponding to the participant's intention to cooperate or defect and one for the belief of whether the opponent is cooperating or defecting. A unitary operator was then specified to rotate the state vector depending on the payoffs, increasing the amplitudes for those combinations of action and belief maximizing payoff. The same unitary operator also embodied the idea of wishful thinking, rotating the state vector so that the amplitudes for the “cooperate–cooperate” and “defect–defect” combinations for participant and opponent actions increased. Thus, the state vector developed as a result of two influences. The final probabilities for whether the participant is expected to cooperate or defect were computed from the evolved state vector, by squaring the magnitudes of the relevant amplitudes.

Specifically, the probability of defecting when the opponent is known to defect is based on the projection $P_{participant\ to\ D} \vert \psi_{opponent\ known\ D} \rangle$, where $P_{participant\ to\ D}$ is a projection operator corresponding to the participant choosing to defect. Similarly, the probability of defecting when the opponent is known to cooperate is based on the projection $P_{participant\ to\ D} \vert \psi_{opponent\ known\ C} \rangle$. But, in the unknown case, the relevant state vector is the superposition ${1 \over \sqrt{2}}\vert \psi_{opponent\ known\ D} \rangle + {1 \over \sqrt{2}} \vert \psi_{opponent\ known\ C} \rangle$. The probability for the participant to defect is computed by first using the operator $P_{participant\ to\ D}$ on this superposition, which gives us ${1 \over \sqrt{2}} P_{participant\ to\ D} (\vert \psi_{opponent\ known\ D} \rangle + \vert \psi_{opponent\ known\ C} \rangle)$, and subsequently squaring the length of the resulting projection. Therefore, we have another case of $\vert a + b \vert^{2} = a^{2} + b^{2} + a^{*}b + b^{*}a$, with non-zero interference terms. Thus, a high probability to defect in the two known conditions (high $a^{2}$ and high $b^{2}$) can be offset by negative interference terms, which means a lower probability to defect in the unknown condition. We can interpret these computations in terms of Tversky and Shafir's (1992) description of the result as a failure of consequential reasoning. Moreover, the QP model provides a formalization of the wishful thinking hypothesis, with the specification of a corresponding unitary operator matrix. However, note that this quantum model is more complex than the ones considered previously. It requires more detail to see how interference arises, in a way that leads to the required result, and the model involves two parameters (model predictions are robust across a wide range of parameter space).
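The interference argument can be made concrete with a small numerical sketch. The Python code below is only an illustration under stated assumptions: the two evolved state vectors are hand-picked, hypothetical values (they are not derived from the unitary operator or the fitted parameters of the published model), the basis ordering is our own choice, and all amplitudes are real. It shows how a high probability of defecting in each known condition can coexist with a lower probability in the unknown condition.

```python
import math

# Basis order (an assumption of this sketch):
# index 0 = (opponent D, participant D), 1 = (opponent D, participant C),
# index 2 = (opponent C, participant D), 3 = (opponent C, participant C).
DEFECT_INDICES = (0, 2)  # components in which the participant defects


def prob_defect(state):
    """Squared length of the projection onto the 'participant defects' subspace."""
    return sum(abs(state[i]) ** 2 for i in DEFECT_INDICES)


def superpose(state_a, state_b):
    """Equal-weight superposition (1/sqrt(2)) * (state_a + state_b)."""
    return [(x + y) / math.sqrt(2) for x, y in zip(state_a, state_b)]


# Hypothetical, already-evolved state vectors for the two known conditions.
# They are orthonormal, as the images of two orthogonal initial states
# under a unitary operator must be.
psi_known_d = [0.9, 0.3, 0.1, 0.3]
psi_known_c = [-0.1, 0.3, -0.9, 0.3]

psi_unknown = superpose(psi_known_d, psi_known_c)

p_known_d = prob_defect(psi_known_d)   # about 0.82
p_known_c = prob_defect(psi_known_c)   # about 0.82
p_unknown = prob_defect(psi_unknown)   # about 0.64

# The law of total probability would predict the average of the two
# known-condition probabilities; the difference is the interference term.
interference = p_unknown - 0.5 * (p_known_d + p_known_c)  # about -0.18

print(p_known_d, p_known_c, p_unknown, interference)
```

With these illustrative vectors, the classical prediction for the unknown condition would be 0.82, whereas the superposition yields roughly 0.64, because the cross terms on the defect components are negative.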

3.4. Asymmetry in similarity

We have considered how the QP explanation for the conjunction fallacy can be seen as a formalization of the representativeness heuristic (Tversky & Kahneman 1983). This raises the possibility that the QP machinery could be employed for modeling similarity judgments. In one of the most influential demonstrations in the similarity literature, Tversky (1977) showed that similarity judgments violate all metric axioms. For example, in some cases, the similarity of A to B would not be the same as the similarity of B to A. Tversky's (1977) findings profoundly challenged the predominant approach to similarity, whereby objects are represented as points in a multidimensional space, and similarity is modeled as a function of distance. Since then, novel proposals for similarity have been primarily assessed in terms of how well they can cover Tversky's (1977) key empirical results (Ashby & Perrin 1988; Krumhansl 1978).

Pothos and Busemeyer (2011) proposed that different concepts in our experience correspond to subspaces of different dimensionality, so that concepts for which there is more extensive knowledge were naturally associated with subspaces of greater dimensionality. Individual dimensions can be broadly understood as concept properties. They suggested that the similarity of a concept A to another concept B (denoted Sim(A,B)) could be modeled with the projection from the subspace for the first concept to the subspace for the second one: $Sim(A,B) = \Vert P_{B} \cdot P_{A} \cdot \psi \Vert^{2} = Prob(A \wedge then\ B)$. Because in QP theory probability is computed from the overlap between a vector and a subspace, it is naturally interpreted as similarity (Sloman 1993). The initial state vector corresponds to whatever a person would be thinking just prior to the comparison. This is set so that it is neutral with respect to the A and B subspaces (i.e., prior to the similarity comparison, a participant would not be thinking more about A than about B, or vice versa).

Consider one of Tversky's (1977) main findings, that the similarity of Korea to China was judged greater than the similarity of China to Korea (actually, North Korea and communist China; similar asymmetries were reported for other countries). Tversky's proposal was that symmetry is violated, because we have more extensive knowledge about China than about Korea, and, therefore, China has more distinctive features relative to Korea. He was able to describe empirical results with a similarity model based on a differential weighting of the common and distinctive features of Korea and China. However, the only way to specify these weights was with free parameters, and alternative values for the weights could lead to either no violation of symmetry or a violation in a way opposite to the empirically observed one.

By contrast, using QP theory, if one simply assumes that the dimensionality of the China subspace is greater than the dimensionality of the Korea one, then a violation of symmetry in the required direction readily emerges, without the need for parameter manipulation. As shown in Figure 4, in the Korea to China comparison (4a), the last projection is to a higher dimensionality subspace than is the last projection in the China to Korea comparison (4b). Therefore, in the Korea to China case (4a), more of the amplitude of the original state vector is retained, which leads to a prediction for a higher similarity judgment. This intuition was validated with computational simulations by Pothos and Busemeyer (2011), whose results indicate that, as long as one subspace has a greater dimensionality than another, on average the transition from the lower dimensionality subspace to the higher dimensionality one would retain more amplitude than the converse transition (it has not been proved that this is always the case, but note that participant results with such tasks are not uniform).

Figure 4. Figure 4a corresponds to the similarity of Korea to China and 4b to the similarity of China to Korea. Projecting to a higher dimensionality subspace last (as in 4a) retains more of the original amplitude than projecting onto a lower dimensionality subspace last (as in 4b).
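The sequential-projection account of similarity can be sketched numerically. The Python example below is a minimal illustration, not the simulation procedure of the cited work: the three-dimensional space, the particular Korea ray and China plane, and the initial state are hypothetical choices made so that the state is neutral (it projects equally strongly onto both subspaces). Vectors are real, so the inner product needs no complex conjugation.

```python
import math


def dot(u, v):
    """Inner product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))


def project(vector, basis):
    """Project `vector` onto the subspace spanned by an orthonormal `basis`."""
    result = [0.0] * len(vector)
    for b in basis:
        coeff = dot(b, vector)
        result = [r + coeff * x for r, x in zip(result, b)]
    return result


def similarity(state, first_basis, second_basis):
    """Sim(first, second) = || P_second P_first state ||^2."""
    projected = project(project(state, first_basis), second_basis)
    return dot(projected, projected)


# Hypothetical subspaces: China is a plane (more knowledge, two dimensions),
# Korea is a single ray (one dimension).
china = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
korea = [[1 / math.sqrt(5), 0.0, 2 / math.sqrt(5)]]

# Hypothetical initial state, chosen to be neutral between the two concepts.
psi = [1 / 3, 2 / 3, 2 / 3]

weight_on_korea = dot(project(psi, korea), project(psi, korea))  # 5/9
weight_on_china = dot(project(psi, china), project(psi, china))  # 5/9

sim_korea_china = similarity(psi, korea, china)  # 1/9, last projection onto the plane
sim_china_korea = similarity(psi, china, korea)  # 1/45, last projection onto the ray

print(sim_korea_china, sim_china_korea)
```

In this example the Korea-to-China similarity (1/9) exceeds the China-to-Korea similarity (1/45) without any weighting parameters; the asymmetry comes only from the order of the projections and the difference in dimensionality.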

3.5. Other related empirical evidence

Tversky and Kahneman are perhaps the researchers who most vocally pointed out a disconnect between CP models and cognitive process and, accordingly, we have emphasized QP theory models for some of their most influential findings (and related findings). A skeptical reader may ask, is the applicability of QP theory to cognition mostly restricted to decision making and judgment? Empirical findings that indicate an inconsistency with CP principles are widespread across most areas of cognition. Such findings are perhaps not as well established as the ones reviewed previously, but they do provide encouragement regarding the potential of QP theory in psychology. We have just considered a QP theory model for asymmetries in similarity judgment. Relatedly, Hampton (1988b; see also Hampton 1988a) reported an overextension effect for category membership. Participants rated the strength of category membership of a particular instance to different categories. For example, the rated memberships of “cuckoo” in the pet and bird categories were 0.575 and 1, respectively. However, the corresponding rating for the conjunctive category pet bird was 0.842, a finding analogous to the conjunction fallacy. This paradigm also produces violations of disjunction. Aerts and Gabora (2005b) and Aerts (2009) provided a QP theory account of such findings. Relatedly, Aerts and Sozzo (2011b) examined membership judgments for pairs of concept combinations, and they empirically found extreme forms of dependencies between concept combination pairs, which indicated that it would be impossible to specify a complete joint distribution over all combinations. These results could be predicted by a QP model using entangled states to represent concept pairs.
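To see why the cuckoo ratings are analogous to the conjunction fallacy, one can read the membership ratings as probabilities (an assumption made here only for illustration). The classical bound on a conjunction is then

$$\mathrm{Prob}(pet \wedge bird) \le \min \{ \mathrm{Prob}(pet), \mathrm{Prob}(bird) \} = \min \{ 0.575, 1 \} = 0.575,$$

whereas the reported rating for the conjunctive category, 0.842, exceeds this bound.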

In memory research, Brainerd and Reyna (2008) discovered an episodic overdistribution effect. In a training part, participants were asked to study a set of items T. In test, the training items T were presented together with related new ones, R (and some additional foil items). Two sets of instructions were employed. With the verbatim instructions (V), participants were asked to identify only items from the set T. With the gist instructions (G), participants were required to select only R items. In some cases, the instructions (denoted as V or G) prompted participants to select test items from the T or R sets. From a classical perspective, as a test item comes from either the T set or the R one, but not both, it has to be the case that Prob(V|T) + Prob(G|T) = Prob(V or G|T) (these are the probabilities of endorsing a test item from the set T, as a function of different instructions). However, Brainerd and Reyna's (2008) empirical results were inconsistent with the classical prediction. Busemeyer and Bruza (2012, Ch. 6) explored in detail a range of models for this memory overdistribution effect (apart from a CP theory model, also a signal detection model, Brainerd et al.'s [1999] dual process model, and a QP theory model). The best performing models were the quantum model and the dual process one, but the ability of the latter to cover empirical results, in this case, perhaps depended too much on an arbitrary bias parameter. Another example from memory research is Bruza et al.'s (2009) application of quantum entanglement (which implies a kind of holism inconsistent with classical notions of causality) to explain associative memory findings, which cannot be accommodated within the popular theory of spreading activation.

Finally, in perception, Conte et al. (2009) employed a paradigm involving the sequential presentation of two ambiguous figures (each figure could be perceived in two different ways) or the presentation of only one of the figures. It is possible that seeing one figure first may result in some bias in perceiving the second figure. Nonetheless, from a classical perspective, one still expects the law of total probability to be obeyed, so that p(A+ ∧ B–) + p(A+ ∧ B+) = p(A+) (A and B refer to the two figures and the + and – signs to the two possible ways of perceiving them). It turned out that empirical results were inconsistent with the law of total probability, but a QP model could provide satisfactory coverage. In other perception work, Atmanspacher et al. (2004; Atmanspacher & Filk 2010) developed and empirically tested a quantum model that could predict the dynamic changes produced during bistable perception. Their model provided a picture of the underlying cognitive process radically different from the classical one. Classically, it has to be assumed that at any given time a bistable stimulus is perceived with a particular interpretation. In Atmanspacher et al.'s (2004) model, by contrast, time periods of perception definiteness were intermixed with periods in which the perceptual impact from the stimulus was described with a superposition state, making it impossible to consider it as conforming to a particular interpretation. Atmanspacher et al.'s (2004) model thus predicted violations of causality in temporal continuity.

4. General issues for the QP models

4.1. Can the psychological relevance of CP theory be disproved?

It is always possible to augment a model with additional parameters or mechanisms to accommodate problematic results. For example, a classical model could describe the conjunction fallacy in the Linda story by basing judgment not on the difference between a conjunction and an individual probability, but rather on the difference between appropriately set conditional probabilities (e.g., Prob(Linda|bank teller) vs. Prob(Linda|bank teller ∧ feminist); cf. Tenenbaum & Griffiths 2001). Also, a conjunctive statement can always be conditionalized on presentation order, so that one can incorporate the assumption that the last piece of evidence is weighted more heavily than the first piece. Moreover, deviations from CP predictions in judgment could be explained by introducing assumptions of how participants interpret the likelihood of statements in a particular hypothesis, over and above what is directly stated (e.g., Sher & McKenzie 2008). Such approaches, however, are often unsatisfactory. Arbitrary interpretations of the relevant probabilistic mechanism are unlikely to generalize to related empirical situations (e.g., disjunction fallacies). Also, the introduction of post-hoc parameters will lead to models that are descriptive and limited in insight. Thus, employing a formal framework in arbitrarily flexible ways to cover problematic findings is possible, but of arguable explanatory value, and it also inevitably leads to criticism (Jones & Love 2011). But are the findings we considered particularly problematic for CP theory?

CP theory is a formal framework; that is, a set of interdependent axioms that can be productively employed to lead to new relations. Therefore, when obtaining psychological evidence for a formal framework, we do not just support the particular principles under scrutiny. Rather, such evidence corroborates the psychological relevance of all possible relations that can be derived from the formal framework. For example, one cannot claim that one postulate from a formal framework is psychologically relevant, but another is not, and still maintain the integrity of the theory.

The ingenuity of Tversky, Kahneman, and their collaborators (Kahneman et al. 1982; Shafir & Tversky 1992; Tversky & Kahneman 1973) was exactly that they provided empirical tests of principles that are at the heart of CP theory, such as the law of total probability and the relation between conjunction and individual probabilities. Therefore, it is extremely difficult to specify any reasonable CP model consistent with their results, as such models simply lack the necessary flexibility. There is a clear sense that if one wishes to pursue a formal, probabilistic approach for the Tversky, Kahneman type of findings, then CP theory is not the right choice, even if it is not actually possible to disprove the applicability of CP theory to such findings.

4.2. Heuristics vs. formal probabilistic modeling

The critique of CP theory by Tversky, Kahneman and collaborators can be interpreted in a more general way, as a statement that the attempt to model cognition with any axiomatic set of principles is misguided. These researchers thus motivated their influential program involving heuristics and biases. Many of these proposals sought to relate generic memory or similarity processes to performance in decision making (e.g., the availability and representativeness heuristics; Tversky & Kahneman 1983). Other researchers have developed heuristics as individual computational rules. For example, Gigerenzer and Todd's (1999) “take the best” heuristic offers a powerful explanation of behavior in a particular class of problem-solving situations.

Heuristics, however well motivated, are typically isolated: confidence in one heuristic does not extend to other heuristics. Therefore, cognitive explanations based on heuristics are markedly different from ones based on a formal axiomatic framework. Theoretical advantages of heuristic models are that individual principles can be examined independently from each other and that no commitment has to be made regarding the overall alignment of cognitive process with the principles of a formal framework. Some theorists would argue that we can only understand cognition through heuristics. However, it is also often the case that heuristics can be re-expressed in a formal way or reinterpreted within CP or QP theory. For example, the heuristics from the Tversky and Kahneman research program, which were developed specifically as an alternative to CP models, often invoke similarity or memory processes, which can be related to order/context effects in QP theory. Likewise, failures of consequential reasoning in prisoner's dilemma (Tversky & Shafir 1992) can be formalized with quantum interference effects.

The contrast between heuristic and formal probabilistic approaches to cognition is a crucial one for psychology. The challenge for advocates of the former is to specify heuristics that cannot be reconciled with formal probability theory (CP or QP). The challenge for advocates of the latter is to show that human cognition is overall aligned with the principles of (classical or quantum) formal theory.

4.3. Is QP theory more complex than CP theory?

We have discussed the features of QP theory that distinguish it from CP theory. These distinctive features typically emerge when considering incompatible questions. We have also stated that QP theory can behave like CP theory for compatible questions (sect. 2.2.2). Accordingly, there might be a concern that QP theory is basically all of CP theory (for compatible questions) and a bit more, too (for incompatible ones), so that it provides a more successful coverage of human behavior simply because it is more flexible.

This view is incorrect. First, it is true that QP theory for compatible questions behaves a lot like CP theory. For example, for compatible questions, conjunction is commutative, Lüders' law becomes effectively identical to Bayes's law, and no overestimation of conjunction can be predicted. However, CP and QP theories can diverge, even for compatible questions. For example, quantum time-dependent models involving compatible questions can still lead to interference effects, which are not possible in classical theory (sect. 2.3). Although CP and QP theories share the key commonality of being formal frameworks for probabilistic inference, they are founded on different axioms and their structure (set theoretic vs. geometric) is fundamentally different. QP theory is subject to several restrictive constraints; however, these are different from the ones in CP theory.

For example, CP Markov models must obey the law of total probability, whereas dynamic QP models can violate this law. However, dynamic QP models must obey the law of double stochasticity, while CP Markov models can violate this law. Double stochasticity is a property of transition matrices that describes the probabilistic changes from an input to an output over time. Markov models require each column of a transition matrix to sum to unity (so that they are stochastic), but QP models require both each row and each column to sum to unity (so they are doubly stochastic). Double stochasticity sometimes fails and this rules out QP models (Busemeyer et al. 2009; Khrennikov 2010).
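The contrast between the two constraints can be checked numerically. The Python sketch below is illustrative only: the rotation angle and the Markov transition values are arbitrary, hypothetical numbers, and the quantum transition probabilities are taken to be the squared magnitudes of the entries of a unitary matrix, which is the standard construction but is not spelled out in the text above.

```python
import math


def transition_probabilities(unitary):
    """T[i][j] = |U[i][j]|^2, the probability of moving from basis state j to i."""
    return [[abs(entry) ** 2 for entry in row] for row in unitary]


def column_sums(matrix):
    return [sum(row[j] for row in matrix) for j in range(len(matrix[0]))]


def row_sums(matrix):
    return [sum(row) for row in matrix]


def all_close_to_one(values, tol=1e-12):
    return all(abs(v - 1.0) < tol for v in values)


# Hypothetical unitary: a rotation of the plane by an arbitrary angle.
theta = 0.4
rotation = [
    [math.cos(theta), -math.sin(theta)],
    [math.sin(theta), math.cos(theta)],
]
quantum_t = transition_probabilities(rotation)

# Hypothetical Markov transition matrix: columns sum to one, rows do not.
markov_t = [
    [0.9, 0.3],
    [0.1, 0.7],
]

quantum_is_doubly_stochastic = (
    all_close_to_one(column_sums(quantum_t)) and all_close_to_one(row_sums(quantum_t))
)
markov_is_stochastic = all_close_to_one(column_sums(markov_t))
markov_is_doubly_stochastic = markov_is_stochastic and all_close_to_one(row_sums(markov_t))

print(quantum_is_doubly_stochastic, markov_is_stochastic, markov_is_doubly_stochastic)
```

The matrix derived from the unitary is doubly stochastic for any choice of angle, whereas the Markov example satisfies only the column constraint. Data whose estimated transition matrix resembled the Markov example would therefore count against a QP model of this kind.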

Moreover, QP models have to obey the restrictive law of reciprocity, for outcomes defined by one-dimensional subspaces. According to the law of reciprocity, the probability of transiting from one vector to another is the same as the probability of transiting from the second vector to the first, so that the corresponding conditional probabilities have to be the same. Wang and Busemeyer (in press) directly tested this axiom, using data on question order, and found that it was upheld with surprisingly high accuracy.
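In symbols, and assuming the two outcomes A and B correspond to rays spanned by unit vectors $\vert a \rangle$ and $\vert b \rangle$ (notation introduced here for illustration), the law of reciprocity follows from the symmetry of the squared inner product:

$$\mathrm{Prob}(B \vert A) = \vert \langle b \vert a \rangle \vert^{2} = \vert \langle a \vert b \rangle \vert^{2} = \mathrm{Prob}(A \vert B).$$

No such equality is required in CP theory, where $\mathrm{Prob}(B \vert A)$ and $\mathrm{Prob}(A \vert B)$ generally differ unless $\mathrm{Prob}(A) = \mathrm{Prob}(B)$.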

More generally, a fundamental constraint of QP theory concerns Gleason's theorem, namely that probabilities have to be associated with subspaces via the equation

$$\mathrm{Prob}(A \vert \psi) = \Vert P_{A} \vert \psi \rangle \Vert^{2}.$$

Finding that Gleason's theorem is psychologically implausible would rule out quantum models. A critic may wonder how one could test such general aspects of quantum theory. Recently, however, Atmanspacher and Römer (2012) were able to derive a test for a very general property of QP theory (related to Gleason's theorem). Specifically, they proposed that failures of commutativity between a conjunction and one of the constituent elements of the conjunction (i.e., A vs. A ∧ B) would preclude a Hilbert space representation for the corresponding problem. These are extremely general predictions and show the principled nature of QP theory approaches to cognitive modeling.

Even if at a broad level CP and QP theories are subject to analogous constraints, a critic may argue that it is still possible that QP models are more flexible (perhaps because of their form). Ultimately, the issue of relative flexibility is a technical one and can only be examined against particular models. So far, there has only been one such examination and, surprisingly, it concluded in favor of QP theory. Busemeyer et al. (2012) compared a quantum model with a traditional decision model (based on prospect theory) for a large data set, from an experiment by Barkan and Busemeyer (2003). The experiment involved choices between gambles, using a procedure similar to that used by Tversky and Shafir (1992) for testing the sure thing principle. The models were equated with respect to the number of free parameters. However, the models could still differ with respect to their complexity. Accordingly, Busemeyer et al. (2012) adopted a Bayesian procedure for model comparison, which evaluates models on the basis of both their accuracy and complexity. As Bayesian comparisons depend upon priors over model parameters, different priors were examined, including uniform and normal priors. For both priors, the Bayes factor favored the QP model over the traditional model (on average, by a factor of 2.07 for normal priors, and by a factor of 2.47 for uniform priors).

Overall, QP theory does generalize CP theory in certain ways. For example, it allows both for situations that are consistent with commutativity in conjunction (compatible questions) and situations that are not (incompatible questions). However, QP theory is also subject to constraints that do not have an equivalent in CP theory, such as double stochasticity and reciprocity, and there is currently no evidence that specific QP models are more flexible than CP ones. The empirical question then becomes: which set of general constraints is more psychologically relevant? We have argued that QP theory is ideally suited for modeling empirical results that depend upon order/context or appear to involve some kind of extreme dependence that rules out classical composition. QP theory was designed by physicists to capture analogous phenomena in the physical world. However, QP theory does not always succeed, and there have been situations in which the assumptions of CP models are more in line with empirical results (Busemeyer et al. 2006). Moreover, in some situations, the predictions from QP and CP models converge, and in such cases it is perhaps easier to employ CP models.

5. The rational mind

Beginning with Aristotle and up until recently, scholars have believed that humans are rational because they are capable of reasoning on the basis of logic. First, logic is associated with an abstract elegance and a strong sense of mathematical correctness. Second, logic was the only system for formal reasoning; therefore, scholars could not conceive of the possibility that reasoning could be guided by an alternative system. Logic is exactly this – logical – so how could there be an alternative system for rational reasoning? But this view turned out to be problematic. Considerable evidence accumulated that naïve observers do not typically reason with classical logic (Wason 1960); therefore, classical logic could not be maintained as a theory of thinking.

Oaksford and Chater (2007; 2009) made a compelling case against the psychological relevance of classical logic. The main problem is that classical logic is deductive, so that once a particular conclusion is reached from a set of premises, this conclusion is certain and cannot be altered by the addition of further premises. Of course, this is rarely true for everyday reasoning. The key aspect of everyday reasoning is its nonmonotonicity, as it is always possible to alter an existing conclusion with new evidence. Oaksford and Chater (2007; 2009) advocated a perspective of Bayesian rationality, which was partly justified using Anderson's (1990) rational analysis approach. According to rational analysis, psychologists should look for the behavior function that is optimal, given the goals of the cognitive agent and its environment. Oaksford and Chater's Bayesian rationality view has been a major contribution to the recent prominence of cognitive theories based on CP theory. For example, CP theories are often partly justified as rational theories of the corresponding cognitive problems, which makes them easier to promote than alternatives. In categorization, for instance, the rational model of categorization (e.g., Sanborn et al. 2010) has been called, well, “rational.” By contrast, the more successful Generalized Context Model (Nosofsky 1984) has received less corresponding justification (Wills & Pothos 2012).

There has been considerable theoretical effort to justify the rational status of CP theory. We can summarize the relevant arguments under three headings: Dutch book, long-term convergence, and optimality. The Dutch book argument concerns the long-term consistency of accepting bets. If probabilities are assigned to bets in a way that goes against the principles of CP theory, then this guarantees a net loss (or gain) across time. In other words, probabilistic assignment inconsistent with CP theory leads to unfair bets (de Finetti et al. 1993). Long-term convergence refers to the fact that if the true hypothesis has any degree of non-zero prior probability, then, in the long run, Bayesian inference will allow its identification. Finally, optimality is a key aspect of Anderson's (1990) rational analysis and concerns the accuracy of probabilistic inference. According to advocates of CP theory, this is the optimal way to harness the uncertainty in our environment and make accurate predictions regarding future events and relevant hypotheses.
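The Dutch book argument can be illustrated with a minimal Python sketch. The prices below are hypothetical, and the setup (an agent who treats its stated probability as a fair price for a bet paying one unit) is the textbook simplification rather than anything specified in this article.

```python
def net_outcome(price_a, price_not_a, stake=1.0):
    """Net result for an agent who buys, at its own stated prices, one bet that
    pays `stake` if A occurs and one bet that pays `stake` if A does not occur."""
    paid = (price_a + price_not_a) * stake
    received = stake  # exactly one of the two bets pays out, whatever happens
    return received - paid


# Incoherent assignment: Prob(A) + Prob(not A) = 1.2, violating CP theory.
incoherent_result = net_outcome(0.6, 0.6)  # about -0.2 in every possible world

# Coherent assignment: the two probabilities sum to one.
coherent_result = net_outcome(0.6, 0.4)    # 0.0, the pair of bets is fair

print(incoherent_result, coherent_result)
```

With the incoherent prices the agent loses about 0.2 regardless of whether A occurs. If the stated probabilities summed to less than one, the same guaranteed loss would arise when the agent sells the two bets instead of buying them.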

These justifications are not without problems. Avoiding a Dutch book requires expected value maximization rather than expected utility maximization; that is, the decision maker is constrained to use objective values, rather than personal utilities, when choosing between bets. However, decision theorists generally reject the assumption of objective value maximization and instead allow for subjective utility functions (Savage 1954). This is essential, for example, in order to take into account the observed risk aversion in human decisions (Kahneman & Tversky 1979). When maximizing subjective expected utility, CP reasoning can fall prey to Dutch book problems (Wakker 2010). Long-term convergence is also problematic, because if the true hypothesis has a prior probability of zero, it can never be identified. This is invariably the case in Bayesian models, as it is not possible to assign a non-zero probability to all candidate hypotheses. Overall, a priori arguments, such as the Dutch book or long-term convergence, are perhaps appealing under simple, idealized conditions. However, as soon as one starts taking into account the complexity of human cognition, such arguments break down.
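The point about zero priors can be sketched with sequential Bayesian updating over a small, hypothetical set of coin-bias hypotheses. The hypothesis set, the data sequence, and the function name `posterior` are illustrative assumptions only.

```python
def posterior(priors, biases, flips):
    """Sequential Bayesian updating over a finite set of coin-bias hypotheses."""
    beliefs = list(priors)
    for heads in flips:
        # Likelihood of this flip under each hypothesised bias.
        likelihoods = [b if heads else 1.0 - b for b in biases]
        unnormalized = [p * l for p, l in zip(beliefs, likelihoods)]
        total = sum(unnormalized)
        beliefs = [u / total for u in unnormalized]
    return beliefs

biases = [0.2, 0.5, 0.8]                        # candidate values of P(heads)
flips = [True, True, True, True, False] * 40    # 200 flips, 80% heads

# A non-zero prior on every hypothesis: the best-fitting bias (0.8) is identified.
open_minded = posterior([1 / 3, 1 / 3, 1 / 3], biases, flips)

# A zero prior on 0.8: no amount of data can raise its posterior above zero.
dogmatic = posterior([0.5, 0.5, 0.0], biases, flips)

print(open_minded)
print(dogmatic)
```

In the second case the posterior settles on the best hypothesis among those that were given non-zero prior mass, even though a better one exists.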

Perhaps the most significant a priori justification for the rationality of CP theory concerns optimality of predictions. If reasoning on the basis of CP theory is optimal, in the sense of predictive accuracy, then this seems to settle the case in favor of CP theory. For example, is it more accurate to consider Linda as just a bank teller, rather than as a bank teller and a feminist? By contrast, QP theory embodies a format for probabilistic inference which is strongly perspective and context dependent. For example, Linda may not seem like a bank teller initially, but from the perspective of feminism such a property becomes more plausible. However, optimality must be evaluated under the constraints and limited resources of the cognitive system (Simon 1955).

The main problem with classical optimality is that it assumes a measurable, objective reality and an omniscient observer. Our cognitive systems face the problem of making predictions for a vast number of variables that can take on a wide variety of values. For the cognitive agent to take advantage of classical optimality, it would have to construct an extremely large joint probability distribution to represent all these variables (this is the principle of unicity). But for complex possibilities, it is unclear where such information would come from. For example, in Tversky and Kahneman's (1983) experiment we are told about Linda, a person we have never heard of before. Classical theory would assume that this story generates a sample space for all possible characteristic combinations for Linda, including unfamiliar ones such as feminist bank teller. This just doesn't seem plausible, let alone practical, considering that for the bulk of available knowledge, we have no relevant experience. It is worth noting that Kolmogorov understood this limitation of CP theory (Busemeyer & Bruza 2012, Ch. 12). He pointed out that his axioms apply to a sample space from a single experiment and that different experiments require new sample spaces. But his admonitions were not formalized, and CP modelers do not take them into account.

Quantum theory assumes no measurable objective reality; rather judgment depends on context and perspective. The same predicate (e.g., that Linda is a bank teller) may appear plausible or not, depending upon the point of view (e.g., depending on whether we accept Linda as a feminist or not). Note that QP theory does assume systematic relations between different aspects of our knowledge, in terms of the angle (and relative dimensionality) between different subspaces. However, each inference changes the state vector, and, therefore, the perspective from which all other outcomes can be evaluated. Note also that context effects in QP theory are very different from conditional probabilities in CP theory. The latter are still assessed against a common sample space. With the former, the sample space for a set of incompatible outcomes changes every time an incompatible question is evaluated (as this changes the basis for evaluating the state).

If we cannot assume an objective reality and an omniscient cognitive agent, then perhaps the perspective-driven probabilistic evaluation in quantum theory is the best practical rational scheme. In other words, quantum inference is optimal when it is impossible to assign probabilities to all relevant possibilities and combinations concurrently. This conclusion resonates with Simon's (1955) influential idea of bounded rationality, according to which cognitive theory needs to incorporate assumptions about the computational burden which can be supported by the human brain. For example, classically, the problem of assessing whether Linda is a feminist and a bank teller requires the construction of a bivariate joint probability space, which assigns a probability density for each outcome regarding these questions. By contrast, a QP representation is simpler: it requires a univariate amplitude distribution for each question, and the two distributions can be related through a rotation. As additional questions are considered (e.g., whether Linda might be tall or short) the efficiency of the QP representation becomes more pronounced. Note that classical schemes could be simplified by assuming independence between particular outcomes. However, independence assumptions are not appropriate for many practical situations and will introduce errors in inference.
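The idea that two questions are two bases of one space, related by a rotation, can be sketched in a real two-dimensional space. The angles below are hypothetical values chosen only for illustration (they are not fitted to any data), and the sketch assumes that a conjunction is evaluated sequentially, with the more plausible question answered first.

```python
import math

def unit(angle_deg):
    """Unit vector in a real two-dimensional space at the given angle (degrees)."""
    a = math.radians(angle_deg)
    return (math.cos(a), math.sin(a))

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

# Hypothetical geometry: the bank teller basis is the feminist basis rotated by 60 degrees.
feminist = unit(0.0)
bank_teller = unit(60.0)

# Hypothetical state after reading the Linda story: close to the feminist ray
# and 85 degrees away from the bank teller ray.
linda = unit(-25.0)

# Single question: squared projection of the state onto the bank teller ray.
p_bank_teller = dot(bank_teller, linda) ** 2                       # about 0.008

# Sequential evaluation: project onto feminist first, then onto bank teller.
p_feminist = dot(feminist, linda) ** 2                             # about 0.82
p_feminist_then_bank_teller = p_feminist * dot(bank_teller, feminist) ** 2  # about 0.21

print(p_bank_teller, p_feminist_then_bank_teller)
```

Under these assumed angles the sequential conjunction is judged more probable than the single bank teller statement, because accepting "feminist" moves the state to a perspective from which "bank teller" is less remote. Only one state vector and one rotation angle are stored; no joint table over the two questions is needed.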

Note that the perspective dependence of probabilistic assessment in QP theory may seem to go against an intuition that “objective” (classical) probabilities are somehow more valid or correct. However, this same probabilistic scheme does lead to more accurate predictions in the physical world, in the context of quantum physics. If the physical world is not “objective” enough for CP theory to be used, there is a strong expectation that the mental world, with its qualities of flux and interdependence of thoughts, would not be either.

The application of QP theory to cognition implies a strong interdependence between thoughts, such that it is typically not possible to have one thought without repercussions for other thoughts. These intuitions were extensively elaborated in the work of Fodor (1983), with his proposals that thought is isotropic and Quinean, so that revising or introducing one piece of information can in principle impact on most other information in our knowledge base. Oaksford and Chater (2007; 2009) argued that it is exactly such characteristics of thought that make CP theory preferable to classical logic for cognitive modeling. However, Fodor's (1983) arguments also seem to go against the neat reductionism in CP theory, required by the principle of unicity and the law of total probability, according to which individual thoughts can be isolated from other, independent ones, and the degree of interdependence is moderated by the requirement to always have a joint probability between all possibilities. QP theory is not subject to these constraints.

Overall, accepting a view of rationality inconsistent with classical logic was a major achievement accomplished by CP researchers (e.g., Oaksford & Chater 2007; 2009). For example, how can it be that in the Wason selection task the “falsificationist” card choices are not the best ones? Likewise, accepting a view of rationality at odds with CP theory is the corresponding challenge for QP researchers. For example, how could it not be that Prob(A ∧ B)=Prob(B ∧ A)? The principles of CP theory have been accepted for so long that they are considered self-evident. However, one of our objectives in Section 3 was exactly to show how QP theory can lead to alternative, powerful intuitions about inference, intuitions that emphasize the perspective-dependence of any probabilistic conclusion. We conclude with an interesting analogy. Classical logic can be seen as a rational way of thinking, but only in idealized situations in which deductive inference is possible, that is, such that there are no violations of monotonicity. CP theory inference can also be seen as rational, but only in idealized situations in which the requirements from the principle of unicity match the capabilities of the observers (i.e., the possibilities that require probabilistic characterization are sufficiently limited). For the real, noisy, confusing, ever-changing, chaotic world, QP is the only system that works in physics and, we strongly suspect, in psychology as well.
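How Prob(A ∧ B) can differ from Prob(B ∧ A) for incompatible questions can be sketched with two sequential projections in a real two-dimensional space. The rays and the initial state below are hypothetical; the function names are illustrative only.

```python
import math

def ray(angle_deg):
    """Unit vector at the given angle (degrees)."""
    a = math.radians(angle_deg)
    return (math.cos(a), math.sin(a))

def project(target, vec):
    """Project vec onto the one-dimensional subspace spanned by the unit vector target."""
    amplitude = target[0] * vec[0] + target[1] * vec[1]
    return (amplitude * target[0], amplitude * target[1])

def squared_length(vec):
    return vec[0] ** 2 + vec[1] ** 2

def prob_sequence(state, first, second):
    """Probability of answering 'yes' to `first` and then 'yes' to `second`."""
    return squared_length(project(second, project(first, state)))

state = ray(10.0)   # hypothetical initial state
a = ray(0.0)        # ray for a 'yes' answer to question A
b = ray(45.0)       # ray for a 'yes' answer to question B

p_a_then_b = prob_sequence(state, a, b)   # about 0.48
p_b_then_a = prob_sequence(state, b, a)   # about 0.34

print(p_a_then_b, p_b_then_a)
```

The two orders give different values because the first projection changes the state from which the second question is evaluated. If the two rays coincided or were orthogonal, the two orders would give the same result.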

6. Concluding comments

6.1. Theoretical challenges

The results of Tversky, Kahneman, and colleagues (e.g., Tversky & Kahneman 1974) preclude a complete explanation of cognitive processes with CP theory. We have suggested that QP theory is the appropriate framework to employ for cases in which CP theory fails. QP and CP theories are closely related and, also, the kinds of models produced from CP and QP theories are analogous. Therefore, it could be proposed that, by using CP and QP theories together, a complete explanation of cognitive processes would emerge.

In exploring such a proposal, the first step should be to identify the precise boundary conditions between the applicability and failure of CP principles in cognitive modeling. In other words, there is no doubt that in some cases cognitive processing does rely on CP principles (perhaps the same can also be said for classical logic principles). The results of Tversky et al. (1974) reveal situations in which this reliance breaks down. A key theoretical challenge concerns understanding the commonalities between the experimental situations that lead to failures in the applicability of classical principles. For example, what exactly triggers the relevance of a quantum probabilistic reasoning approach in situations as diverse as the Linda problem, violations of the sure thing principle in prisoner's dilemma, and sets of questions concerning United States presidential candidates? A preliminary suggestion is that perhaps the more idealized the reasoning situation (e.g., is it cognitively feasible to apply the unicity principle?), the greater the psychological relevance of CP theory.

Another challenge concerns further understanding the rational properties of quantum inference. The discussion in Section 5 focused on the issue of accuracy, assuming that the requirements from the principle of unicity have to be relaxed. However, there is a further, potentially relevant literature on quantum information theory (Nielsen & Chuang 2000), which concerns the processing advantages of probabilistic inference based on QP theory. For example, a famous result by Grover (1997) shows how a quantum search algorithm will outperform any classical algorithm when searching an unstructured collection of items. The potential psychological relevance of such results (e.g., in categorization theory) is an issue for much further work (e.g., is it possible to approximate quantum information algorithms in the brain?). These are exciting possibilities regarding both the rational basis of quantum cognitive models and the general applicability of quantum theory to cognitive theory.

6.2. Empirical challenges

So far, the quantum program has involved employing quantum computational principles to explain certain prominent empirical findings. Such quantum models do not simply redescribe results that already have (some) compelling explanation. Rather, we discussed results that have presented ongoing challenges and have resisted explanation based on classical principles. One objective for future work is to continue identifying empirical situations that are problematic from a classical perspective.

Another objective is to look for new, surprising predictions, which take advantage of the unique properties of quantum theory, such as superposition, incompatibility, and entanglement. For example, Trueblood and Busemeyer (2011) developed a model to accommodate order effects in the assessment of evidence in McKenzie et al.'s (2002) task. The model successfully described data from both the original conditions and a series of relevant extensions. Moreover, Wang and Busemeyer (in press) identified several types of order effects that can occur in questionnaires, such as consistency and contrast (Moore 2002). Their quantum model was able to make quantitative, parameter-free predictions for these order effects. In perception, Atmanspacher and Filk (2010) proposed an experimental paradigm for bistable perception, so as to test the predictions from their quantum model regarding violations of the temporal Bell inequality (such violations are tests of the existence of superposition states).

Overall, understanding the quantum formalism to the extent that surprising, novel predictions for cognition can be generated is no simple task (in physics, this was a process that took several decades). The current encouraging results are a source of optimism.

6.3. Implications for brain neurophysiology

An unresolved issue is how QP computations are implemented in the brain. We have avoided a detailed discussion of this research area because, although exciting, it is still in its infancy. One perspective is that the brain does not instantiate any quantum computation at all. Rather, interference effects in the brain can occur if neuronal membrane potentials have wave-like properties, a view that has been supported in terms of the characteristics of electroencephalographic (EEG) signals (de Barros & Suppes 2009). Relatedly, Ricciardi and Umezawa (1967), Jibu and Yasue (1995), and Vitiello (1995) developed a quantum field theory model of human memory, which still allows a classical description of brain activity. The most controversial (Atmanspacher 2004; Litt et al. 2006) perspective is that the brain directly supports quantum computations. For quantum computation to occur, a system must be isolated from the environment, as environmental interactions cause quantum superposition states to rapidly decohere into classical states. Penrose (1989) and Hameroff (1998) suggested that microtubules prevent decoherence for periods of time long enough to enable meaningful quantum computation; in this view, the collapse of superposition states is associated with experiences of consciousness.

Overall, in cognitive science it has been standard to initially focus on identifying the mathematical principles underlying cognition, and later address the issue of how the brain can support the postulated computations. However, researchers have been increasingly seeking bridges between computational and neuroscience models. Regarding the QP cognitive program, this is clearly an important direction for future research.

6.4. The future of QP theory in psychology

There is little doubt that extensive further work is essential before all aspects of QP theory can acquire psychological meaning. But this does not imply that current QP models are not satisfactory. In fact, we argue that the quantum approach to cognition embodies all the characteristics of good cognitive theory: it is based on a coherent set of formal principles, the formal principles are related to specific assumptions about psychological process (e.g., the existence of order/context effects in judgment), and it leads to quantitative computational models that can parsimoniously account for both old and new empirical data. The form of quantum cognitive theories is very much like that of CP ones, and the latter have been hugely influential in recent cognitive science. The purpose of this article is to argue that researchers attracted to probabilistic cognitive models need not be restricted to classical theory. Rather, quantum theory provides many theoretical and practical advantages, and its applicability to psychological explanation should be further considered.

Appendix

An elaboration of some of the basic definitions in QP theory

(See Busemeyer & Bruza 2012 for an extensive introduction.)

Projectors (or projection operators)

Projectors are idempotent linear operators. For a one-dimensional subspace, corresponding, for example, to the |happy〉 ray, the projector is a simple outer product, P_happy = |happy〉〈happy|. Note that |happy〉 corresponds to a column vector and 〈happy| denotes the corresponding row vector (with conjugation).

Given the above subspace for “happy,” the probability that a person is happy is given by

$$\Vert P_{happy} \vert \psi \rangle \Vert^{2} = \Vert\, \vert happy \rangle \langle happy \vert \psi \rangle \Vert^{2}.$$

In this expression, 〈happy|ψ〉 is the standard dot (inner) product and |happy〉 is a unit length vector. Therefore,

$$\Vert P_{happy} \vert \psi \rangle \Vert^{2} = \vert \langle happy \vert \psi \rangle \vert^{2}.$$

The double lines on the left hand side of this equation denote the length of the vector, so that ||P_happy|ψ〉||² is the length of the vector squared. The single lines on the right hand side denote the modulus (magnitude) of a complex number. Therefore, in a real space, we can simply write

$$\Vert P_{happy} \vert \psi \rangle \Vert^{2} = \lpar \langle happy \vert \psi \rangle\rpar ^{2}.$$
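The definitions above can be checked numerically with a minimal sketch in a two-dimensional complex space. The coordinates chosen for |happy〉 and |ψ〉 are hypothetical unit vectors, and the helper names are illustrative only.

```python
def outer(ket):
    """Projector |ket><ket| as a nested list; the bra is the conjugated ket."""
    return [[x * y.conjugate() for y in ket] for x in ket]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def inner(bra, ket):
    """<bra|ket>, conjugating the first argument."""
    return sum(x.conjugate() * y for x, y in zip(bra, ket))

def norm_sq(v):
    return sum(abs(x) ** 2 for x in v)

happy = [complex(0.6), complex(0.8)]            # unit length: 0.36 + 0.64 = 1
psi = [complex(0.5, 0.5), complex(0.5, -0.5)]   # unit length: 4 * 0.25 = 1

P_happy = outer(happy)

# Idempotence: applying the projector twice is the same as applying it once.
P_squared = matmul(P_happy, P_happy)
is_idempotent = all(
    abs(P_squared[i][j] - P_happy[i][j]) < 1e-12 for i in range(2) for j in range(2)
)

# The two expressions for the probability of 'happy' agree.
prob_via_projector = norm_sq(matvec(P_happy, psi))    # ||P_happy |psi>||^2
prob_via_amplitude = abs(inner(happy, psi)) ** 2      # |<happy|psi>|^2

print(is_idempotent, prob_via_projector, prob_via_amplitude)
```

For these vectors the amplitude 〈happy|ψ〉 is 0.7 − 0.1i, so both expressions evaluate to 0.5 (up to floating-point rounding).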

Composite systems

Two subspaces can be combined into a composite space in two ways: one way is by forming a tensor product space (as in Figure 1b) and the other way is by forming a space from a direct sum. First consider the formation of a tensor product space. For example, suppose |happy〉, |~happy〉 are two basis vectors that span the subspace H, representing the possibility of happiness, and suppose |employed〉, |~employed〉 are two basis vectors that span the subspace E, representing the possibility of employment. Then, the tensor product space equals the span of the four basis vectors formed by the tensor products

$$\eqalign{& \lcub \vert happy\rangle \otimes \vert employed\rangle\comma \; \vert happy\rangle \otimes \vert {\sim} employed\rangle\comma \; \cr & \vert {\sim} happy\rangle \otimes \vert employed\rangle\comma \; \vert {\sim} happy\rangle \otimes \vert {\sim} employed\rangle \rcub .}$$

Next consider the formation of a space by direct sum. For example, suppose the subspace E is spanned by the basis vectors

$$\lcub \vert happy\rangle \otimes \vert employed\rangle\comma \; \vert {\sim} happy\rangle \otimes \vert employed\rangle\rcub$$

and suppose ~E is the subspace spanned by the basis vectors

$$\lcub \vert happy\rangle \otimes \vert {\sim} employed \rangle\comma \; \vert {\sim} happy\rangle \otimes \vert {\sim} employed\rangle\rcub .$$

Then the direct sum space is formed by all possible pairs of vectors, one from E and another from ~E.
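Both constructions can be sketched with coordinate vectors. The sketch assumes that |happy〉, |~happy〉 and |employed〉, |~employed〉 are represented by the standard basis of a two-dimensional space; the state used at the end is a hypothetical example, and the helper names are illustrative.

```python
def kron(u, v):
    """Tensor (Kronecker) product of two coordinate vectors."""
    return [a * b for a in u for b in v]

def component_in(subspace_basis, vec):
    """Orthogonal projection of vec onto the span of orthonormal basis vectors."""
    out = [0.0] * len(vec)
    for b in subspace_basis:
        amplitude = sum(x * y for x, y in zip(b, vec))
        out = [o + amplitude * x for o, x in zip(out, b)]
    return out

happy, not_happy = [1, 0], [0, 1]
employed, not_employed = [1, 0], [0, 1]

# Tensor product space: four basis vectors, one per combination of answers.
basis = {
    ("happy", "employed"): kron(happy, employed),
    ("happy", "~employed"): kron(happy, not_employed),
    ("~happy", "employed"): kron(not_happy, employed),
    ("~happy", "~employed"): kron(not_happy, not_employed),
}

# Direct sum: the same four-dimensional space split into E and ~E.
E = [basis[("happy", "employed")], basis[("~happy", "employed")]]
not_E = [basis[("happy", "~employed")], basis[("~happy", "~employed")]]

state = [0.5, 0.5, 0.5, 0.5]            # hypothetical unit-length state
part_E = component_in(E, state)          # the component lying in E
part_not_E = component_in(not_E, state)  # the component lying in ~E
recombined = [x + y for x, y in zip(part_E, part_not_E)]

print(basis)
print(part_E, part_not_E, recombined)
```

Every vector of the composite space is recovered as the sum of one vector from E and one from ~E, which is what the direct sum decomposition expresses.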

Time dependence

The quantum state vector changes over time according to Schrödinger's equation,

$$\displaystyle{d \over {dt}}\vert \psi \lpar t\rpar \rangle=- i \cdot H \cdot \vert \psi \lpar t\rpar \rangle$$

where H is a Hermitian linear operator. This is the QP theory equivalent of the Kolmogorov forward equation for Markov models in CP theory. The solution to Schrödinger's equation equals

$$\psi_{2}\lpar t\rpar = e^{-i\cdot t\cdot H} \cdot \psi_1 =U\lpar t\rpar \cdot \psi_1$$

where H is a Hermitian operator and U(t) is a unitary one (note that i² = −1). The two (obviously related) operators H and U(t) contain all the information about the dynamical aspects of a system. The key property of U(t) is that it preserves lengths, so that

$$\langle U\lpar t\rpar \cdot \psi \vert U\lpar t\rpar \cdot \Phi \rangle = \langle \psi \vert \Phi \rangle.$$

Thus, the effect of U(t) on a state vector is to rotate it in a way that captures some dynamical aspect of the situation of interest.
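The length-preserving property can be checked with a minimal sketch. It assumes a hypothetical two-dimensional Hamiltonian H = [[0, 1], [1, 0]], chosen because its exponential has a simple closed form; the vectors and the time value are likewise illustrative.

```python
import math

def unitary(t):
    """U(t) = exp(-i t H) for the hypothetical Hamiltonian H = [[0, 1], [1, 0]].

    Because H times H is the identity, the exponential series sums to
    cos(t) * I - i * sin(t) * H.
    """
    c = math.cos(t)
    s = -1j * math.sin(t)
    return [[c, s], [s, c]]

def apply(matrix, vec):
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]

def inner(u, v):
    """<u|v>, with complex conjugation of the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

psi = [1.0, 0.0]
phi = [0.6, 0.8j]
t = 0.7

before = inner(psi, phi)
after = inner(apply(unitary(t), psi), apply(unitary(t), phi))

# The state rotates: the probability of the first basis outcome changes with t,
# while the total length of the state stays equal to 1.
evolved = apply(unitary(t), psi)
prob_first_outcome = abs(evolved[0]) ** 2      # equals cos(t) ** 2
total_length_sq = sum(abs(x) ** 2 for x in evolved)

print(before, after, prob_first_outcome, total_length_sq)
```

Inner products, and therefore lengths and angles between states, are unchanged by U(t), even though the probabilities of individual outcomes evolve over time.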

An example of how interference can arise in QP theory

Consider a situation whereby a person tries to assess whether she is happy or not, depending upon whether she is employed or not. We can write

$$\vert Initial\ state \rangle = {1 \over \sqrt{2}} \vert \psi_{employed} \rangle + {1 \over \sqrt{2}} \vert \psi_{\sim employed} \rangle$$

(this corresponds to a direct sum decomposition). Assume that this initial state develops in time with a unitary matrix, U(t), which could correspond to the thought process of weighting the implications of being and not being employed for happiness (Pothos & Busemeyer 2009; Trueblood & Busemeyer 2011), so we end up with

$$\vert final\ state\rangle = U\lpar t\rpar \cdot {1 \over \sqrt{2}}\vert \psi_{employed} \rangle + U\lpar t\rpar \cdot {1 \over \sqrt{2}}\vert \psi_{\sim employed} \rangle.$$

Note that so far the situation is identical to what we would have had if we were applying a CP theory Markov model.

The difference between the Markov CP model and the QP one is in how probabilities are computed from the time-evolved state. Consider the probability for being happy; this can be extracted from a state by applying a projection operator M that picks out the coordinates for being happy. In the QP case,

$$\eqalign{&Prob\lpar happy\comma \; unknown\ employment\rpar \cr &\quad= \Vert M \cdot U\lpar t\rpar \cdot {1 \over \sqrt{2}}\vert \psi_{employed} \rangle + M \cdot U\lpar t\rpar \cdot {1 \over \sqrt{2}}\vert \psi_{\sim employed} \rangle \Vert^2}$$

That is, as has been discussed, probabilities are computed from amplitudes through a squaring operation. This nonlinearity in QP theory can lead to interference terms that produce violations of the law of total probability. Specifically,

$$\eqalign{&Prob\lpar happy\comma \; unknown\ employment\rpar \cr &= Prob\lpar happy \wedge employed\rpar + Prob\lpar happy \wedge \sim\! employed\rpar \cr & \quad+ Interference\ terms.}$$

As the interference terms can be negative, the law of total probability can be violated.

Suppose next that the person is determined to find out whether she will be employed or not, before having this inner reflection about happiness (perhaps she intends to delay thinking about happiness until after her professional review). Then, the state after learning about her employment will be either |ψ_employed〉 or |ψ_~employed〉, and what will evolve is one of these two states. Therefore, for example,

$$Prob\lpar happy \vert employed\rpar = \Vert M\cdot U\lpar t\rpar \cdot \vert \psi_{employed} \rangle \Vert^{2}$$

and

$$Prob\lpar happy \vert {\sim} employed\rpar = \Vert M\cdot U\lpar t\rpar \cdot \vert \psi_{\sim employed} \rangle \Vert^{2}.$$

Overall, in this case,

$$\eqalign{&Prob\lpar happy\comma \; unknown\ employment\rpar \cr &= \Vert M\cdot U\lpar t\rpar \cdot \vert \psi_{employed} \rangle \Vert^{2}\cdot Prob\lpar employed\rpar \cr &\quad + \Vert M \cdot U\lpar t\rpar \cdot \vert \psi_{\sim employed} \rangle \Vert^{2} \cdot Prob\lpar \sim\! employed\rpar .}$$

It should be clear that in such a case there are no interference terms and the quantum result converges to the classical one. Note that the “quantum” reasoner is still uncertain about whether she will be employed. The crucial difference is that in this case she knows she will resolve the uncertainty regarding employment, before her inner reflection. Therefore, regardless of the outcome regarding employment, the evolved state will be a state that is not a superposition one.
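The two cases in this example can be compared numerically with a minimal sketch. It assumes a four-dimensional real space with coordinates ordered as [happy and employed, happy and ~employed, ~happy and employed, ~happy and ~employed], and a hypothetical unitary that rotates amplitude between "happy and employed" and "~happy and ~employed". This choice of U(t) is purely illustrative and is not the model of Pothos and Busemeyer (2009).

```python
import math

def evolve(vec, theta):
    """Hypothetical U: rotation by theta in the plane spanned by
    |happy, employed> (index 0) and |~happy, ~employed> (index 3)."""
    c, s = math.cos(theta), math.sin(theta)
    return [c * vec[0] - s * vec[3], vec[1], vec[2], s * vec[0] + c * vec[3]]

def project_happy(vec):
    """Projector M: keep the two 'happy' coordinates, zero the others."""
    return [vec[0], vec[1], 0.0, 0.0]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

r = 1 / math.sqrt(2)
psi_employed = [r, 0.0, r, 0.0]        # lies entirely in the subspace E
psi_not_employed = [0.0, r, 0.0, r]    # lies entirely in the subspace ~E
theta = math.pi / 4

# Case 1: employment unknown and unresolved; the superposition evolves as a whole.
branch_e = [r * x for x in project_happy(evolve(psi_employed, theta))]
branch_not_e = [r * x for x in project_happy(evolve(psi_not_employed, theta))]
p_happy_and_employed = dot(branch_e, branch_e)
p_happy_and_not_employed = dot(branch_not_e, branch_not_e)
interference = 2 * dot(branch_e, branch_not_e)
summed = [x + y for x, y in zip(branch_e, branch_not_e)]
p_happy_superposed = dot(summed, summed)

# Case 2: employment is resolved first; each state evolves separately and the
# results are weighted by Prob(employed) = Prob(~employed) = 1/2.
after_e = project_happy(evolve(psi_employed, theta))
after_not_e = project_happy(evolve(psi_not_employed, theta))
p_happy_resolved = 0.5 * dot(after_e, after_e) + 0.5 * dot(after_not_e, after_not_e)

print(p_happy_superposed, p_happy_resolved, interference)
```

Under these assumptions the resolved case gives 0.5, in agreement with the law of total probability, whereas the superposed case gives 0.25; the difference of −0.25 is exactly the interference term.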

ACKNOWLEDGMENTS

We are grateful to Diederik Aerts, Harald Atmanspacher, Thomas Filk, James Hampton, Mike Oaksford, Steven Sloman, Jennifer Trueblood, and Christoph Weidemann for their helpful comments. Jerome Busemeyer was supported by the grant NSF ECCS-1002188.

References

Aerts, D. (2009) Quantum structure in cognition. Journal of Mathematical Psychology 53:314–48.

Aerts, D. & Aerts, S. (1995) Applications of quantum statistics in psychological studies of decision processes. Foundations of Science 1:85–97.

Aerts, D. & Gabora, L. (2005b) A theory of concepts and their combinations II: A Hilbert space representation. Kybernetes 34:192–221.

Aerts, D. & Sozzo, S. (2011b) Quantum structure in cognition: Why and how concepts are entangled. In: Proceedings of the Quantum Interaction Conference, pp. 118–29. Springer.

Anderson, J. R. (1990) The adaptive character of thought. Erlbaum.

Anderson, J. R. (1991) The adaptive nature of human categorization. Psychological Review 98:409–29.

Anderson, N. (1971) Integration theory and attitude change. Psychological Review 78:171–206.

Ashby, F. G. & Perrin, N. A. (1988) Towards a unified theory of similarity and recognition. Psychological Review 95:124–50.

Atmanspacher, H. (2004) Quantum theory and consciousness: An overview with selected examples. Discrete Dynamics 8:51–73.

Atmanspacher, H. & Filk, T. (2010) A proposed test of temporal nonlocality in bistable perception. Journal of Mathematical Psychology 54:314–21.

Atmanspacher, H., Filk, T. & Römer, H. (2004) Quantum Zeno features of bistable perception. Biological Cybernetics 90:33–40.

Atmanspacher, H. & Römer, H. (2012) Order effects in sequential measurements of non-commuting psychological observables. Journal of Mathematical Psychology 56:274–80.

Atmanspacher, H., Römer, H. & Walach, H. (2002) Weak quantum theory: Complementarity and entanglement in physics and beyond. Foundations of Physics 32:379–406.

Baaquie, B. E. (2004) Quantum finance: Path integrals and Hamiltonians for options and interest rates. Cambridge University Press.

Bar-Hillel, M. & Neter, E. (1993) How alike is it versus how likely is it: A disjunction fallacy in probability judgments. Journal of Personality and Social Psychology 65:1119–31.

Barkan, R. & Busemeyer, J. R. (2003) Modeling dynamic inconsistency with a changing reference point. Journal of Behavioral Decision Making 16:235–55.

Bergus, G. R., Chapman, G. B., Levy, B. T., Ely, J. W. & Oppliger, R. A. (1998) Clinical diagnosis and order information. Medical Decision Making 18:412–17.

Birnbaum, M. H. (2008) New paradoxes of risky decision making. Psychological Review 115:463–501.

Blutner, R. (2009) Concepts and bounded rationality: An application of Niestegge's approach to conditional quantum probabilities. In: Foundations of probability and physics-5, ed. Acardi, L. E. A., Adenier, G., Fuchs, C., Jaeger, G., Khrennikov, A. Y., Larsson, J.-Å. & Stenholm, S., pp. 302–10. American Institute of Physics Conference Proceedings.

Bordley, R. F. (1998) Quantum mechanical and human violations of compound probability principles: Toward a generalized Heisenberg uncertainty principle. Operations Research 46:923–26.

Brainerd, C. J. & Reyna, V. F. (2008) Episodic over-distribution: A signature effect of familiarity without recognition. Journal of Memory & Language 58:765–86.

Brainerd, C. J., Reyna, V. F. & Ceci, S. J. (2008) Developmental reversals in false memory: A review of data and theory. Psychological Bulletin 134:343–82.

Brainerd, C. J., Reyna, V. F. & Mojardin, A. H. (1999) Conjoint recognition. Psychological Review 106:160–79.

Bruza, P. D., Kitto, K., Nelson, D. & McEvoy, C. L. (2009) Is there something quantum-like about the human mental lexicon? Journal of Mathematical Psychology 53:362–77.

Busemeyer, J. R. & Bruza, P. D. (2012) Quantum models of cognition and decision. Cambridge University Press.

Busemeyer, J. R., Matthew, M. & Wang, Z. A. (2006a) Quantum game theory explanation of disjunction effects. In: Proceedings of the 28th Annual Conference of the Cognitive Science Society, ed. Sun, R. & Miyake, N., pp. 131–35. Erlbaum.

Busemeyer, J. R., Pothos, E. M., Franco, R. & Trueblood, J. S. (2011) A quantum theoretical explanation for probability judgment errors. Psychological Review 118(2):193–218.

Busemeyer, J. R., Wang, J. & Shiffrin, R. M. (2012) Bayesian model comparison of quantum versus traditional models of decision making for explaining violations of the dynamic consistency principle. Paper presented at Foundations and Applications of Utility, Risk and Decision Theory, Atlanta, Georgia.

Busemeyer, J. R., Wang, Z. & Lambert-Mogiliansky, A. (2009) Comparison of Markov and quantum models of decision making. Journal of Mathematical Psychology 53:423–33.

Busemeyer, J. R., Wang, Z. & Townsend, J. T. (2006) Quantum dynamics of human decision-making. Journal of Mathematical Psychology 50:220–41.

Carlson, B. W. & Yates, J. F. (1989) Disjunction errors in qualitative likelihood judgment. Organizational Behavior and Human Decision Processes 44:368–79.

Conte, E., Khrennikov, A. Y., Todarello, O., Federici, A., Mendolicchio, L. & Zbilut, J. P. (2009) Mental states follow quantum mechanics during perception and cognition of ambiguous figures. Open Systems and Information Dynamics 16:1–17.

Croson, R. (1999) The disjunction effect and reason-based choice in games. Organizational Behavior and Human Decision Processes 80:118–33.

de Barros, J. A. & Suppes, P. (2009) Quantum mechanics, interference, and the brain. Journal of Mathematical Psychology 53:306–13.

de Finetti, B., Machi, A. & Smith, A. (1993) Theory of probability: A critical introductory treatment. Wiley.

Feldman, J. M. & Lynch, J. G. (1988) Self-generated validity and other effects of measurement on belief, attitude, intention, and behavior. Journal of Applied Psychology 73:421–35.

Festinger, L. (1957) A theory of cognitive dissonance. Stanford University Press.

Fine, A. (1982) Joint distributions, quantum correlations, and commuting observables. Journal of Mathematical Physics 23:1306–10.

Fodor, J. A. (1983) The modularity of mind. The MIT Press.

Gavanski, I. & Roskos-Ewoldsen, D. R. (1991) Representativeness and conjoint probability. Journal of Personality and Social Psychology 61:181–94.

Gigerenzer, G. & Todd, P. M. (1999) Simple heuristics that make us smart. Oxford University Press.

Goldstone, R. L. (1994) Similarity, interactive activation, and mapping. Journal of Experimental Psychology: Learning, Memory and Cognition 20:3–28.

Griffiths, R. B. (2003) Consistent quantum theory. Cambridge University Press.

Griffiths, T. L., Chater, N., Kemp, C., Perfors, A. & Tenenbaum, J. B. (2010) Probabilistic models of cognition: Exploring representations and inductive biases. Trends in Cognitive Sciences 14:357–64.

Grover, L. K. (1997) Quantum mechanics helps in searching for a needle in a haystack. Physical Review Letters 79:325–28.

Hahn, U., Chater, N. & Richardson, L. B. (2003) Similarity as transformation. Cognition 87:1–32.

Hameroff, S. R. (1998) Quantum computation in brain microtubules? The Penrose-Hameroff “orch-or” model of consciousness. Philosophical Transactions of the Royal Society A 356:1869–96.

Hameroff, S. R. (2007) The brain is both neurocomputer and quantum computer. Cognitive Science 31:1035–45.

Hampton, J. A. (1988a) Disjunction of natural concepts. Memory & Cognition 16:579–91.

Hampton, J. A. (1988b) Overextension of conjunctive concepts: Evidence for a unitary model for concept typicality and class inclusion. Journal of Experimental Psychology: Learning, Memory, and Cognition 14:12–32.

Hogarth, R. M. & Einhorn, H. J. (1992) Order effects in belief updating: The belief-adjustment model. Cognitive Psychology 24:1–55.

Hughes, R. I. G. (1989) The structure and interpretation of quantum mechanics. Harvard University Press.

Isham, C. J. (1989) Lectures on quantum theory. World Scientific.

Jibu, M. & Yasue, K. (1995) Quantum brain dynamics and consciousness. Benjamins.

Johnson, E. J., Häubl, G. & Keinan, A. (2007) Aspects of endowment: A query theory of value construction. Journal of Experimental Psychology: Learning, Memory, and Cognition 33(3):461–73.

Jones, M. & Love, B. C. (2011) Bayesian fundamentalism or enlightenment? On the explanatory status and theoretical contributions of Bayesian models of cognition. Behavioral and Brain Sciences 34:169–231.

Kahneman, D., Slovic, P. & Tversky, A. (1982) Judgment under uncertainty: Heuristics and biases. Cambridge University Press.

Kahneman, D. & Tversky, A. (1979) Prospect theory: An analysis of decision under risk. Econometrica 47:263–91.

Khrennikov, A. Y. (2010) Ubiquitous quantum structure: From psychology to finance. Springer.

Kolmogorov, A. N. (1933/1950) Foundations of the theory of probability. Chelsea Publishing Co.

Krueger, J. I., DiDonato, T. E. & Freestone, D. (2012) Social projection can solve social dilemmas. Psychological Inquiry 23:1–27.

Krumhansl, C. L. (1978) Concerning the applicability of geometric models to similarity data: The interrelationship between similarity and spatial density. Psychological Review 85:445–63.

Lambert-Mogiliansky, A., Zamir, S. & Zwirn, H. (2009) Type indeterminacy: A model of the KT(Kahneman–Tversky)-man. Journal of Mathematical Psychology 53(5):349–61.

Li, S. & Taplin, J. (2002) Examining whether there is a disjunction effect in prisoner's dilemma games. Chinese Journal of Psychology 44:25–46.

Litt, A., Eliasmith, C., Kroon, F. W., Weinstein, S. & Thagard, P. (2006) Is the brain a quantum computer? Cognitive Science 30:593–603.

Markman, A. B. & Gentner, D. (1993) Splitting the differences: A structural alignment view of similarity. Journal of Memory and Language 32:517–35.

Marr, D. (1982) Vision: A computational investigation into the human representation and processing of visual information. W. H. Freeman.

McKenzie, C. R. M., Lee, S. M. & Chen, K. K. (2002) When negative evidence increases confidence: Change in belief after hearing two sides of a dispute. Journal of Behavioral Decision Making 15:1–18.

Moore, D. W. (2002) Measuring new types of question-order effects. Public Opinion Quarterly 66:80–91.

Nielsen, M. A. & Chuang, I. L. (2000) Quantum computation and quantum information. Cambridge University Press.

Nosofsky, R. M. (1984) Choice, similarity, and the context theory of classification. Journal of Experimental Psychology: Learning, Memory, and Cognition 10:104–14.

Oaksford, M. & Chater, N. (2007) Bayesian rationality: The probabilistic approach to human reasoning. Oxford University Press.

Oaksford, M. & Chater, N. (2009) Précis of Bayesian rationality: The probabilistic approach to human reasoning. Behavioral and Brain Sciences 32:69–120.

Penrose, R. (1989) The emperor's new mind. Oxford University Press.

Perfors, A., Tenenbaum, J. B., Griffiths, T. L. & Xu, F. (2011) A tutorial introduction to Bayesian models of cognitive development. Cognition 120:302–21.

Pothos, E. M. & Busemeyer, J. R. (2009) A quantum probability explanation for violations of “rational” decision theory. Proceedings of the Royal Society B 276:2171–78.

Pothos, E. M. & Busemeyer, J. R. (2011) A quantum probability explanation for violations of symmetry in similarity judgments. In: Proceedings of the 32nd Annual Conference of the Cognitive Science Society, pp. 2848–54. LEA.

Rédei, M. & Summers, S. J. (2007) Quantum probability theory. Studies in the History and Philosophy of Modern Physics 38:390–417.

Reyna, V. F. (2008) A theory of medical decision making and health: Fuzzy trace theory. Medical Decision Making 28:850–65.

Reyna, V. F. & Brainerd, C. J. (1995) Fuzzy-trace theory: An interim synthesis. Learning and Individual Differences 7:1–75.

Ricciardi, L. M. & Umezawa, H. (1967) Brain and physics of many bodied problems. Kybernetik 4:44–48.

Sanborn, A. N., Griffiths, T. L. & Navarro, D. J. (2010) Rational approximations to rational models: Alternative algorithms for category learning. Psychological Review 117:1144–67.

Savage, L. (1954) The foundations of statistics. Wiley.

Schuman, H. & Presser, S. (1981) Questions and answers in attitude surveys: Experiments on question form, wording, and content. Academic Press.

Schwarz, N. (2007) Attitude construction: Evaluation in context. Social Cognition 25:638–56.

Shafer, G. & Tversky, A. (1985) Languages and designs for probability judgment. Cognitive Science 9:309–39.

Shafir, E. & Tversky, A. (1992) Thinking through uncertainty: Nonconsequential reasoning and choice. Cognitive Psychology 24:449–74.

Shanteau, J. C. (1970) An additive model for sequential decision making. Journal of Experimental Psychology 85:181–91.

Sher, S. & McKenzie, C. R. M. (2008) Framing effects and rationality. In: The probabilistic mind: Prospects for Bayesian cognitive science, ed. Chater, N. & Oaksford, M., pp. 79–96. Oxford University Press.

Sides, A., Osherson, D., Bonini, N. & Viale, R. (2002) On the reality of the conjunction fallacy. Memory & Cognition 30:191–98.

Simon, H. A. (1955) A behavioral model of rational choice. The Quarterly Journal of Economics 69:99–118.

Sloman, S. A. (1993) Feature-based induction. Cognitive Psychology 25:231–80.

Smolensky, P. (1990) Tensor product variable binding and the representation of symbolic structures in connectionist networks. Artificial Intelligence 46:159–216.

Stolarz-Fantino, S., Fantino, E., Zizzo, D. J. & Wen, J. (2003) The conjunction effect: New evidence for robustness. American Journal of Psychology 116(1):15–34.

Tenenbaum, J. B. & Griffiths, T. L. (2001) The rational basis of representativeness. In: Proceedings of the 23rd Annual Conference of the Cognitive Science Society, pp. 1036–41.

Tenenbaum, J. B., Kemp, C., Griffiths, T. L. & Goodman, N. (2011) How to grow a mind: Statistics, structure, and abstraction. Science 331:1279–85.

Tentori, K. & Crupi, V. (2012) On the conjunction fallacy and the meaning of and, yet again: A reply to Hertwig, Benz, and Krauss (2008). Cognition 122:123–34.

Tourangeau, R., Rips, L. J. & Rasinski, K. A. (2000) The psychology of survey response. Cambridge University Press.

Townsend, J. T., Silva, K. M., Spencer-Smith, J. & Wenger, M. (2000) Exploring the relations between categorization and decision making with regard to realistic face stimuli. Pragmatics and Cognition 8:83–105.

Trueblood, J. S. & Busemeyer, J. R. (2011) A comparison of the belief-adjustment model and the quantum inference model as explanations of order effects in human inference. Cognitive Science 35(8):1518–52.

Tversky, A. (1977) Features of similarity. Psychological Review 84(4):327–52.

Tversky, A. & Kahneman, D. (1973) Availability: A heuristic for judging frequency and probability. Cognitive Psychology 5:207–32.

Tversky, A. & Kahneman, D. (1974) Judgment under uncertainty: Heuristics and biases. Science 185:1124–31.

Tversky, A. & Kahneman, D. (1983) Extensional versus intuitive reasoning: The conjunction fallacy in probability judgment. Psychological Review 90(4):293–315.

Tversky, A. & Koehler, D. J. (1994) Support theory: A nonextensional representation of subjective probability. Psychological Review 101:547–67.

Tversky, A. & Shafir, E. (1992) The disjunction effect in choice under uncertainty. Psychological Science 3:305–309.

Vitiello, G. (1995) Dissipation and memory capacity in the quantum brain model. International Journal of Modern Physics B 9:973–89.

Wakker, P. P. (2010) Prospect theory for risk and ambiguity. Cambridge University Press.

Walker, L., Thibaut, J. & Andreoli, V. (1972) Order of presentation at trial. Yale Law Journal 82:216–26.

Wang, Z. & Busemeyer, J. R. (in press) A quantum question order model supported by empirical tests of an a priori and precise prediction. Topics in Cognitive Science.

Wang, Z. J., Busemeyer, J. R., Atmanspacher, H. & Pothos, E. M. (in press) The potential for using quantum theory to build models of cognition. Topics in Cognitive Science.

Wason, P. C. (1960) On the failure to eliminate hypotheses in a conceptual task. Quarterly Journal of Experimental Psychology 12:129–40.

Wedell, D. H. & Moro, R. (2008) Testing boundary conditions for the conjunction fallacy: Effects of response mode, conceptual focus, and problem type. Cognition 107:105–36.

Wills, A. J. & Pothos, E. M. (2012) On the adequacy of current empirical evaluations of formal models of categorization. Psychological Bulletin 138:102–25.