
Smart Representations: Rationality and Evolution in a Richer Environment

Published online by Cambridge University Press:  01 January 2022


Abstract

Standard applications of evolutionary game theory look at a single game and focus on the evolution of behavior for that game alone. Instead, this article uses tools from evolutionary game theory to study the competition between choice mechanisms in a rich and variable multigame environment. A choice mechanism is a way of subjectively representing a decision situation, paired with a method for choosing an act based on this subjective representation. We demonstrate the usefulness of this approach by a case study that shows how subjective representations in terms of regret that differ from the actual fitness can be evolutionarily advantageous.

Type: Research Article

Copyright © The Philosophy of Science Association

1. Introduction

If agents deal with a rich and variable environment, they have to face many different choice situations. Standard evolutionary game models frequently simplify reality in at least two ways. First, the environment is represented as a fixed stage game; second, the focus of evolutionary selection is behavior for that stage game alone. In contrast, some argue for studying the evolutionary competition of general choice mechanisms in a rich and variable environment (e.g., Hammerstein and Stevens 2012; Fawcett, Hamblin, and Giraldeau 2013; McNamara 2013). In response to this and adding to recent like-minded approaches, this article introduces a general meta-game model that conservatively extends the scope of evolutionary game theory to deal with evolutionary selection of choice mechanisms in variable environments (see also Harley 1981; Bednar and Page 2007; Rayo and Becker 2007; Zollman 2008; Skyrms and Zollman 2010; Zollman and Smead 2010; Smead and Zollman 2013; O’Connor 2015).Footnote 1

A choice mechanism associates decision situations with action choices. A crucial part of a choice mechanism is the subjective representation of the decision situation, in particular the manner of forming preferences and beliefs about a possibly uncertain world. To show the usefulness of the meta-game approach, this article asks: which preference and belief representations are ecologically valuable and lead to high fitness? The evolution of preferences has been the subject of recent interest in theoretical economics (e.g., Dekel, Ely, and Yılankaya 2007; Robson and Samuelson 2011; Alger and Weibull 2013). Here, we argue that questions of preference evolution should take variability in uncertainty representation into account as well. We demonstrate that if agents have imprecise probabilistic beliefs (e.g., Levi 1974; Gärdenfors and Sahlin 1982; Walley 1996), faithful and objective representations in terms of true evolutionary fitness can be outperformed by subjective (e.g., regret-based) preference representations that deviate from the true fitness that natural selection operates on.

The article is organized as follows. Section 2 sets the scene by reviewing different perspectives on rational choice. Section 3 introduces the meta-game approach. In doing so, it covers key notions such as choice mechanisms, decision rules and subjective representations, all with an eye toward the evolutionary application of section 4. Section 5 contains the main results for that application, and section 6 discusses some interesting extensions. Finally, section 7 concludes.

2. Rationality and Subjective Representations

The standard textbook definition of rationality in economics and decision theory traces back to the seminal work of de Finetti (1937), von Neumann and Morgenstern (1944), and Savage (1954). It says that a choice is rational only if it maximizes (subjective) expected utility. Expected utility is subjective in the sense that it is a function of the subjective beliefs and subjective preferences of the decision maker (DM). That is, a choice can be rational, in the sense of being the best choice from the DM’s point of view, even if it is based on peculiar beliefs and/or aberrant preferences.

If beliefs and preferences are subjective, there is room for rationalization, or redescriptionism, of observable behavior. For example, in the case of social decision making, including considerations of fairness allows us to describe as rational empirically observed behavior, such as in experimental Prisoner’s Dilemmas or public goods games, that might otherwise appear irrational (e.g., Fehr and Schmidt 1999; Charness and Rabin 2002).

The main objection to redescriptionism is that, without additional constraints, the notion of rationality is likely to collapse, as it seems possible to deem rational almost everything that is observed, given the freedom to adjust beliefs and preferences at will. Normativism therefore emphasizes that ascriptions of beliefs and preferences should also be constrained by normative considerations of rationality: for example, subjective beliefs should reflect objective chance where possible, and subjective preferences should be oriented toward tracking objective fitness. For instance, profit maximization seems a necessary requirement for survival in a competitive market because only firms behaving according to profit maximization will survive in the long run (e.g., Alchian 1950; Friedman 1953).

An alternative view on the rationality of choice is adaptationism (e.g., Anderson 1991; Chater and Oaksford 2000; Hagen et al. 2012). Adaptationism aims to explain rational behavior by appealing to evolutionary considerations: DMs have acquired choice mechanisms that have proved to be adaptive with respect to the variable environment in which they evolved. A choice mechanism can be a set of distinct heuristics (the DM’s adaptive toolbox) that have little in common (e.g., Gigerenzer and Goldstein 1996; Tversky and Kahneman 1981; Scheibehenne, Rieskamp, and Wagenmakers 2013). But to relate closely to the literature on the evolution of preferences and to the philosophical debate about the nature of rational choice, we here suggest thinking of a choice mechanism as a map from choice situations to action choices that includes an explicit level of subjective representation of the situation. Specifically, a subjective representation is a general way of forming preferences and beliefs about the choice situation. We are most interested in the question of which subjective representations, and which choice mechanisms in general, are better than others from an evolutionary point of view.

3. Choice Mechanisms and Meta-Games

We view a choice mechanism as the combination of three different things: a subjective utility (or preference), a subjective belief, and a decision rule. In general, the agent’s action choice will depend both on the agent’s utility at different possible outcomes of the choice situation and on the agent’s beliefs about the realization of these outcomes. The decision rule then combines the agent’s subjective utility and belief, and dictates how the agent should act: a decision rule is a function that associates an action choice with the agent’s utility and beliefs:

Decision Rule: Utility × Beliefs → Actions.

The subjective utility of an agent can be formally expressed by a function u : W × A → ℝ, where A stands for a (finite) set of actions available to the agent and W is a (finite) set of possible states of the world. There are many different ways to describe beliefs, but for concreteness of later applications we here assume that the agent’s beliefs are represented in terms of a (possibly singleton) convex compact set of probability functions Γ ⊆ Δ(W) over the possible states of the world. Given a utility u and a belief Γ, examples of well-known decision rules from the literature that we will encounter later are:

  1. Maxmin:

     a*(u, Γ) = argmax_{a∈A} min_{μ∈Γ} Σ_{w∈W} u(w, a)·μ(w);

  2. Maximax:

     a*(u, Γ) = argmax_{a∈A} max_{μ∈Γ} Σ_{w∈W} u(w, a)·μ(w);

  3. Laplace rule:

     a*(u, Γ) = argmax_{a∈A} Σ_{w∈W} (1/|W|)·u(w, a);

  4. Expected utility maximization (for Γ = {μ} a singleton):

     a*(u, Γ) = argmax_{a∈A} Σ_{w∈W} u(w, a)·μ(w).

It is worth noting that both maxmin and maximax boil down to expected utility maximization when the set Γ is a singleton {μ}, and expected utility maximization in turn reduces to the Laplace rule when the belief μ is the uniform probability over the states.
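In a finite setting, these rules can be sketched as follows. Representing the convex set Γ by a finite list of its extreme points is our own simplification; since the inner minimum and maximum over a convex compact Γ are attained at extreme points, this is harmless for maxmin and maximax.

```python
def expected_utility(u, mu, W, a):
    """Expected utility of act a: sum over states of u(w, a) * mu(w)."""
    return sum(u[(w, a)] * mu[w] for w in W)

def argmax_set(A, score):
    """All acts attaining the maximal score."""
    best = max(score(a) for a in A)
    return [a for a in A if score(a) == best]

def maxmin(u, Gamma, W, A):
    return argmax_set(A, lambda a: min(expected_utility(u, mu, W, a) for mu in Gamma))

def maximax(u, Gamma, W, A):
    return argmax_set(A, lambda a: max(expected_utility(u, mu, W, a) for mu in Gamma))

def laplace(u, W, A):
    uniform = {w: 1 / len(W) for w in W}
    return argmax_set(A, lambda a: expected_utility(u, uniform, W, a))

def eu_max(u, mu, W, A):
    return argmax_set(A, lambda a: expected_utility(u, mu, W, a))

# Illustrative two-state, two-act example (our own numbers)
W, A = ["w1", "w2"], ["I", "II"]
u = {("w1", "I"): 3, ("w2", "I"): 0, ("w1", "II"): 1, ("w2", "II"): 1}
mu = {"w1": 0.5, "w2": 0.5}
extremes = [{"w1": 1, "w2": 0}, {"w1": 0, "w2": 1}]  # extreme points of Delta(W)
```

With Γ a singleton {μ}, maxmin and maximax coincide with eu_max, and with μ uniform, eu_max coincides with laplace, matching the reductions just noted.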

As mentioned previously, for a choice mechanism to prescribe an action, the decision rule needs to be given a specific utility u and belief Γ as input. We call the pair (u, Γ) a subjective representation of the decision situation. In the following, we investigate the evolutionary fitness of general and systematic ways of forming such subjective representations across many different decision situations.

A fitness game is an interactive decision situation. For a given fitness game G = ⟨N, (A_i, π_i^G)_{i∈N}⟩, let us denote the evolutionary payoff, or fitness, of player i by the function π_i^G : Π_{i∈N} A_i → ℝ, where A_i is player i’s (finite) set of actions. For simplicity of exposition we assume that all games that are played are symmetric two-player games, where N = {1, 2}, A_1 = A_2 = A, and π_1^G(a, a′) = π_2^G(a′, a) =: π^G(a, a′).Footnote 2 The fitness of a choice mechanism c with decision rule a*_c and subjective representation (u_c, Γ_c) is measured in terms of the expected evolutionary payoff of c. Formally, the fitness of choice mechanism c against choice mechanism c′ in a symmetric two-player game G = ⟨{1, 2}, A, π^G⟩ is given by:Footnote 3

F_G(c, c′) = π^G(a*_c(u^G_c, Γ_c), a*_{c′}(u^G_{c′}, Γ_{c′})).

Given the game-theoretic setting, the subjective utility u^G_c is now a function u^G_c : A × A → ℝ, and the subjective belief Γ_c is a set of probability functions over the coplayer’s actions, Γ_c ⊆ Δ(A).

Going beyond a single fixed fitness game, we consider a class of possible games. For concreteness, let G be a class of two-player symmetric games, together with a probability measure P_G(G′) for the occurrence probability of game G′ ∈ G. Intuitively, the probability P_G encodes the statistical properties of the environment. A meta-game is then a tuple MG = ⟨CM, G, P_G, F⟩, where CM is a set of choice mechanisms, G is a class of possible games, P_G(G′) is the probability of game G′ occurring, and F : CM × CM → ℝ is the (meta-)fitness function, defined as:

(1)  F(c, c′) = ∫ F_G(c, c′) dP_G(G).

Hence, F(c, c′) determines the evolutionary payoff of choice mechanism c against c′ in the meta-game. The set CM can be thought of as the set of choice mechanisms that are present within a given population playing the games from the class G. Consequently, it is possible to compute the average fitness of c against the population, which is given by:

(2)  F(c) = ∫ F(c, c′) dP_c(c′) = ∫∫ F_G(c, c′) dP_c(c′) dP_G(G),

where P_c(c′) is the probability of encountering a coplayer with choice mechanism c′.

Meta-games are then abstract models for the evolutionary competition between choice mechanisms in interactive decision-making contexts. Standard notions of evolutionary game theory apply to meta-games as well. For example, a choice mechanism c is a strict Nash equilibrium if F(c, c) > F(c′, c) for all c′ ≠ c; it is evolutionarily stable if for all c′ ≠ c either (i) F(c, c) > F(c′, c) or (ii) F(c, c) = F(c′, c) and F(c, c′) > F(c′, c′); it is neutrally stable if for all c′ ≠ c either (i) F(c, c) > F(c′, c) or (ii) F(c, c) = F(c′, c) and F(c, c′) ≥ F(c′, c′) (Maynard Smith 1982). Similarly, evolutionary dynamics can be applied to meta-games. Later we will also turn toward a dynamical analysis in terms of the replicator dynamics (Taylor and Jonker 1978) and the replicator mutator dynamics (e.g., Nowak 2006).
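For a finite set of mechanisms whose meta-fitness values are tabulated, these three stability notions can be checked mechanically. A minimal sketch; the Prisoner's Dilemma payoffs below are an illustrative stand-in of our own, not one of the article's meta-games.

```python
def is_strict_nash(F, i):
    """F[j][k] is the payoff of mechanism j against mechanism k."""
    return all(F[i][i] > F[j][i] for j in range(len(F)) if j != i)

def is_ess(F, i):
    """Evolutionarily stable: beats every invader, or ties and beats it against itself."""
    return all(F[i][i] > F[j][i] or (F[i][i] == F[j][i] and F[i][j] > F[j][j])
               for j in range(len(F)) if j != i)

def is_nss(F, i):
    """Neutrally stable: as ESS, but ties against the invader are allowed."""
    return all(F[i][i] > F[j][i] or (F[i][i] == F[j][i] and F[i][j] >= F[j][j])
               for j in range(len(F)) if j != i)

# Illustrative 2x2 payoff matrix: a Prisoner's Dilemma with rows/columns
# (cooperate, defect); defection is a strict Nash equilibrium, hence an ESS.
PD = [[3, 0],
      [5, 1]]
```

Every strict Nash equilibrium is evolutionarily stable, and every evolutionarily stable mechanism is neutrally stable; the checks above make that nesting explicit.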

4. Evolution of Preferences

To demonstrate the usefulness of a meta-game approach, we compare a selection of general ways of forming belief and preference representations against each other. As for subjective preferences, consider initially:

  1. the objective utility, defined by: for all G ∈ G,

     obj^G(a, a′) = π^G(a, a′);

  2. the regret utility, defined by: for all G ∈ G,

     reg^G(a, a′) = π^G(a, a′) − max_{a″∈A} π^G(a″, a′).

As motivation for this comparison, it should be stressed that regret minimization is one of the main alternatives to utility (or value) maximization in the literature on decision criteria (see also Bleichrodt and Wakker 2015). To start, the subjective beliefs that we take into consideration are likewise two:

  1. prc: a precise uniform belief μ̄ such that μ̄(a) = 1/|A| for all a ∈ A;

  2. imp: a maximally imprecise belief Γ̄ = Δ(A).

Although a thorough discussion of this issue goes beyond the scope of this work, let us say that these two kinds of belief underlie two different and alternative views on uncertainty. Faced with uncertain events, a strict Bayesian will always form a precise belief, specified by a single probability μ. In the absence of any information about future uncertain events, the Bayesian would mostly invoke the principle of insufficient reason and accordingly choose a uniform probability over the possible outcomes. In contrast, others have argued against the obligation to represent a belief by means of a single probability measure, in opposition to the Bayesian paradigm (e.g., Gilboa and Marinacci 2013). They argue instead in favor of a more encompassing account, according to which uncertainty can be unmeasurable and represented by a (convex and compact) set of probabilities (e.g., Gilboa and Schmeidler 1989). This line of thought has its origin in decision theory, motivated by Ellsberg’s famous paradoxes (Ellsberg 1961), and appears extremely relevant in game-theoretic contexts too. Indeed, in a recent paper Battigalli et al. (2015, 646) write: “Such [unmeasurable] uncertainty is inherent in situations of strategic interaction. This is quite obvious when such situations have been faced only a few times.”

In evolutionary game theory, for instance, players obviously face uncertainty about the composition of the population that they are part of, and consequently about the (type of) coplayer that they are randomly paired with at each round and about the coplayer’s action. In the case of a complete lack of information about the composition of the population, a non-Bayesian player would thus entertain maximal unmeasurable uncertainty, that is, a maximally imprecise belief.Footnote 4 As already anticipated, we will see that the way agents form beliefs, and the possibility of holding imprecise beliefs in particular, can have a fundamental impact on their evolutionary success.

As for the decision rule, we assume that players use the maxmin rule. This is in line with many representation results for decision making under unmeasurable uncertainty (e.g., Gilboa and Schmeidler 1989; Ghirardato and Marinacci 2002) and seems corroborated by empirical findings too. Ellsberg’s paradoxes are prominent examples (Ellsberg 1961), and evidence from the experimental literature suggests that agents are generally averse to unmeasurable uncertainty (e.g., Trautmann and van de Kuilen 2016).

Finally, note that when the maxmin rule acts on subjective representations of type (obj, imp), that is, objective preferences and imprecise beliefs, the generated behavior corresponds to the classic maxmin strategy (von Neumann and Morgenstern 1944). When the maxmin rule acts on subjective representation (reg, imp), the agent’s behavior is known as regret minimization.Footnote 5 Two facts follow from these observations. The first is related to our focus on different types of uncertainty that players may entertain.

Fact 1 For any precise (Bayesian) belief μ, maximization of expected (objective) utility based on μ and minimization of expected regret based on μ are behaviorally equivalent.

The second fact highlights another behavioral equivalence, which we will make use of shortly in the following section.

Fact 2 In the class of 2×2 symmetric games, the acts selected by the Laplace rule are exactly the acts selected by regret minimization.

Here is a simple example that shows these choice mechanisms in action. Consider the coordination fitness game G depicted in figure 1a. Since the game is symmetric, it suffices to specify the evolutionary payoffs for the row player. Figure 1a also represents the objective utility obj^G, since obj^G = π^G by definition, whereas figure 1b pictures the representation of G in terms of regret-based utilities. While classic maxmin is indifferent between I and II (fig. 1a), regret minimization uniquely selects II (fig. 1b).

Figure 1. A coordination game (left) and the associated regret representation (right).
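A sketch of this computation. The payoffs below (coordination equilibria worth 3 and 5, off-diagonal payoffs of 1) are illustrative stand-ins for figure 1, not the article's exact numbers. With a maximally imprecise belief, maxmin over Γ reduces to taking the worst case over the coplayer's pure actions.

```python
def regret(pi, A):
    """reg(a, a') = pi(a, a') - max_a'' pi(a'', a'), per coplayer action a'."""
    return {(a, b): pi[(a, b)] - max(pi[(a2, b)] for a2 in A) for a in A for b in A}

def maxmin_imprecise(u, A):
    """Acts maximizing worst-case utility over the coplayer's actions."""
    worst = {a: min(u[(a, b)] for b in A) for a in A}
    best = max(worst.values())
    return [a for a in A if worst[a] == best]

A = ["I", "II"]
# Illustrative coordination game: both diagonal outcomes beat miscoordination,
# equilibrium II pays more, and both acts share the same worst case (1).
pi = {("I", "I"): 3, ("I", "II"): 1, ("II", "I"): 1, ("II", "II"): 5}

print(maxmin_imprecise(pi, A))             # classic maxmin -> ['I', 'II']
print(maxmin_imprecise(regret(pi, A), A))  # regret minimization -> ['II']
```

Classic maxmin sees the same worst case (1) for both acts and is indifferent, while regret minimization penalizes act I for forgoing the larger equilibrium payoff and uniquely selects II.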

5. Results

5.1. Simulation Results

Since for now we keep the decision rule fixed to maxmin, a player’s choice mechanism depends only on the player’s subjective representation (u, Γ). For brevity, from now on we refer to the pair (u, Γ), like (reg, imp) or (obj, prc), as the type of the player. Sometimes we also distinguish types by their subjective utility only; for instance, (reg, imp) and (reg, prc) are both regret types.

As observed earlier, meta-games factor in statistical properties of the environment. For particular empirical purposes, one could consult a specific class of games G with an appropriate, perhaps empirically informed, probability P_G in order to match the natural environment of a given population. For our present purposes, let G be a set of symmetric two-player fitness games with two acts. Each game G ∈ G is then individuated solely by its payoff function, that is, by a quadruple of numbers G = (a, b, c, d). As for the occurrence probability P_G(G) of game G, we imagine that the values a, b, c, d are independently and identically distributed (i.i.d.) random variables sampled from the set {0, …, 10} according to the uniform probability P_V. Using Monte Carlo simulations, we can then approximate the values of equation (1) to construct meta-game payoffs. Results based on 100,000 randomly sampled games are given in table 1.Footnote 6

Table 1. Average Evolutionary Fitness from Monte Carlo Simulations of 100,000 Symmetric 2 × 2 Games

            (reg, imp)  (obj, imp)  (reg, prc)  (obj, prc)
(reg, imp)     6.663       6.662       6.663       6.663
(obj, imp)     6.486       6.484       6.486       6.486
(reg, prc)     6.663       6.662       6.663       6.663
(obj, prc)     6.663       6.662       6.663       6.663
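A minimal sketch of this Monte Carlo construction, restricted to the two imprecise-belief types; the deterministic tie-breaking (first act in list order), the seed, and the smaller sample size are our own choices, so the averages will differ slightly from table 1.

```python
import random

def regret(pi, A):
    """reg(a, a') = pi(a, a') - max_a'' pi(a'', a')."""
    return {(a, b): pi[(a, b)] - max(pi[(a2, b)] for a2 in A) for a in A for b in A}

def maxmin_act(u, A):
    """With a maximally imprecise belief, maxmin = best worst case over the
    coplayer's pure actions (first maximizer in list order breaks ties)."""
    return max(A, key=lambda a: min(u[(a, b)] for b in A))

rng = random.Random(0)
A = [0, 1]
types = ["reg", "obj"]
N = 20000
F = {(t, s): 0.0 for t in types for s in types}
for _ in range(N):
    # Random symmetric 2x2 fitness game: four i.i.d. payoffs from {0, ..., 10}
    pi = {(a, b): rng.randint(0, 10) for a in A for b in A}
    act = {"obj": maxmin_act(pi, A), "reg": maxmin_act(regret(pi, A), A)}
    for t in types:
        for s in types:
            F[(t, s)] += pi[(act[t], act[s])] / N
```

In each column of the resulting small meta-game, the regret type should come out strictly better, mirroring the dominance of (reg, imp) over (obj, imp) in table 1.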

Simulation results obviously reflect fact 2 in that all encounters in which types (reg, imp), (reg, prc), or (obj, prc) are substituted for one another yield identical results. More interestingly, table 1 shows that (obj, imp), the maxmin strategy, is strictly dominated by the three other types: in each column (i.e., for each type of coplayer), the maxmin strategy is strictly worse than any of the three competitors. This has a number of interesting consequences.

If we restrict attention to subjective representations with imprecise beliefs only, then a monomorphic state in which every agent has regret-based preferences is the only evolutionarily stable state. More strongly, since (obj, imp) is strictly dominated by (reg, imp), we expect selection that is driven by (expected) fitness to invariably weed out maxmin players (obj, imp) in favor of (reg, imp), regret minimization. In terms of choice rules, this means that regret minimization is evolutionarily better than maxmin over the class of games considered. In terms of subjective preferences, it shows that players using the objective representation that directly looks at fitness (possibly money, or profit) are outperformed by nonveridical (regret) representations, when players’ beliefs are imprecise.

Next, if we look at the competition between all four types represented in table 1, (reg, imp) is no longer evolutionarily stable. Given behavioral equivalence (fact 2), types (reg, imp), (reg, prc), and (obj, prc) are all neutrally stable (Maynard Smith 1982). But since (obj, imp) is strictly dominated and so disfavored by fitness-based selection, we are still drawn to conclude that maxmin behavior is weeded out in favor of a population with a random distribution of the remaining three types.

Simulation results of the (discrete time) replicator dynamics (Taylor and Jonker 1978) indeed show that random initial population configurations are attracted to states with only three player types: (reg, imp), (reg, prc), and (obj, prc). The relative proportions of these depend on the initial shares in the population. This variability fully disappears if we add a small mutation rate to the dynamics. Take a fixed, small mutation rate ε for the probability that a player’s subjective utility or her subjective belief changes to another utility or belief. The probability that a player’s subjective representation randomly mutates into a completely different representation with altogether different utility and belief would then be ε². With these assumptions about “component-wise mutations,” numerical simulations of the (discrete time) replicator mutator dynamics (Nowak 2006) show that already for very small mutation rates almost all initial population states converge to a single fixed point in which the majority of players have regret-based utility. For instance, with ε = 0.001, almost all initial populations are attracted to a final distribution with proportions:

(reg, imp) (obj, imp) (reg, prc) (obj, prc)
.289 .021 .398 .289

What this suggests is that, if biological evolution selects behavior-generating mechanisms rather than behavior as such, behaviorally equivalent mechanisms need not fare equally well. If mutation probabilities are a function of individual components, certain components of such behavior-generating mechanisms can be more strongly favored by a process of random mutation and selection. This is exactly what happens with regret-based preferences. Since regret-based preferences do much better in combination with imprecise beliefs than veridical preferences do, the proportion of expected regret minimizers, (reg, prc), in the attracting state is substantially higher than that of expected utility maximizers, (obj, prc), even though these types are behaviorally equivalent.
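These dynamics can be sketched as follows, using the payoffs from table 1 as the meta-game. Our reading of "component-wise mutations" (each of the two components flips independently with probability ε per generation, so a double flip has probability ε²) is an assumption about the exact mutation kernel.

```python
# Types ordered as in table 1: (reg, imp), (obj, imp), (reg, prc), (obj, prc).
A = [[6.663, 6.662, 6.663, 6.663],
     [6.486, 6.484, 6.486, 6.486],
     [6.663, 6.662, 6.663, 6.663],
     [6.663, 6.662, 6.663, 6.663]]

def mutation_matrix(eps):
    """Component-wise mutation: the utility component flips 0<->1 and 2<->3,
    the belief component flips 0<->2 and 1<->3, each with probability eps."""
    flip_u = {0: 1, 1: 0, 2: 3, 3: 2}
    flip_b = {0: 2, 1: 3, 2: 0, 3: 1}
    M = [[0.0] * 4 for _ in range(4)]
    for j in range(4):
        M[j][j] += (1 - eps) ** 2
        M[flip_u[j]][j] += eps * (1 - eps)
        M[flip_b[j]][j] += eps * (1 - eps)
        M[flip_b[flip_u[j]]][j] += eps ** 2
    return M

def step(x, A, M):
    """One generation of the discrete-time replicator mutator dynamics."""
    f = [sum(A[i][j] * x[j] for j in range(4)) for i in range(4)]
    phi = sum(x[i] * f[i] for i in range(4))
    y = [x[i] * f[i] / phi for i in range(4)]                         # selection
    return [sum(M[i][j] * y[j] for j in range(4)) for i in range(4)]  # mutation

x = [0.25] * 4  # uniform initial population
M = mutation_matrix(0.001)
for _ in range(50000):
    x = step(x, A, M)
```

Starting from the uniform population, x should settle near the reported fixed point, with (reg, prc) holding the largest share and (obj, imp) nearly extinct.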

5.2. Analytical Results

Results based on the single meta-game in table 1 are not fully general and are possibly affected by random fluctuations in the sampling procedure. Fortunately, for the case of 2×2 symmetric games, the main result, that maxmin types (obj, imp) are strictly dominated by regret minimizers, can also be shown analytically under fairly general conditions.

Proposition 1 Let G be the class of 2×2 symmetric games G=(a,b,c,d) generated by i.i.d. sampling a, b, c, d from a set of values with at least three elements in the support. Then, (reg, imp) strictly dominates (obj, imp) in the resulting meta-game.

Proof All proofs are in the appendix.

Corollary 1 Let G be as in proposition 1. If we only consider imprecise belief types, (obj, imp) and (reg, imp), then the unique evolutionarily stable state is a monomorphic population of (reg, imp) players.

The result shows that there is support for the main conceptual point that we wanted to make: objective preference representations are not necessarily favored by natural selection; objective preferences are outperformed by nonveridical regret preferences if agents have imprecise beliefs. This tells us that the main conclusions drawn in the previous section based on the approximated meta-game of table 1 hold more generally for arbitrary 2×2 symmetric games with i.i.d. sampled payoffs.

This result presupposes at least occasional imprecise beliefs. The assumed imprecise beliefs need not be maximally uncertain, however. Let the uncertainty held by a player be given by a convex compact set of probabilities [s, t] ⊆ Δ(A) over the coplayer’s actions, where s is the lower probability and t is the upper probability of action II. We can then prove the following proposition, the analogue of proposition 1 for any possible (not necessarily maximal) degree of uncertainty [s, t], with s ≤ t. There is only one difference: we now require i.i.d. draws of a continuous random variable with uniform distribution. This is because, for arbitrarily small intervals [s, t], objective players (obj, [s, t]) and regret players (reg, [s, t]) can behave as if holding a unique probability measure (precise belief) if the underlying payoff space is not dense. The reason for this technical requirement will become clearer from the proof.

Proposition 2 Let G be the class of symmetric 2×2 games generated by i.i.d. drawing of a continuous random variable with uniform distribution over any set of values, and fix any imprecise belief [s, t]. Then the only evolutionarily stable state of a population with regret players (reg, [s, t]) and objective players (obj, [s, t]) is a monomorphic state of (reg, [s, t]) players.

This tells us that regret-based preferences can outperform objective preference representations when agents are also capable of learning or otherwise restricting their assumptions about the coplayer’s behavior as long as there is, at least on occasion, some imprecision in their beliefs. We will enlarge on the issue of belief formation after having covered some more relevant extensions in the next section.

6. Extensions

How do the basic results from the previous section carry over to richer models? Section 6.1 first introduces further conceptually interesting subjective representations that have been considered in the literature. Section 6.2 then addresses the case of symmetric two-player n×n games for n ≥ 2. Finally, section 6.3 ends with a brief comparison to the case of solitary decision making.

6.1. More Preference Types

The space of possible preference types is enormous, and we have only compared regret and objective types so far. Let us now look at two other types of subjective preferences that have been investigated, especially in behavioral economics and in evolutionary game theory. A famous example is the altruistic preference (e.g., Becker 1976; Bester and Güth 1998), summoned to explain the possibility of altruistic behavior. At the other end of the spectrum lies the competitive preference. The two subjective utilities are defined as follows:

  1. Altruistic utility: for all G ∈ G, alt^G(a, a′) = π^G(a, a′) + π^G(a′, a);

  2. Competitive utility: for all G ∈ G, com^G(a, a′) = π^G(a, a′) − π^G(a′, a).Footnote 7

Table 2 shows results of Monte Carlo simulations that approximate the expected fitness in the relevant meta-game with all the subjective representations considered so far. These results confirm basic intuitions about altruistic and competitive types: everybody would like to have an altruistic coplayer, and nobody would like to play against a competitive player. Perhaps more surprisingly, (alt, imp) turns out to be strictly dominated by (com, imp), but competitive types themselves do worse against all types except maxmin players (obj, imp) than any of the behaviorally equivalent types (reg, imp), (obj, prc), and (reg, prc). It is thus easy to see that the previous results still obtain for the larger meta-game in table 2: (reg, imp), (obj, prc), and (reg, prc) are still neutrally stable; simulation runs of the (discrete-time) replicator dynamics on the 8×8 meta-game from table 2 end up in population states consisting of only these three types in variable proportions. In sum, the presence of other subjective representations, such as those based on altruistic or competitive utilities, does not undermine, but rather strengthens, our previous results.

Table 2. Average Evolutionary Fitness from Monte Carlo Simulations of 100,000 Symmetric 2 × 2 Games

            (reg, imp)  (obj, imp)  (com, imp)  (alt, imp)  (reg, prc)  (obj, prc)  (com, prc)  (alt, prc)
(reg, imp)     6.663       6.662       5.829       7.105       6.663       6.663       5.829       7.489
(obj, imp)     6.486       6.484       6.088       6.703       6.486       6.486       6.088       6.875
(com, imp)     6.323       6.758       5.496       6.977       6.323       6.323       5.496       7.149
(alt, imp)     5.949       5.722       5.326       6.396       5.949       5.949       5.326       6.568
(reg, prc)     6.663       6.662       5.829       7.105       6.663       6.663       5.829       7.489
(obj, prc)     6.663       6.662       5.829       7.105       6.663       6.663       5.829       7.489
(com, prc)     6.323       6.758       5.496       6.977       6.323       6.323       5.496       7.149
(alt, prc)     6.331       5.893       5.497       6.566       6.331       6.331       5.497       7.152

6.2. More Actions

Results from section 5 relied heavily on fact 2, which is no longer true when we look at arbitrary n×n games. Table 3 gives approximations of expected fitness in the class of n×n symmetric games. Concretely, the numbers in table 3 are averages of evolutionary payoffs obtained in 100,000 randomly sampled symmetric games, where each fitness game G was sampled by first picking a number of acts n_G ∈ {2, …, 10} uniformly at random, and then filling the n_G × n_G payoff matrix with i.i.d. sampled numbers, as before.

Table 3. Average Evolutionary Fitness for 100,000 Randomly Generated n × n Symmetric Games with n Randomly Drawn from {2, …, 10}

            (reg, imp)  (obj, imp)  (com, imp)  (alt, imp)  (reg, prc)  (obj, prc)  (com, prc)  (alt, prc)
(reg, imp)     6.567       6.570       5.650       6.992       6.564       6.564       5.593       7.409
(obj, imp)     6.476       6.483       5.896       6.818       6.484       6.484       5.850       7.124
(com, imp)     6.468       6.647       5.512       7.169       6.578       6.578       5.577       7.354
(alt, imp)     5.968       5.923       5.363       6.685       5.975       5.975       5.086       6.973
(reg, prc)     6.908       6.918       5.988       7.456       6.929       6.929       5.934       7.783
(obj, prc)     6.908       6.918       5.988       7.456       6.929       6.929       5.934       7.783
(com, prc)     6.529       6.680       5.445       7.276       6.542       6.542       5.521       7.440
(alt, prc)     6.450       6.337       5.772       6.978       6.457       6.457       5.479       7.500

The most important result is that the regret-minimizing type (reg, imp) is strictly dominated by (reg, prc) and by (obj, prc) in the meta-game from table 3. This means that while simple regret minimization can thrive in some evolutionary contexts, there are also contexts where it is demonstrably worse off. While this may be bad news for regret-minimizing types (reg, imp), it is not the case that regret types as such are weeded out by selection. Since, by fact 1, (reg, prc) and (obj, prc) are behaviorally equivalent in general, it remains that selection based on meta-games constructed from n×n games will still not eradicate regret preferences.

On the other hand, there are plenty of ways in which the basic insights from propositions 1 and 2 can make for situations in which evolution would favor regret types, even in n×n games. If, for example, the belief of a player is a trait that biological evolution has no bite on, but rather something that the particular choice situation would exogenously give us (possibly because of the different amount of information available in different choice situations), then regret-based preferences can again drive out veridical preferences altogether. For example, suppose that only preference representations compete and that agents’ beliefs are exogenously given, in such a way that both players hold precise (Bayesian) uniform beliefs with probability p and they both have maximally imprecise beliefs otherwise. This transforms the meta-game from table 3 into a simpler 4×4 meta-game in which the payoff obtained by a subjective preference is the weighted average over the payoffs of the subjective representations including that preference in table 3. Setting p=.98 for illustration, we get the meta-game in table 4. The only evolutionarily stable state of this meta-game is again a monomorphic population of regret types. Accordingly, all our simulation runs of the (discrete-time) replicator dynamics converge to monomorphic regret-type populations. The reason why regret-based utilities prosper is because they have a substantial fitness advantage when paired with imprecise beliefs (propositions 1 and 2). If unmeasurable uncertainty is exogenously given as something that happens to agents because of the information available in some choice situations, and even if that happens only very infrequently (i.e., for rather low p), regret preferences will outperform objective preferences, as well as competitive and altruistic preferences.

Table 4. Meta-game for the Evolutionary Competition between Subjective Utilities When Beliefs Are Exogenously Given (see Main Text)

        reg     obj     com     alt
reg     6.926   6.926   5.942   7.757
obj     6.924   6.924   5.948   7.751
com     6.566   6.570   5.481   7.434
alt     6.463   6.461   5.478   7.469
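To make the convergence claim concrete, the discrete-time replicator dynamic on the meta-game of table 4 can be sketched as follows (a minimal illustration; the starting point, iteration count, and all function names are our own choices, not part of the original model):

```python
# Discrete-time replicator dynamics on the meta-game of table 4.
# Rows/columns: reg, obj, com, alt.
TYPES = ["reg", "obj", "com", "alt"]
PAYOFF = [
    [6.926, 6.926, 5.942, 7.757],  # reg
    [6.924, 6.924, 5.948, 7.751],  # obj
    [6.566, 6.570, 5.481, 7.434],  # com
    [6.463, 6.461, 5.478, 7.469],  # alt
]

def replicator(x, payoff, steps):
    """Iterate x_i <- x_i * f_i / phi, where f_i is the expected payoff of
    type i against the current population and phi is the population mean."""
    n = len(x)
    for _ in range(steps):
        f = [sum(payoff[i][j] * x[j] for j in range(n)) for i in range(n)]
        phi = sum(x[i] * f[i] for i in range(n))
        x = [x[i] * f[i] / phi for i in range(n)]
    return x

x = replicator([0.25, 0.25, 0.25, 0.25], PAYOFF, 100_000)
print({t: round(xi, 4) for t, xi in zip(TYPES, x)})
```

In our runs the population converges to the monomorphic regret state reported in the text; convergence is slow because the reg and obj rows differ only in the third decimal.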

6.3. Solitary Decisions

To see how different choice mechanisms behave in evolutionary competition based on solitary decision making, we approximated, much in the spirit of meta-games, the average accumulated fitness obtained in randomly generated solitary decision problems. For our purposes, a decision problem D = ⟨W, A, π_D⟩ consists of a set of states of the world W, a set of acts A, and a payoff function π_D : W × A → ℝ. We generate arbitrary decision problems by selecting, uniformly at random, numbers of states and acts n_w^D, n_a^D ∈ {2, …, 10} and then filling the payoff table, so to speak, with i.i.d. samples π_D(w, a) ∈ {0, …, 10} for each cell. Unlike with two-player games, we also need to sample the actual state of the world, which we selected uniformly at random from the available states in the current decision problem. Accordingly, the fitness of choice mechanism c in decision problem D is given by:

F_D(c) = Σ_{w∈W} π_D(w, a*_c(u_c^D, Γ_c)) · μ̄(w),

with μ̄(w) = 1/n_w^D for all w. As subjective representations, we considered the original cast of four from table 1, since altruistic and competitive types are meaningless in solitary decision situations. As before, the relevant fitness measure, defined in equation (3), was approximated by Monte Carlo simulations, the results of which are given in table 5.

(3)  F(c) = ∫ F_D(c) dP_D(D).
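The Monte Carlo approximation of equation (3) can be sketched along the following lines. Minimax regret and maximin implement the imprecise types, while expected-regret minimization and expected-payoff maximization under the uniform belief implement the precise types (which coincide, by fact 1). Sample size, seed, and helper names are our own assumptions:

```python
import random

random.seed(0)

def choices_min(scores):
    """Indices attaining the minimal score (ties kept and averaged over,
    mirroring footnote 3)."""
    m = min(scores)
    return [i for i, s in enumerate(scores) if s == m]

def fitness(payoff, acts):
    """Expected fitness under the uniform state distribution, averaging
    over equally optimal acts."""
    n_states = len(payoff)
    return sum(payoff[w][a] for w in range(n_states) for a in acts) / (n_states * len(acts))

def simulate(n_problems):
    totals = {"reg_imp": 0.0, "obj_imp": 0.0, "reg_prc": 0.0, "obj_prc": 0.0}
    for _ in range(n_problems):
        n_w, n_a = random.randint(2, 10), random.randint(2, 10)
        pay = [[random.randint(0, 10) for _ in range(n_a)] for _ in range(n_w)]
        regret = [[max(row) - v for v in row] for row in pay]
        # (reg, imp): minimize the maximal regret over states.
        reg_imp = choices_min([max(regret[w][a] for w in range(n_w)) for a in range(n_a)])
        # (obj, imp): maximin, encoded as minimizing the negated minimal payoff.
        obj_imp = choices_min([-min(pay[w][a] for w in range(n_w)) for a in range(n_a)])
        # (reg, prc): minimize expected regret under the uniform belief.
        reg_prc = choices_min([sum(regret[w][a] for w in range(n_w)) for a in range(n_a)])
        # (obj, prc): maximize expected payoff under the uniform belief.
        obj_prc = choices_min([-sum(pay[w][a] for w in range(n_w)) for a in range(n_a)])
        for name, acts in [("reg_imp", reg_imp), ("obj_imp", obj_imp),
                           ("reg_prc", reg_prc), ("obj_prc", obj_prc)]:
            totals[name] += fitness(pay, acts)
    return {k: v / n_problems for k, v in totals.items()}

F = simulate(50_000)
print(F)
```

With these choices the precise types tie exactly, while (reg, imp) averages noticeably above (obj, imp), in line with table 5.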

Facts 1 and 2 still apply: (reg, prc) and (obj, prc) are behaviorally equivalent in general, and (reg, imp) is behaviorally equivalent to the former two in decision problems with two states and two acts. This shows up in table 5 in that the averages for (reg, prc) and (obj, prc) are identical. But since we also included decision problems with more acts and more states, the average for regret minimizers (reg, imp) is not identical to that of (reg, prc) and (obj, prc). It is, in fact, lower, but again not as low as that of (obj, imp).

Table 5. Expected Fitness of Choice Mechanisms Approximated from 100,000 Simulated Solitary Decision Problems (see Main Text)

(reg, imp)    (obj, imp)    (reg, prc)    (obj, prc)
  6.318         6.237         6.661         6.661

This means that every relevant result we have seen about game situations is also borne out for solitary decisions. Evolutionary selection based on objective fitness will not select against regret preferences, as these are indistinguishable from veridical preferences when paired with precise beliefs. But when paired with imprecise beliefs, regret-based utilities outperform objective utilities. Consequently, if there is a chance, however small, that agents fall back on imprecise beliefs, evolution will actually positively select for nonveridical regret-based preferences.

6.4. Sophisticated Beliefs

Since one of our main purposes was to illustrate the usefulness of a meta-game approach by the case study of objective and regret preferences, we have partially neglected an important and interesting issue, namely, the evolution of ways of forming beliefs about coplayers' behavior or the actual state of the world. For reasons of space we must, unfortunately, leave a deeper exploration of belief type evolution to another occasion. Two remarks are in order nonetheless. First, belief type evolution can be studied without conceptual hurdles in the meta-game framework, so there is no principled argument here against the main methodological contribution of this article. Second, our results on the comparison between regret and objective types remain informative even if we allow agents to learn or reason strategically.Footnote 8 This is because we know from fact 1 that regret and objective preferences turn out behaviorally equivalent when paired with precise probabilistic beliefs (given identical decision rules), no matter what the content of those beliefs is. So, if learning, reasoning, or statistical knowledge about a recurrent situation can be brought to bear, this will not make evolution select against regret-based preferences. If, on the other hand, agents resort to imprecise beliefs at least occasionally (e.g., when they are unaware of the coplayer or her utilities, or when strategic reasoning cannot reduce all uncertainty about the coplayer's choice), then regret-based preferences can be favored by natural selection over objective preferences.

7. Conclusion

The assumption that players and decision makers maximize their (subjective) utility is central throughout the economic literature, and the maximization of actual (objective) payoffs is often justified by appeal to evolutionary arguments and natural selection. In contrast to this standard view, we showed that there are player types whose subjective utilities differ from the actual evolutionary payoffs and yet outperform types whose subjective utilities coincide with the evolutionary payoffs. The claim is not that regret preferences are the best on the market, but rather that utilities that perfectly mirror evolutionary fitness can be outclassed by subjective utilities that differ from objective fitness. While the literature on the evolution of preferences has focused on fixed games, we have adopted a more general approach here. We suggested that attention to “meta-games” is crucial, because what may be a good subjective representation in one type of game (e.g., cooperative preferences in the Prisoner's Dilemma) need not be generally beneficial.

Appendix Proofs

The proof of proposition 1 relies on a partition of the class G, and on some lemmas. For brevity, let us denote the regret minimizer (reg, imp) by R and the maximinimizer (obj, imp) by M. Following equation (1), let F_G(X, Y) denote the expected payoff of choice mechanism X against choice mechanism Y on the possibly restricted class of fitness games G.

Proof of Proposition 1

By definition of strict dominance, we have to show that in the class G of symmetric 2×2 games whose payoffs are i.i.d. samples from a distribution with at least three elements in its support, it holds that:

  • (i) F_G(R, R) > F_G(M, R);

  • (ii) F_G(M, M) < F_G(R, M).

To show this we use the following partition of G, based on payoffs parametrized as follows:

      I    II
I     a    b
II    c    d

  1. Coordination games C: a>c and d>b;

  2. Anticoordination games A: a<c and d<b;

  3. Strong dominance games S: either (a>c and b>d) or (a<c and b<d);

  4. Weak dominance games W: exactly one of a=c and b=d holds;

  5. Boring games B: a=c and b=d.
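A quick exhaustive check confirms that these five classes partition the space of 2×2 games; the sketch below enumerates all games with payoffs from the three-point support {0, 1, 2} (the support choice and function names are ours):

```python
from itertools import product

def classify(a, b, c, d):
    """Return the labels of all classes the game (a, b, c, d) falls into."""
    labels = []
    if a > c and d > b:
        labels.append("C")                       # coordination
    if a < c and d < b:
        labels.append("A")                       # anticoordination
    if (a > c and b > d) or (a < c and b < d):
        labels.append("S")                       # strong dominance
    if (a == c) != (b == d):
        labels.append("W")                       # weak dominance (exclusive 'or')
    if a == c and b == d:
        labels.append("B")                       # boring
    return labels

games = list(product(range(3), repeat=4))
# Every game falls into exactly one class: the partition is exhaustive and disjoint.
assert all(len(classify(*g)) == 1 for g in games)
```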

Before proving the lemmas, it is convenient to fix some notation. Let us call x, y, z the three elements in the support, and without loss of generality suppose that x > y > z. We denote by C a coordination game in the class C with payoffs a_C, b_C, c_C, and d_C; similarly for games A, S, W, and B in the respective classes. Let us denote by I_R^C the event that an R-player plays action I in the game C; and similarly for action II, for player M, and for games A, S, W, and B. We first consider the case of i.i.d. sampling with finite support.

Lemma 1 R and M perform equally well in S and in B.

Proof By definition of regret minimization and maxmin it is easy to check that whenever a game has a strongly dominant action a*, then a* is both the maxmin action and the regret-minimizing action. Hence, for all the games in S, R chooses an action if and only if M chooses that action. Consequently, R and M always perform equally (well) in S. In the case of B it is trivial to see that all players perform equally. QED

Lemma 2 In W, R strictly dominates M.

Proof Assume without loss of generality that b=d and that a>c. There are two cases to check: (i) c < b = d and (ii) c ≥ b = d. In the first case it is easy to see that R and M perform equally: act I is the choice of both R and M. In case (ii), instead, I is the regret-minimizing action, whereas both actions have the same minimum, so M plays ((1/2)I; (1/2)II), since both I and II maximize the minimal payoff. Consider now a population of R and M playing games from the class W. Whenever (i) is the case, R and M perform equally well. But suppose W ∈ W and (ii) is the case. Then π_W(R, R) = a > (1/2)a + (1/2)c = π_W(M, R), whereas

π_W(M, M) = (1/4)a + (1/4)b + (1/4)c + (1/4)d < (1/2)a + (1/2)b = π_W(R, M).

Hence, in general F_W(R, R) > F_W(M, R) and F_W(M, M) < F_W(R, M). QED

Since it is not difficult to see that both (R, R) and (M, M) are strict Nash equilibria in C, and that (R, R) and (M, M) are not Nash equilibria in A, the main part of the proof is to show that R strictly dominates M in the class C ∪ A, that is:

  • (i′) F_{C∪A}(R, R) > F_{C∪A}(M, R);

  • (ii′) F_{C∪A}(M, M) < F_{C∪A}(R, M).

Proving this requires some further lemmas, but first we introduce the following bijective function ɸ between coordination and anticoordination games.

Definition 3 (ɸ) The permutation ɸ(a, b, c, d) = (c, d, a, b) defines a bijective function ɸ: C → A that for each coordination game C ∈ C with payoffs (a_C, b_C, c_C, d_C) gives the anticoordination game A ∈ A with payoffs (a_A, b_A, c_A, d_A) = (c_C, d_C, a_C, b_C). Essentially, ɸ swaps rows in the payoff matrix.

Lemma 4 The occurrence probability of C equals that of ɸ(C): P(ɸ(C)) = P(C).

Proof By definition, each game C = (a_C, b_C, c_C, d_C) is such that a_C > c_C and d_C > b_C, and each game A = (a_A, b_A, c_A, d_A) is such that a_A < c_A and d_A < b_A. Given that a, b, c, d are i.i.d. random variables and that a sequence of i.i.d. random variables is exchangeable, it is clear that the probability of (a_C, b_C, c_C, d_C) equals the probability of (c_C, d_C, a_C, b_C). Hence, P(ɸ(C)) = P(C). QED

Lemma 5 Let P(E) be the probability of event E; for example, P(I_R^C) is the probability that a random R-player plays act I in coordination game C, which is either 0, .5, or 1. It then holds that:

  • P(I_R^C) = P(II_R^{ɸ(C)}), and P(II_R^C) = P(I_R^{ɸ(C)});

  • P(I_M^C) = P(II_M^{ɸ(C)}), and P(II_M^C) = P(I_M^{ɸ(C)}).

Proof It is easy to check that if b_C − d_C > c_C − a_C, an R-player plays action I in C; that if b_C − d_C < c_C − a_C, R plays II; and that if b_C − d_C = c_C − a_C, an R-player is indifferent between I and II in C, and so randomizes with ((1/2)I; (1/2)II). Similarly, if a_A − c_A > d_A − b_A, an R-player plays action I in A; if a_A − c_A < d_A − b_A, R plays II; and if a_A − c_A = d_A − b_A, an R-player is indifferent between I and II in A, and randomizes with ((1/2)I; (1/2)II). Consequently, if b_C − d_C > c_C − a_C, then P(I_R^C) = 1, and by definition of ɸ we have P(II_R^{ɸ(C)}) = 1. Likewise, if b_C − d_C < c_C − a_C, then P(II_R^C) = 1 = P(I_R^{ɸ(C)}); and if b_C − d_C = c_C − a_C, then P(I_R^C) = P(II_R^C) = 1/2 = P(II_R^{ɸ(C)}) = P(I_R^{ɸ(C)}).

In the same way, in coordination games we have that if b_C > c_C, an M-player plays I; if c_C > b_C, an M-player plays II; and if b_C = c_C, M is indifferent between I and II and plays ((1/2)I; (1/2)II). In anticoordination games, instead, if a_A > d_A, M plays I; if a_A < d_A, M plays II; and if a_A = d_A, M plays ((1/2)I; (1/2)II). By definition of ɸ: P(I_M^C) = 1 = P(II_M^{ɸ(C)}) if b_C > c_C; P(II_M^C) = 1 = P(I_M^{ɸ(C)}) if c_C > b_C; and P(I_M^C) = P(II_M^C) = 1/2 = P(II_M^{ɸ(C)}) = P(I_M^{ɸ(C)}) if b_C = c_C. QED

Lemma 6 It holds that:

  • a_C > d_C ⇒ (I_M^C ⇒ I_R^C);

  • a_C < d_C ⇒ (II_M^C ⇒ II_R^C);

  • a_C = d_C ⇒ I_M^C = I_R^C.

Proof The event that R plays action I in C, I_R^C, has positive probability exactly when b_C − d_C ≥ c_C − a_C: if b_C − d_C > c_C − a_C, R plays I, and if b_C − d_C = c_C − a_C, R plays ((1/2)I; (1/2)II). Similarly, the event I_M^C has positive probability exactly when b_C ≥ c_C: if b_C > c_C, M plays I, and if b_C = c_C, M plays ((1/2)I; (1/2)II). Then, I_R^C implies that b_C − d_C ≥ c_C − a_C, and I_M^C implies that b_C ≥ c_C. Moreover, on the assumption that a_C > d_C, it is easy to check that b_C ≥ c_C implies b_C − d_C > c_C − a_C. Hence, in any C with a_C > d_C it holds that I_M^C implies I_R^C, that is, a_C > d_C ⇒ (I_M^C ⇒ I_R^C). The converse fails: it is possible that a_C > d_C, b_C − d_C > c_C − a_C, and b_C < c_C hold simultaneously, so that I_R^C does not imply I_M^C. By a symmetric argument it can be shown that a_C < d_C ⇒ (II_M^C ⇒ II_R^C) too. Finally, when a_C = d_C it holds that: b_C − d_C > c_C − a_C iff b_C > c_C; b_C − d_C < c_C − a_C iff b_C < c_C; and b_C − d_C = c_C − a_C iff b_C = c_C. Hence, a_C = d_C ⇒ I_M^C = I_R^C. QED

We are now ready to prove that F_{C∪A}(R, R) > F_{C∪A}(M, R). With notation like P(I_R^C, I_R^C) denoting the probability that a random R-player plays I and another R-player plays I as well in game C, rewrite the inequality as:

Σ_{C∈C} P(C)[P(I_R^C, I_R^C)·a_C + P(II_R^C, II_R^C)·d_C + P(I_R^C, II_R^C)·b_C + P(II_R^C, I_R^C)·c_C]
+ Σ_{A∈A} P(A)[P(I_R^A, I_R^A)·a_A + P(II_R^A, II_R^A)·d_A + P(I_R^A, II_R^A)·b_A + P(II_R^A, I_R^A)·c_A]
> Σ_{C∈C} P(C)[P(I_R^C, I_M^C)·a_C + P(II_R^C, II_M^C)·d_C + P(I_R^C, II_M^C)·c_C + P(II_R^C, I_M^C)·b_C]
+ Σ_{A∈A} P(A)[P(I_R^A, I_M^A)·a_A + P(II_R^A, II_M^A)·d_A + P(I_R^A, II_M^A)·c_A + P(II_R^A, I_M^A)·b_A].

By lemma 4 and lemma 5, we can express everything in terms of coordination games only:

Σ_{C∈C} P(C)[P(I_R^C, I_R^C)·a_C + P(II_R^C, II_R^C)·d_C + P(I_R^C, II_R^C)·b_C + P(II_R^C, I_R^C)·c_C + P(II_R^C, II_R^C)·c_C + P(I_R^C, I_R^C)·b_C + P(II_R^C, I_R^C)·d_C + P(I_R^C, II_R^C)·a_C]
> Σ_{C∈C} P(C)[P(I_R^C, I_M^C)·a_C + P(II_R^C, II_M^C)·d_C + P(I_R^C, II_M^C)·c_C + P(II_R^C, I_M^C)·b_C + P(II_R^C, II_M^C)·c_C + P(I_R^C, I_M^C)·b_C + P(II_R^C, I_M^C)·a_C + P(I_R^C, II_M^C)·d_C].

This simplifies to:

Σ_{C∈C} P(C)[a_C·(P(I_R^C, I_R^C) + P(I_R^C, II_R^C)) + b_C·(P(I_R^C, II_R^C) + P(I_R^C, I_R^C)) + c_C·(P(II_R^C, I_R^C) + P(II_R^C, II_R^C)) + d_C·(P(II_R^C, II_R^C) + P(II_R^C, I_R^C))]
> Σ_{C∈C} P(C)[a_C·(P(I_R^C, I_M^C) + P(II_R^C, I_M^C)) + b_C·(P(II_R^C, I_M^C) + P(I_R^C, I_M^C)) + c_C·(P(I_R^C, II_M^C) + P(II_R^C, II_M^C)) + d_C·(P(II_R^C, II_M^C) + P(I_R^C, II_M^C))].

Now let us split into the cases a>d and a<d, and consider a>d first. Notice that, by lemma 6, the case a=d is irrelevant for discriminating between R and M. If a>d, by lemma 6 we can eliminate the cases where R plays II while M plays I:

Σ_{C∈C: a>d} P(C)[a_C·(P(I_R^C, I_R^C) + P(I_R^C, II_R^C) − P(I_R^C, I_M^C)) + b_C·(P(I_R^C, II_R^C) + P(I_R^C, I_R^C) − P(I_R^C, I_M^C)) + c_C·(P(II_R^C, I_R^C) + P(II_R^C, II_R^C) − P(I_R^C, II_M^C) − P(II_R^C, II_M^C)) + d_C·(P(II_R^C, II_R^C) + P(II_R^C, I_R^C) − P(II_R^C, II_M^C) − P(I_R^C, II_M^C))] > 0.

We now distinguish between two cases: (1) a−c = d−b and (2) a−c ≠ d−b. Notice that P(I_R^C, II_R^C) ≠ 0 if and only if case (1) obtains, and that a>d together with (1) implies II_M^C. Then, from (1) we have:Footnote 9

Σ_{C∈C: a>d} P(C)[a_C·(1/4 + 1/4) + b_C·(1/4 + 1/4) + c_C·(1/4 + 1/4 − 1/2 − 1/2) + d_C·(1/4 + 1/4 − 1/2 − 1/2)] > 0.

Since we have assumed a−c = d−b, that is, a+b = c+d, this strict inequality is not satisfied. We have instead:

Σ_{C∈C: a>d} P(C)[(1/2)a_C + (1/2)b_C − (1/2)c_C − (1/2)d_C] = 0.

This means that where a_C > d_C and (1) is the case, R and M are equally fit. This changes when we turn to (2). In that case, since a_C > d_C ⇒ (I_M^C ⇒ I_R^C) by lemma 6, we have that P(I_R^C, I_R^C) − P(I_R^C, I_M^C) = P(I_R^C, II_M^C). Moreover, when a_C > d_C, b_C ≥ c_C implies b_C − d_C > c_C − a_C (see lemma 6). Consequently, when M plays either I or ((1/2)I; (1/2)II), R always plays I. Hence, whenever a_C > d_C and (2) obtain, it also holds that P(II_R^C, II_M^C) = P(II_R^C, II_R^C). In this case we can simplify to:

Σ_{C∈C: a>d} P(C)[P(I_R^C, II_M^C)·(a_C + b_C − c_C − d_C)] > 0.

We know that I_R^C implies a_C − c_C ≥ d_C − b_C. Since in case (2) we have assumed a_C − c_C ≠ d_C − b_C, it follows that a_C − c_C > d_C − b_C. Hence, the inequality

Σ_{C∈C: a>d} P(C)[P(I_R^C, II_M^C)·(a_C + b_C − c_C − d_C)] > 0

is satisfied. So, when a_C > d_C, R strictly dominates M. Symmetrically, starting from a<d and distinguishing between the two cases (1) and (2) as before, in the end we get:

  • (1) Σ_{C∈C: a<d} P(C)[−(1/2)a_C − (1/2)b_C + (1/2)c_C + (1/2)d_C] = 0; and

  • (2) Σ_{C∈C: a<d} P(C)[P(II_R^C, I_M^C)·(−a_C − b_C + c_C + d_C)] > 0.

Hence, we can conclude that F_{C∪A}(R, R) > F_{C∪A}(M, R), that is, inequality (i′) holds in the class C ∪ A.

It remains to be shown that F_{C∪A}(M, M) < F_{C∪A}(R, M). As before, spell this out as:

Σ_{C∈C} P(C)[P(I_M^C, I_M^C)·a_C + P(II_M^C, II_M^C)·d_C + P(I_M^C, II_M^C)·b_C + P(II_M^C, I_M^C)·c_C]
+ Σ_{A∈A} P(A)[P(I_M^A, I_M^A)·a_A + P(II_M^A, II_M^A)·d_A + P(I_M^A, II_M^A)·b_A + P(II_M^A, I_M^A)·c_A]
< Σ_{C∈C} P(C)[P(I_R^C, I_M^C)·a_C + P(II_R^C, II_M^C)·d_C + P(I_R^C, II_M^C)·b_C + P(II_R^C, I_M^C)·c_C]
+ Σ_{A∈A} P(A)[P(I_R^A, I_M^A)·a_A + P(II_R^A, II_M^A)·d_A + P(I_R^A, II_M^A)·b_A + P(II_R^A, I_M^A)·c_A].

Similarly to the derivation above, we consider a>d first, and we now distinguish between (1) b=c, (2) b>c, and (3) b<c. Notice that either (1) or (2), together with a>d, implies I_R^C. Then we obtain:Footnote 10

  • (1) Σ_{C∈C: a>d} P(C)[−(1/2)a_C − (1/2)b_C + (1/2)c_C + (1/2)d_C] < 0;

  • (2) Σ_{C∈C: a>d} P(C)[a_C·(P(I_M^C, I_M^C) − P(I_R^C, I_M^C)) + b_C·(P(I_M^C, I_M^C) − P(I_R^C, I_M^C))] = 0;

  • (3) Σ_{C∈C: a>d} P(C)[−a_C·P(I_R^C, II_M^C) − b_C·P(I_R^C, II_M^C) + c_C·(P(II_M^C, II_M^C) − P(II_R^C, II_M^C)) + d_C·(P(II_M^C, II_M^C) − P(II_R^C, II_M^C))] ≤ 0.

When a<d, the derivation proceeds symmetrically and we get:

  • (1) Σ_{C∈C: a<d} P(C)[(1/2)a_C + (1/2)b_C − (1/2)c_C − (1/2)d_C] < 0;

  • (2) Σ_{C∈C: a<d} P(C)[a_C·(P(I_M^C, I_M^C) − P(I_R^C, I_M^C)) + b_C·(P(I_M^C, I_M^C) − P(I_R^C, I_M^C)) − c_C·P(II_R^C, I_M^C) − d_C·P(II_R^C, I_M^C)] ≤ 0;

  • (3) Σ_{C∈C: a<d} P(C)[c_C·(P(II_M^C, II_M^C) − P(II_R^C, II_M^C)) + d_C·(P(II_M^C, II_M^C) − P(II_R^C, II_M^C))] = 0.

Finally, we can conclude that F_{C∪A}(M, M) < F_{C∪A}(R, M).

When we have i.i.d. sampling with continuous support, games in W and B never occur, and the proof for the remaining cases reduces to the proof of proposition 2 with s=0 and t=1. QED
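For a concrete three-element support, proposition 1 can also be verified by brute force. The sketch below enumerates all 81 games with i.i.d. uniform payoffs in {0, 1, 2}, computes each type's (possibly mixed) choice under maximal imprecision, and averages the four pairwise fitnesses; the encoding is our own, not the authors':

```python
from itertools import product

def mixed_choice(scores):
    """Uniform randomization over the acts minimizing `scores` (footnote 3)."""
    m = min(scores)
    best = [i for i, s in enumerate(scores) if s == m]
    return [(1 / len(best)) if i in best else 0.0 for i in range(2)]

def strategies(a, b, c, d):
    # Row player's payoffs: pi(I,I)=a, pi(I,II)=b, pi(II,I)=c, pi(II,II)=d.
    # R = (reg, imp): minimize maximal regret over the coplayer's two actions.
    col_max = (max(a, c), max(b, d))
    regret = [max(col_max[0] - a, col_max[1] - b),
              max(col_max[0] - c, col_max[1] - d)]
    R = mixed_choice(regret)
    # M = (obj, imp): maximin, encoded as minimizing the negated minimal payoff.
    M = mixed_choice([-min(a, b), -min(c, d)])
    return R, M

def expected_payoff(x, y, a, b, c, d):
    """Row player's expected payoff when row mixes x and column mixes y."""
    pay = [[a, b], [c, d]]
    return sum(x[i] * y[j] * pay[i][j] for i in range(2) for j in range(2))

F = {("R", "R"): 0.0, ("M", "R"): 0.0, ("M", "M"): 0.0, ("R", "M"): 0.0}
games = list(product(range(3), repeat=4))
for a, b, c, d in games:
    R, M = strategies(a, b, c, d)
    F[("R", "R")] += expected_payoff(R, R, a, b, c, d)
    F[("M", "R")] += expected_payoff(M, R, a, b, c, d)
    F[("M", "M")] += expected_payoff(M, M, a, b, c, d)
    F[("R", "M")] += expected_payoff(R, M, a, b, c, d)
F = {k: v / len(games) for k, v in F.items()}
```

On this support the inequalities of the proposition come out strict, as the proof predicts.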

Proof of Proposition 2

As shown in figure A1a, given a game (a, b, c, d), action I corresponds to the line a + (b−a)x, while action II corresponds to the line c + (d−c)x. The slope of action I is then (b−a), and the slope of action II is (d−c). Action I is steeper than action II if |b−a| > |d−c|, and action II is steeper than action I if the reverse inequality holds. Given a belief Γ = [s, t], let us define

(4)  a′ ≔ (1−s)a + sb = a + s(b−a)
     b′ ≔ (1−t)a + tb = a + t(b−a)
     c′ ≔ (1−s)c + sd = c + s(d−c)
     d′ ≔ (1−t)c + td = c + t(d−c).

Next, type R is indifferent between the two acts if

(c−a)/(c−a+b−d) − s = t − (c−a)/(c−a+b−d),

and prefers I over II if

(c−a)/(c−a+b−d) − s > t − (c−a)/(c−a+b−d),

that is, if the crossing point of the two lines lies closer to t than to s.

For succinctness, let us abbreviate Z ≔ c−a+b−d. Whenever |d−c| > |b−a| and (c−a)/Z − s ≥ t − (c−a)/Z, it is the case that a′ > d′, so that M prefers I over II. Indeed, when s = t = (c−a)/Z, we have that a′ = d′ = (cb−ad)/Z. When we enlarge the interval Γ by moving s to the left of (c−a)/Z and t to the right of (c−a)/Z by the same extent, so that (c−a)/Z − s = t − (c−a)/Z, we get that a′ > d′, since a′ moved by

(a−b)·((c−a)/Z − s),

while d′ moved by

(d−c)·(t − (c−a)/Z).

Consequently, in a game where action II is steeper than action I, the only possible way in which the two types can differ is M playing I and R playing II, since we will never observe M playing II and R playing I in such games. By a similar argument, when action I is steeper than action II, the only possible way in which the two types can differ is M playing II and R playing I. Hence, whenever one of the actions is steeper than the other, the two types can differ in only one of the two possible ways. Finally, when action I and action II are equally steep, |d−c| = |b−a|, one type strictly prefers one action if and only if the other type does too. We say that a game is relevant if the two types play different actions.
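For concreteness, the two decision rules used in this proof can be sketched as follows for a game whose crossing point (c−a)/Z lies inside the belief interval [s, t]: type M compares the worst endpoint values min(a′, b′) and min(c′, d′) from equation (4), while type R compares the distances from the crossing point to s and to t. Function names and the example game are our own:

```python
def endpoint_values(a, b, c, d, s, t):
    """Equation (4): the expected values of the two acts at the endpoints
    of the belief interval [s, t]."""
    a1 = a + s * (b - a)   # act I at s
    b1 = a + t * (b - a)   # act I at t
    c1 = c + s * (d - c)   # act II at s
    d1 = c + t * (d - c)   # act II at t
    return a1, b1, c1, d1

def obj_choice(a, b, c, d, s, t):
    """M = (obj, [s,t]): maximin over the interval; since expected values
    are linear in the belief, the minima sit at the endpoints."""
    a1, b1, c1, d1 = endpoint_values(a, b, c, d, s, t)
    lo_I, lo_II = min(a1, b1), min(c1, d1)
    return "I" if lo_I > lo_II else "II" if lo_II > lo_I else "tie"

def reg_choice(a, b, c, d, s, t):
    """R = (reg, [s,t]): prefers I iff the crossing point x* = (c-a)/Z is
    farther from s than from t (the indifference condition in the text)."""
    Z = c - a + b - d
    x_star = (c - a) / Z
    left, right = x_star - s, t - x_star
    return "I" if left > right else "II" if right > left else "tie"

# Example: the coordination game a=2, b=1.5, c=1, d=3 with Gamma=[0, 1]
# satisfies s < x* < (s+t)/2 and has M preferring I, so the types differ.
print(obj_choice(2, 1.5, 1, 3, 0, 1), reg_choice(2, 1.5, 1, 3, 0, 1))  # prints: I II
```

In the example, M plays I while R plays II, so the game is relevant in the sense just defined.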

Figure A1. Examples of coordination game C and corresponding anticoordination game ψ(C).

Since the two cases are symmetric, consider the case in which action II is steeper, |d−c| > |b−a|. Suppose that c and d have been drawn such that c < d. For |d−c| > |b−a| to hold in a relevant game, the game has to be a coordination game: otherwise, in an anticoordination game, we would have b > d and c > a, so that |d−c| < |b−a|.

If the game is a coordination game, the two types choose differently if and only if

s < (c−a)/Z < (s+t)/2

and

b′ > c′.

Consider all the points P ∈ (s, (s+t)/2). Each of these points can be expressed as a linear combination

(ks + (n−k)t)/n,

for k > n/2. By simple algebra, for each such point P and c < d, the point

c + [(ks + (n−k)t)/n]·(d−c)

expresses the expected value of action II at P = (ks + (n−k)t)/n; that is, it is the y-value of the line corresponding to action II when

x = (ks + (n−k)t)/n.

Then, given a two-dimensional point (x_0, y_0), the sheaf of lines passing through that point is defined by all the equations

y − y_0 = m(x − x_0)

for m ∈ ℝ. (The vertical line is excluded from the sheaf, but it is not relevant for the proof.) Consequently, given the two-dimensional point

((ks + (n−k)t)/n, c + [(ks + (n−k)t)/n]·(d−c)),

the sheaf of lines passing through that point is defined by the set of equations, for m ∈ ℝ:

y = m(x − (ks + (n−k)t)/n) + c + [(ks + (n−k)t)/n]·(d−c).

If, for each equation in the set, we define

(5)  a ≔ −m·[(ks + (n−k)t)/n] + c + [(ks + (n−k)t)/n]·(d−c)
     b ≔ m·(1 − (ks + (n−k)t)/n) + c + [(ks + (n−k)t)/n]·(d−c),

then each equation corresponds to a possible game

      I    II
I     a    b
II    c    d

such that

(c−a)/(c−a+b−d) = (ks + (n−k)t)/n.

By algebraic computations, the condition d−c > |a−b| is equivalent to

|m| < d−c.

Moreover, among the coordination games such that d−c > |a−b|, the relevant ones are those that also satisfy b′ > c′; otherwise type M would not (strictly) prefer I over II. If we rewrite a′, b′, c′, d′ as in equation (4), the inequality b′ > c′ reduces to

m·[k/(k−n)] < d−c.

By symmetric arguments, whenever c > d, the only possible relevant games for the same interval [s, t] are anticoordination games for which

(s+t)/2 < (c−a)/Z < t,

and such that c−d > |a−b| and a′ > d′. Similarly, for k < n/2, these correspond to all games

y = m(x − (ks + (n−k)t)/n) + c + [(ks + (n−k)t)/n]·(d−c)

such that

|m| < c−d

and

m·[k/(n−k)] > d−c.

Consider now the following bijective function ψ: C → A between coordination and anticoordination games that, for d > c, associates the coordination game

y = m(x − (ks + (n−k)t)/n) + c + [(ks + (n−k)t)/n]·(d−c)

with the anticoordination game

y = −m(x − ((n−k)s + kt)/n) + d + [((n−k)s + kt)/n]·(c−d).

Essentially, ψ changes c to d, m to −m, and k to n−k. In particular, note that ψ is a bijection that, for a fixed interval [s, t], sends relevant coordination games to relevant anticoordination games. Figure A1 gives a graphical example of the bijection.

We can then pair these two games and consider the average fitness in {C, ψ(C)} of (reg, [s, t]) against (reg, [s, t]), and then compare it to the fitness of (obj, [s, t]) against (reg, [s, t]); as before, we denote (reg, [s, t]) by R and (obj, [s, t]) by M. In the pair of relevant games C and ψ(C), R strictly dominates M if

F_{C,ψ(C)}(R, R) > F_{C,ψ(C)}(M, R)

and

F_{C,ψ(C)}(R, M) > F_{C,ψ(C)}(M, M).

Consider the first inequality. Since both C and ψ(C) are relevant with respect to the interval [s, t], we have F_{C,ψ(C)}(R, R) = d + c and F_{C,ψ(C)}(M, R) = b + ψ(b), where ψ(b) is the b-payoff of ψ(C). Therefore, the first inequality is equivalent to

(d + c)/2 > (b + ψ(b))/2,

which can be spelled out as

d + c > [m·(1 − (ks + (n−k)t)/n) + c + ((ks + (n−k)t)/n)·(d−c)] + [−m·(1 − ((n−k)s + kt)/n) + d + (((n−k)s + kt)/n)·(c−d)].

After some computations, the previous inequality boils down to

d − c > m,

which we know is the case, since we have seen that the condition d−c > |a−b| is equivalent to d−c > |m|.

Finally, from the previous argument it follows that, for any given interval [s, t], if we consider the set of all relevant coordination games, denoted C_r, and the set of all relevant anticoordination games, denoted A_r, then it holds that

F_{C_r∪A_r}(R, R) > F_{C_r∪A_r}(M, R).

Let us now check that the second inequality required for R to strictly dominate M also holds. In {C, ψ(C)}, it is equivalent to

d + c > a + ψ(a),

which amounts to m < d−c. As before, it then follows that

F_{C_r∪A_r}(R, M) > F_{C_r∪A_r}(M, M).

Therefore, R strictly dominates M. QED

Footnotes

1. Some of these contributions are closely related to ours. Bednar and Page (2007) use a multigame framework, composed of a fixed selection of six possible games, to study the emergence of different cultural behaviors, and model agents as finite-state automata playing games from the fixed selection. Zollman (2008) explains seemingly “irrational” fair behavior in social dilemmas (like the Ultimatum game) by means of a model where agents have to play the Ultimatum game together with the Nash bargaining game, but are constrained to choose the same strategy for both games. Finally, Rayo and Becker (2007) consider, in a more decision-theoretic setting, what subjective utility function a cognitively limited agent should be endowed with in order to maximize her evolutionary fitness. Our framework can be viewed as a generalization of those models, mainly in that here players do not necessarily have any specific cognitive limitations, and we allow for larger and possibly variable classes of games.

2. Since payoff functions are symmetric, we simply write π_G(a, a′) for π_1^G(a_1, a_2) and A ≔ A_1 = A_2, as usual. However, notice that all definitions and results can be extended to more general cases.

3. Whenever a choice mechanism would not select a unique action, we assume that the player chooses one of the equally optimal actions at random. That is, F_G(c, c′) = Σ_{a ∈ a*_c(u_c^G, Γ_c)} Σ_{a′ ∈ a*_{c′}(u_{c′}^G, Γ_{c′})} (1/|a*_c(u_c^G, Γ_c)|)·(1/|a*_{c′}(u_{c′}^G, Γ_{c′})|)·π_G(a, a′).

4. Such radical uncertainty could ensue, for example, if agents have no conception of their coplayer or her preferences. Unsophisticated agents, as considered in evolutionary game theory, might be entirely unaware of the fact that they are engaged in social decision making (see Heifetz, Meier, and Schipper [2013] for game-theoretic models of unawareness). It is therefore not ludicrous to consider radical uncertainty first and turn to more sophisticated ways of forming beliefs later (more on this below).

5. The notion of regret in decision theory dates back at least to the work of Savage (1951) and was later developed independently by Bell (1982), Fishburn (1982), and Loomes and Sugden (1982). Recently, Halpern and Pass (2012) showed how the use of regret minimization can give solutions to game-theoretic puzzles (like the Traveller's Dilemma and the Centipede game) in a way that is closer to everyday intuition and empirical data. In this article the notion of regret defined earlier is the same as in Halpern and Pass (2012).

6. Concretely, 100,000 games were sampled repeatedly by choosing independently four integers between 0 and 10 uniformly at random. For each game, the action choices of all four choice mechanisms were determined and payoffs from all pairwise encounters recorded. The number in each cell of table 1 is the average payoff for the choice mechanism listed in the row when matched with the choice mechanism in the column.

7. A more general formulation would be to define α-altruistic utility, for α ∈ [0, 1], as u_α^G(a, a′) = π_G(a, a′) + α·π_G(a′, a). Since we are not interested in the evolution of degrees of altruism, here we simply fix α = 1. The same goes for α-competitive utilities. Other possible generalizations could also take into account combinations of all these preferences with different decision rules. Maximax, minimax, and minimin, for example, would be possible rules for choice. Here we opted for maxmin because we were specifically interested in comparing maxmin and regret minimization, as these are two major alternatives for decision making.

8. Some research has recently been done along these lines. See, in particular, Mengel (2012), Mohlin (2012), and Robalino and Robson (2016).

9. Note that with only three elements in the support it is not guaranteed that case (1), together with a>d, can arise in a coordination game, whereas it is guaranteed that case (2), together with a>d, occurs with positive probability. If we take, for instance, x=5, y=2, z=1, then case (1) cannot obtain, whereas if we take x=3, y=2, z=1, both (1) and (2) may obtain (a=3, b=1, c=2, d=2 for case (1), and a=3, b=1, c=1, d=2 for case (2)). Moreover, under the assumption that a>d, having at least three elements in the support is a necessary and sufficient condition for case (2) to occur with positive probability in a coordination game. As will be clear in the following, a positive occurrence of case (2) alone is enough for the theorem to hold.

10. Note that here, when we have only three elements in the support, case (2) is impossible, but cases (1) and (3) can occur with positive probability, and this is enough for our purpose.

References

Alchian, Armen. 1950. “Uncertainty, Evolution and Economic Theory.” Journal of Political Economy 58:211–21.CrossRefGoogle Scholar
Alger, Ingela, and Weibull, Jörgen W.. 2013. “Homo Moralis: Preference Evolution under Incomplete Information and Assortative Matching.” Econometrica 81 (6): 22692302.Google Scholar
Anderson, John R. 1991. “Is Human Cognition Adaptive?Behavioral and Brain Sciences 14 (3RA): 471517.CrossRefGoogle Scholar
Battigalli, Pierpaolo et al. 2015. “Self-Confirming Equilibrium and Model Uncertainty.” American Economic Review 105 (2): 646–77.CrossRefGoogle Scholar
Becker, Gary S. 1976. “Altruism, Egoism, and Genetic Fitness: Economics and Sociobiology.” Journal of Economic Literature 14:817–26.Google Scholar
Bednar, Jenna, and Page, Scott. 2007. “Can Game(s) Theory Explain Culture? The Emergence of Cultural Behavior within Multiple Games.” Rationality and Society 19 (1): 6597.. http://rss.sagepub.com/content/19/1/65.full.pdf+html.CrossRefGoogle Scholar
Bell, David E. 1982. “Regret in Decision Making under Uncertainty.” Operations Research 30 (5): 961–81.CrossRefGoogle Scholar
Bester, Helmut, and Güth, Werner. 1998. “Is Altruism Evolutionarily Stable?Journal of Economic Behavior and Organization 34:193209.CrossRefGoogle Scholar
Bleichrodt, Han, and Wakker, Peter P.. 2015. “Regret Theory: A Bold Alternative to the Alternatives.” Economic Journal 125 (583): 493532.CrossRefGoogle Scholar
Charness, Gary, and Rabin, Matthew. 2002. “Understanding Social Preferences with Simple Tests.” Quarterly Journal of Economics 117 (3): 817–69.CrossRefGoogle Scholar
Chater, Nick, and Oaksford, Mike. 2000. “The Rational Analysis of Mind and Behavior.” Synthese 122:93131.CrossRefGoogle Scholar
de Finetti, Bruno. 1937. “La prevision: Ses lois logiques, ses sources subjectives.” Annales de l’Institute Henri Poincare 7:168.Google Scholar
Dekel, Eddie, Ely, Jeffrey C., and Ylankaya, Okan. 2007. “Evolution of Preferences.” Review of Economic Studies 74 (3): 685704.Google Scholar
Ellsberg, Daniel. 1961. “Risk, Ambiguity, and the Savage Axioms.” Quarterly Journal of Economics 75 (4): 643–69.CrossRefGoogle Scholar
Fawcett, Tim W., Hamblin, Steven, and Giraldeau, Luc-Alain. 2013. “Exposing the Behavioral Gambit: The Evolution of Learning and Decision Rules.” Behavioral Ecology 24 (1): 211.CrossRefGoogle Scholar
Fehr, Ernst, and Schmidt, Klaus M.. 1999. “A Theory of Fairness, Competition and Cooperation.” Quarterly Journal of Economics 114:817–68.CrossRefGoogle Scholar
Fishburn, Peter C. 1982. “Nontransitive Measurable Utility.” Journal of Mathematical Psychology 26 (1): 3167.CrossRefGoogle Scholar
Friedman, Milton. 1953. Essays in Positive Economics. Chicago: University of Chicago Press.Google Scholar
Gardenfors, Peter, and Sahlin, Nils-Eric. 1982. “Unreliable Probabilities, Risk Taking, and Decision Making.” Synthese 53:361–86.CrossRefGoogle Scholar
Ghirardato, Paolo, and Marinacci, Massimo. 2002. “Ambiguity Made Precise: A Comparative Foundation.” Journal of Economic Theory 102:251–89.CrossRefGoogle Scholar
Gigerenzer, Gerd, and Goldstein, Daniel G.. 1996. “Reasoning the Fast and Frugal Way: Models of Bounded Rationality.” Psychological Review 103 (4): 650–69.CrossRefGoogle ScholarPubMed
Gilboa, Itzhak, and Marinacci, Massimo. 2013. “Ambiguity and the Bayesian Paradigm.” In Advances in Economics and Econometrics, Vol. 1, ed. Daron Acemoglu, Manuel Arellano, and Eddie Dekel, 179–242. Cambridge: Cambridge University Press.CrossRefGoogle Scholar
Gilboa, Itzhak, and Schmeidler, David. 1989. “Maxmin Expected Utility with Non-unique Prior.” Journal of Mathematical Economics 18:141–53.CrossRefGoogle Scholar
Hagen, Edward H., et al. 2012. “Decision Making: What Can Evolution Do for Us?” In Evolution and the Mechanisms of Decision Making, ed. Hammerstein, Peter and Stevens, Jeffrey R., 97126. Cambridge, MA: MIT Press.Google Scholar
Halpern, Joseph Y., and Pass, Rafael. 2012. “Iterated Regret Minimization: A New Solution Concept.” Games and Economic Behavior 74:184207.CrossRefGoogle Scholar
Hammerstein, Peter, and Stevens, Jeffrey R.. 2012. “Six Reasons for Invoking Evolution in Decision Theory.” In Evolution and the Mechanisms of Decision Making, ed. Hammerstein, Peter and Stevens, Jeffrey R.. Cambridge, MA: MIT Press.CrossRefGoogle Scholar
Harley, Calvin B. 1981. “Learning the Evolutionarily Stable Strategy.” Journal of Theoretical Biology 89 (4): 611–33.CrossRefGoogle ScholarPubMed
Heifetz, Aviad, Meier, Martin, and Schipper, Burkhard C.. 2013. “Dynamic Unawareness and Rationalizable Behavior.” Games and Economic Behavior 81:5068.CrossRefGoogle Scholar
Levi, Isaac. 1974. “On Indeterminate Probabilities.” Journal of Philosophy 71 (13): 391418.CrossRefGoogle Scholar
Loomes, Graham, and Sugden, Robert. 1982. “Regret Theory: An Alternative Theory of Rational Choice under Uncertainty.” Economic Journal 92 (368): 805–24.CrossRefGoogle Scholar
Maynard Smith, John. 1982. Evolution and the Theory of Games. Cambridge: Cambridge University Press.CrossRefGoogle Scholar
McNamara, John M. 2013. “Towards a Richer Evolutionary Game Theory.” Journal of the Royal Society Interface 10 (88): 19.CrossRefGoogle ScholarPubMed
Mengel, Friederike. 2012. “Learning across Games.” Games and Economic Behavior 74 (2): 601–19.CrossRefGoogle Scholar
Mohlin, Erik. 2012. “Evolution of Theories of Mind.” Games and Economic Behavior 75 (1): 299318.CrossRefGoogle Scholar
Nowak, Martin A. 2006. Evolutionary Dynamics: Exploring the Equations of Life. Cambridge, MA: Harvard University Press.
O’Connor, Cailin. 2015. “Evolving to Generalize: Trading Precision for Speed.” British Journal for the Philosophy of Science. doi:10.1093/bjps/axv038.
Rayo, Luis, and Becker, Gary S.. 2007. “Evolutionary Efficiency and Happiness.” Journal of Political Economy 115:302–37.
Robalino, Nikolaus, and Robson, Arthur. 2016. “The Evolution of Strategic Sophistication.” American Economic Review 106 (4): 1046–72.
Robson, Arthur, and Samuelson, Larry. 2011. “The Evolutionary Foundations of Preferences.” In Handbook of Social Economics, vol. 1A, ed. J. Benhabib, M. Jackson, and A. Bisin. Amsterdam: North-Holland.
Savage, Leonard Jimmie. 1951. “The Theory of Statistical Decision.” Journal of the American Statistical Association 46:55–67.
Savage, Leonard Jimmie. 1954. The Foundations of Statistics. New York: Dover.
Scheibehenne, Benjamin, Rieskamp, Jörg, and Wagenmakers, Eric-Jan. 2013. “Testing the Adaptive Toolbox Models: A Bayesian Hierarchical Approach.” Psychological Review 120 (1): 39–64.
Skyrms, Brian, and Zollman, Kevin J. S.. 2010. “Evolutionary Considerations in the Framing of Social Norms.” Politics, Philosophy and Economics 9 (3): 265–73. http://ppe.sagepub.com/content/9/3/265.full.pdf+html.
Smead, Rory, and Zollman, Kevin J. S.. 2013. “The Stability of Strategic Plasticity.” Unpublished manuscript, Carnegie Mellon University.
Taylor, Peter D., and Jonker, Leo B.. 1978. “Evolutionary Stable Strategies and Game Dynamics.” Mathematical Biosciences 40 (1–2): 145–56.
Trautmann, Stefan T., and van de Kuilen, Gijs. 2016. “Ambiguity Attitudes.” In The Wiley Blackwell Handbook of Judgment and Decision Making, ed. Keren, G. and Wu, G.. Chichester: Wiley-Blackwell.
Tversky, Amos, and Kahneman, Daniel. 1981. “The Framing of Decisions and the Psychology of Choice.” Science 211 (4481): 453–58.
von Neumann, John, and Morgenstern, Oskar. 1944. Theory of Games and Economic Behavior. Princeton, NJ: Princeton University Press.
Walley, Peter. 1996. “Inferences from Multinomial Data: Learning about a Bag of Marbles.” Journal of the Royal Statistical Society 58 (1): 3–57.
Zollman, Kevin J. S. 2008. “Explaining Fairness in Complex Environments.” Politics, Philosophy and Economics 7 (1): 81–97.
Zollman, Kevin J. S., and Smead, Rory. 2010. “Plasticity and Language: An Example of the Baldwin Effect?” Philosophical Studies 147 (1): 7–21.
Figure 1. A coordination game (left) and the associated regret representation (right).

Table 1. Average Evolutionary Fitness from Monte Carlo Simulations of 100,000 Symmetric 2 × 2 Games

Table 2. Average Evolutionary Fitness from Monte Carlo Simulations of 100,000 Symmetric 2 × 2 Games

Table 3. Average Evolutionary Fitness for 100,000 Randomly Generated n × n Symmetric Games with n Randomly Drawn from {2, …, 10}

Table 4. Meta-game for the Evolutionary Competition between Subjective Utilities When Beliefs Are Exogenously Given (see Main Text)

Table 5. Expected Fitness of Choice Mechanisms Approximated from 100,000 Simulated Solitary Decision Problems (see Main Text)

Figure A1. Examples of coordination game C and corresponding anticoordination game ψ(C).