1. Introduction
If agents deal with a rich and variable environment, they have to face many different choice situations. Standard evolutionary game models frequently simplify reality in at least two ways. First, the environment is represented as a fixed stage game; second, the focus of evolutionary selection is behavior for that stage game alone. In contrast, some argue for studying the evolutionary competition of general choice mechanisms in a rich and variable environment (e.g., Hammerstein and Stevens 2012; Fawcett, Hamblin, and Giraldeau 2013; McNamara 2013). In response, and adding to recent like-minded approaches, this article introduces a general meta-game model that conservatively extends the scope of evolutionary game theory to deal with the evolutionary selection of choice mechanisms in variable environments (see also Harley 1981; Bednar and Page 2007; Rayo and Becker 2007; Zollman 2008; Skyrms and Zollman 2010; Zollman and Smead 2010; Smead and Zollman 2013; O'Connor 2015).
A choice mechanism associates decision situations with action choices. A crucial part of a choice mechanism is the subjective representation of the decision situation, in particular the manner of forming preferences and beliefs about a possibly uncertain world. To show the usefulness of the meta-game approach, this article asks: which preference and belief representations are ecologically valuable and lead to high fitness? The evolution of preferences has been the subject of recent interest in theoretical economics (e.g., Dekel, Ely, and Yilankaya 2007; Robson and Samuelson 2011; Alger and Weibull 2013). Here, we argue that questions of preference evolution should take variability in uncertainty representation into account as well. We demonstrate that if agents have imprecise probabilistic beliefs (e.g., Levi 1974; Gärdenfors and Sahlin 1982; Walley 1996), faithful and objective representations in terms of true evolutionary fitness can be outperformed by subjective (e.g., regret-based) preference representations that deviate from the true fitness that natural selection operates on.
The article is organized as follows. Section 2 sets the scene by reviewing different perspectives on rational choice. Section 3 introduces the meta-game approach. In doing so, it covers key notions such as choice mechanisms, decision rules and subjective representations, all with an eye toward the evolutionary application of section 4. Section 5 contains the main results for that application, and section 6 discusses some interesting extensions. Finally, section 7 concludes.
2. Rationality and Subjective Representations
The standard textbook definition of rationality in economics and decision theory traces back to the seminal work of de Finetti (1937), von Neumann and Morgenstern (1944), and Savage (1954). It says that a choice is rational only if it maximizes (subjective) expected utility. Expected utility is subjective in the sense that it is a function of the subjective beliefs and subjective preferences of the decision maker (DM). To wit, a choice can be rational, that is, the best choice from the DM's point of view, even if it is based on peculiar beliefs and/or aberrant preferences.
If beliefs and preferences are subjective, there is room for rationalization, or redescriptionism, of observable behavior. For example, in the case of social decision making, admitting considerations of fairness allows us to describe as rational empirically observed behavior, such as that in experimental Prisoner's Dilemmas or public goods games, that might otherwise appear irrational (e.g., Fehr and Schmidt 1999; Charness and Rabin 2002).
The main objection to redescriptionism is that, without additional constraints, the notion of rationality is likely to collapse: given the freedom to adjust beliefs and preferences at will, it seems possible to deem rational almost everything that is observed. Normativism therefore emphasizes that ascriptions of beliefs and preferences should be constrained by normative considerations of rationality as well: for example, subjective beliefs should reflect objective chance where possible, and subjective preferences should be oriented toward tracking objective fitness. For instance, profit maximization seems a necessary requirement for survival in a competitive market because only firms that maximize profit will survive in the long run (e.g., Alchian 1950; Friedman 1953).
An alternative view on the rationality of choice is adaptationism (e.g., Anderson 1991; Chater and Oaksford 2000; Hagen et al. 2012). Adaptationism aims to explain rational behavior by appealing to evolutionary considerations: DMs have acquired choice mechanisms that have proved adaptive with respect to the variable environment in which they evolved. A choice mechanism can be a set of distinct heuristics (the DM's adaptive toolbox) that have little in common (e.g., Gigerenzer and Goldstein 1996; Tversky and Kahneman 1981; Scheibehenne, Rieskamp, and Wagenmakers 2013). But to relate closely to the literature on the evolution of preferences and to the philosophical debate about the nature of rational choice, we here suggest thinking of a choice mechanism as a map from choice situations to action choices that includes an explicit level of subjective representation of the situation. Specifically, a subjective representation is a general way of forming preferences and beliefs about the choice situation. We are most interested in the question of which subjective representations, and which choice mechanisms in general, are better than others from an evolutionary point of view.
3. Choice Mechanisms and Meta-Games
We view a choice mechanism as the combination of three different things: a subjective utility (or preference), a subjective belief, and a decision rule. In general, the agent's action choice will depend both on the agent's utility at the possible outcomes of the choice situation and on the agent's beliefs about the realization of these outcomes. The decision rule then combines the agent's subjective utility and belief and dictates how the agent should act: a decision rule is a function $d$ that associates an action choice with the agent's utility and belief:

$$d : (u, \Gamma) \mapsto a \in A.$$

The subjective utility of an agent can be formally expressed by a function $u : A \times W \to \mathbb{R}$, where $A$ stands for a (finite) set of actions available to the agent and $W$ is a (finite) set of possible states of the world. There are many different ways to describe beliefs, but for concreteness of later applications we here assume that the agent's beliefs are represented in terms of a (possibly singleton) convex compact set of probability functions $\Gamma \subseteq \Delta(W)$ over the possible states of the world. Given a utility $u$ and a belief $\Gamma$, examples of well-known decision rules from the literature that we will encounter later are:
1. Maxmin: $d(u, \Gamma) \in \arg\max_{a \in A} \min_{\mu \in \Gamma} \sum_{w \in W} \mu(w)\, u(a, w)$;
2. Maximax: $d(u, \Gamma) \in \arg\max_{a \in A} \max_{\mu \in \Gamma} \sum_{w \in W} \mu(w)\, u(a, w)$;
3. Laplace rule: $d(u, \Gamma) \in \arg\max_{a \in A} \frac{1}{|W|} \sum_{w \in W} u(a, w)$;
4. Expected utility maximization (for $\Gamma = \{\mu\}$ a singleton): $d(u, \Gamma) \in \arg\max_{a \in A} \sum_{w \in W} \mu(w)\, u(a, w)$.
It is worth noticing that both maxmin and maximax boil down to expected utility maximization when the set Γ is a singleton, and in turn expected utility maximization reduces to the Laplace rule when the belief μ is a uniform probability over the states.
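For concreteness, the four rules can be sketched as follows (a minimal illustration with function names of my own choosing, not from the text; a belief is passed as a finite list of probability vectors over states, which suffices for the convex sets used below because a linear expectation attains its extrema at the extreme points of a convex compact set):

```python
def expected_utility(u_row, mu):
    # expectation of one act's utilities u_row over states, under probability vector mu
    return sum(p * x for p, x in zip(mu, u_row))

def maxmin(U, Gamma):
    # U[a][w]: utility of act a in state w; Gamma: finite set of probability vectors
    return max(range(len(U)), key=lambda a: min(expected_utility(U[a], mu) for mu in Gamma))

def maximax(U, Gamma):
    return max(range(len(U)), key=lambda a: max(expected_utility(U[a], mu) for mu in Gamma))

def laplace(U):
    uniform = [1.0 / len(U[0])] * len(U[0])
    return max(range(len(U)), key=lambda a: expected_utility(U[a], uniform))

def eu_max(U, mu):
    # expected utility maximization for a singleton belief {mu}
    return max(range(len(U)), key=lambda a: expected_utility(U[a], mu))
```

With a maximally imprecise belief over two states, Gamma is the pair of degenerate probability vectors [1, 0] and [0, 1]; with U = [[5, 0], [3, 3]], maxmin then picks the safe act 1 while maximax picks the risky act 0, and for a singleton Gamma both coincide with expected utility maximization, as noted above.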
As mentioned previously, for a choice mechanism to prescribe an action, the decision rule needs to be given a specific utility u and belief Γ as input. We call the pair (u, Γ) a subjective representation of the decision situation. In the following, we investigate the evolutionary fitness of general and systematic ways of forming such subjective representations across many different decision situations.
A fitness game is an interactive decision situation. For a given fitness game $G$, let us denote the evolutionary payoff, or fitness, of player $i$ by the function $\pi_i : A_1 \times A_2 \to \mathbb{R}$, where $A_i$ is player $i$'s (finite) set of actions. For simplicity of exposition we assume that all games that are played are symmetric two-player games, where $A_1 = A_2 = A$, $\pi_1(a, a') = \pi_2(a', a)$ for all $a, a' \in A$, and we write $\pi$ for $\pi_1$. The fitness of a choice mechanism $c$ with decision rule $d_c$ and subjective representation $(u_c, \Gamma_c)$ is measured in terms of the expected evolutionary payoff of $c$. Formally, the fitness of choice mechanism $c$ against choice mechanism $c'$ in a symmetric two-player game $G$ is given by:

$$\pi_G(c, c') = \pi\big(d_c(u_c, \Gamma_c),\, d_{c'}(u_{c'}, \Gamma_{c'})\big).$$
Given the game-theoretic setting, the subjective utility is now a function $u_c : A \times A \to \mathbb{R}$, and the subjective belief $\Gamma_c$ is a set of probability functions over the coplayer's actions, $\Gamma_c \subseteq \Delta(A)$.
Going beyond a single fixed fitness game, we consider a class of possible games. For concreteness, let $\mathcal{G}$ be a class of two-player symmetric games, together with a probability measure $P_G(G')$ for the occurrence probability of game $G' \in \mathcal{G}$. Intuitively, the probability $P_G$ encodes the statistical properties of the environment. A meta-game is then a tuple $\langle CM, \mathcal{G}, P_G, F \rangle$, where $CM$ is a set of choice mechanisms, $\mathcal{G}$ is a class of possible games, $P_G(G')$ is the probability of game $G'$ to occur, and $F : CM \times CM \to \mathbb{R}$ is the (meta-)fitness function, defined as:

$$F(c, c') = \sum_{G' \in \mathcal{G}} P_G(G')\, \pi_{G'}(c, c'). \tag{1}$$

Hence, $F(c, c')$ determines the evolutionary payoff of choice mechanism $c$ against $c'$ in the meta-game. The set $CM$ can be thought of as the set of choice mechanisms that are present within a given population playing the games from the class $\mathcal{G}$. Consequently, it is possible to compute the average fitness of $c$ against the population, which is given by:

$$F(c) = \sum_{c' \in CM} P_c(c')\, F(c, c'), \tag{2}$$

where $P_c(c')$ is the probability of encountering a coplayer with choice mechanism $c'$.
Meta-games are then abstract models for the evolutionary competition between choice mechanisms in interactive decision-making contexts. Standard notions of evolutionary game theory apply to meta-games as well. For example, a choice mechanism $c$ is a strict Nash equilibrium if $F(c, c) > F(c', c)$ for all $c' \neq c$; it is evolutionarily stable if for all $c' \neq c$ either (i) $F(c, c) > F(c', c)$ or (ii) $F(c, c) = F(c', c)$ and $F(c, c') > F(c', c')$; it is neutrally stable if for all $c' \neq c$ either (i) $F(c, c) > F(c', c)$ or (ii) $F(c, c) = F(c', c)$ and $F(c, c') \geq F(c', c')$ (Maynard Smith 1982). Similarly, evolutionary dynamics can be applied to meta-games. Later we will also turn toward a dynamical analysis in terms of the replicator dynamics (Taylor and Jonker 1978) and the replicator mutator dynamics (e.g., Nowak 2006).
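For the dynamical analyses to come, the discrete-time replicator dynamics on a meta-game payoff matrix can be sketched as follows (a minimal implementation assuming strictly positive payoffs, as in the meta-games below; function names are my own):

```python
def replicator_step(F, x):
    # F[i][j]: meta-game fitness of type i against type j; x: current type frequencies
    n = len(x)
    fit = [sum(F[i][j] * x[j] for j in range(n)) for i in range(n)]  # F(c) against the population
    avg = sum(x[i] * fit[i] for i in range(n))                       # average population fitness
    return [x[i] * fit[i] / avg for i in range(n)]                   # shares grow with relative fitness

def replicate(F, x, steps=500):
    for _ in range(steps):
        x = replicator_step(F, x)
    return x
```

When one type strictly dominates another, as in the analysis of section 5, the dominated type's share is driven to zero from any interior starting point.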
4. Evolution of Preferences
To demonstrate the usefulness of a meta-game approach, we compare a selection of general ways of forming belief and preference representations against each other. As for subjective preferences, consider initially:
1. the objective utility, defined by: $\mathrm{obj}(a, a') = \pi(a, a')$ for all $a, a' \in A$;
2. the regret utility, defined by: $\mathrm{reg}(a, a') = \pi(a, a') - \max_{a'' \in A} \pi(a'', a')$ for all $a, a' \in A$.
As motivation for this comparison, we stress that regret minimization is one of the main alternatives to utility (or value) maximization in the literature on decision criteria (see also Bleichrodt and Wakker 2015). For a start, the subjective beliefs that we take into consideration are likewise two:
1. prc: a precise uniform belief $\Gamma = \{\mu\}$ such that $\mu(a) = 1/|A|$ for all $a \in A$;
2. imp: a maximally imprecise belief $\Gamma = \Delta(A)$.
Although a thorough discussion of this issue goes beyond the scope of this work, let us note that these two kinds of belief underlie two different and alternative views on uncertainty. Faced with uncertain events, a strict Bayesian will always form a precise belief, specified by a single probability $\mu$. In the absence of any information about future uncertain events, the Bayesian would typically invoke the principle of insufficient reason and accordingly choose a uniform probability over the possible outcomes. In contrast, others have argued, against the Bayesian paradigm, that we are not obligated to represent a belief by means of a single probability measure (e.g., Gilboa and Marinacci 2013). They argue instead for a more encompassing account, according to which uncertainty can be unmeasurable and represented by a (convex and compact) set of probabilities (e.g., Gilboa and Schmeidler 1989). This line of thought has its origin in decision theory, motivated by Ellsberg's famous paradoxes (Ellsberg 1961), and appears extremely relevant in game-theoretic contexts too. Indeed, in a recent paper Battigalli et al. (2015, 646) write: "Such [unmeasurable] uncertainty is inherent in situations of strategic interaction. This is quite obvious when such situations have been faced only a few times."
In evolutionary game theory, for instance, players obviously face uncertainty about the composition of the population that they are part of, and consequently about the (type of) coplayer that they are randomly paired with at each round and about the coplayer's action. In the case of a complete lack of information about the composition of the population, a non-Bayesian player would thus entertain maximal unmeasurable uncertainty, that is, a maximally imprecise belief. As already anticipated, we will see that the way agents form beliefs, and the possibility of holding imprecise beliefs in particular, can have a fundamental impact on their evolutionary success.
As for the decision rule, we assume that players use the maxmin rule. This is in line with many representation results for decision making under unmeasurable uncertainty (e.g., Gilboa and Schmeidler 1989; Ghirardato and Marinacci 2002) and seems corroborated by empirical findings too. Ellsberg's paradoxes are prominent examples (Ellsberg 1961), and evidence from the experimental literature suggests that agents are generally averse to unmeasurable uncertainty (e.g., Trautmann and van de Kuilen 2016).
Finally, note that when the maxmin rule acts on subjective representations of type (obj, imp), that is, objective preferences and imprecise beliefs, the generated behavior corresponds to the classic maxmin strategy (von Neumann and Morgenstern 1944). When the maxmin rule acts on subjective representation (reg, imp), the agent's behavior is known as regret minimization. Two facts follow from these observations. The first is related to our focus on the different types of uncertainty that players may entertain.
Fact 1 For any precise (Bayesian) belief μ, maximization of expected (objective) utility based on μ and minimization of expected regret based on μ are behaviorally equivalent.
The second fact highlights another behavioral equivalence, which we will make use of shortly in the following section.
Fact 2 In the class of 2 × 2 symmetric games, the acts selected by the Laplace rule are exactly the acts selected by regret minimization.
Here is a simple example that shows these choice mechanisms in action. Consider the coordination fitness game $G$ depicted in figure 1a. Since the game is symmetric, it suffices to specify the evolutionary payoffs for the row player. Figure 1a also represents the objective utility $\mathrm{obj}_G$, since $\mathrm{obj}_G = \pi$ by definition, whereas figure 1b depicts the representation of $G$ in terms of regret-based utilities. While classic maxmin is indifferent between I and II (fig. 1a), regret minimization uniquely selects II (fig. 1b).
Figure 1. A coordination game (left) and the associated regret representation (right).
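Because figure 1's payoff values are not reproduced in the text, the following sketch uses a hypothetical coordination game with the same qualitative profile (acts indexed 0 = I, 1 = II):

```python
def regret_matrix(pi):
    # reg(a, a') = pi(a, a') - max_a'' pi(a'', a'): non-positive by construction
    n = len(pi)
    col_max = [max(pi[a][b] for a in range(n)) for b in range(n)]
    return [[pi[a][b] - col_max[b] for b in range(n)] for a in range(n)]

def maxmin_acts(U):
    # acts with maximal worst-case utility (maxmin under a maximally imprecise belief)
    worst = [min(row) for row in U]
    best = max(worst)
    return [a for a, w in enumerate(worst) if w == best]

# hypothetical symmetric coordination game: both (I, I) and (II, II) are equilibria
pi = [[2, 0],
      [0, 10]]
print(maxmin_acts(pi))                 # [0, 1]: classic maxmin is indifferent
print(maxmin_acts(regret_matrix(pi)))  # [1]: regret minimization uniquely selects II
```

The regret representation of this game is [[0, -10], [-2, 0]]: each column's best act has regret 0, so maxmin on regrets favors the act whose worst-case regret is smallest, here the payoff-dominant act II.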
5. Results
5.1. Simulation Results
Since for now we keep the decision rule fixed to maxmin, a player’s choice mechanism will only depend on the player’s subjective representation (u, Γ). For brevity, from now on we will refer to the pair (u, Γ), like (reg, imp) or (obj, prc), as the type of the player. Sometimes we will also distinguish types by referring to the subjective utility only, for instance (reg, imp) and (reg, prc) are regret types.
As observed earlier, meta-games factor in statistical properties of the environment. For particular empirical purposes, one could consult a specific class of games with an appropriate, maybe empirically informed probability $P_G$ in order to match the natural environment of a given population. For our present purposes, let $\mathcal{G}$ be the set of symmetric two-player fitness games with two acts, for a start. Each game $G \in \mathcal{G}$ is then individuated solely by its payoff function, that is, by a quadruple of numbers $(a, b, c, d)$. As for the occurrence probability $P_G(G)$ of game $G$, we imagine that the values $a, b, c, d$ are independent and identically distributed (i.i.d.) random variables sampled from the set $\{0, \dots, 10\}$ according to a uniform probability $P_V$. Using Monte Carlo simulations, we can then approximate the values of equation (1) to construct meta-game payoffs. Results based on 100,000 randomly sampled games are given in table 1.
Table 1. Average Evolutionary Fitness from Monte Carlo Simulations of 100,000 Symmetric 2 × 2 Games
| | (reg, imp) | (obj, imp) | (reg, prc) | (obj, prc) |
|---|---|---|---|---|
| (reg, imp) | 6.663 | 6.662 | 6.663 | 6.663 |
| (obj, imp) | 6.486 | 6.484 | 6.486 | 6.486 |
| (reg, prc) | 6.663 | 6.662 | 6.663 | 6.663 |
| (obj, prc) | 6.663 | 6.662 | 6.663 | 6.663 |
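The construction behind table 1 can be re-created along the following lines (my own sketch, not the authors' code; it uses the reduction of maxmin with a singleton uniform belief to the Laplace rule, and assumes that ties among selected acts are broken by averaging fitness over them):

```python
import random

def best_worst_acts(U):
    # maxmin with a maximally imprecise belief: acts maximizing worst-case utility
    worst = [min(row) for row in U]
    m = max(worst)
    return [a for a, w in enumerate(worst) if w == m]

def laplace_acts(U):
    # maxmin with a singleton uniform belief reduces to the Laplace rule
    avg = [sum(row) / len(row) for row in U]
    m = max(avg)
    return [a for a, v in enumerate(avg) if v == m]

def regret(U):
    # reg(a, a') = pi(a, a') - max_a'' pi(a'', a')
    n = len(U)
    col_max = [max(U[a][b] for a in range(n)) for b in range(n)]
    return [[U[a][b] - col_max[b] for b in range(n)] for a in range(n)]

def acts(utility, belief, U):
    M = regret(U) if utility == 'reg' else U
    return laplace_acts(M) if belief == 'prc' else best_worst_acts(M)

def fitness(type1, type2, n_games=20000, seed=0):
    # approximate F(type1, type2) by sampling 2x2 symmetric games with
    # i.i.d. payoffs from {0, ..., 10}
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_games):
        U = [[rng.randint(0, 10) for _ in range(2)] for _ in range(2)]
        A1, A2 = acts(*type1, U), acts(*type2, U)
        total += sum(U[a][b] for a in A1 for b in A2) / (len(A1) * len(A2))
    return total / n_games
```

Up to sampling noise, `fitness(('reg', 'imp'), ('obj', 'imp'))` comes out near the 6.66 of table 1, and the strict dominance of (reg, imp) over (obj, imp) in every column can be checked directly.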
Simulation results obviously reflect fact 2 in that all encounters in which types (reg, imp), (reg, prc), or (obj, prc) are substituted for one another yield identical results. More interestingly, table 1 shows that (obj, imp), the maxmin strategy, is strictly dominated by the three other types: in each column (i.e., for each type of coplayer), the maxmin strategy is strictly worse than any of the three competitors. This has a number of interesting consequences.
If we restrict attention to subjective representations with imprecise beliefs only, then a monomorphic state in which every agent has regret-based preferences is the only evolutionarily stable state. More strongly, since (obj, imp) is strictly dominated by (reg, imp), we expect selection that is driven by (expected) fitness to invariably weed out maxmin players (obj, imp) in favor of (reg, imp), regret minimization. In terms of choice rules, this means that regret minimization is evolutionarily better than maxmin over the class of games considered. In terms of subjective preferences, it shows that players using the objective representation that directly looks at fitness (possibly money, or profit) are outperformed by nonveridical (regret) representations, when players’ beliefs are imprecise.
Next, if we look at the competition between all four types represented in table 1, (reg, imp) is no longer evolutionarily stable. Given behavioral equivalence (fact 2), types (reg, imp), (reg, prc), and (obj, prc) are all neutrally stable (Maynard Smith 1982). But since (obj, imp) is strictly dominated and so disfavored by fitness-based selection, we are still drawn to conclude that maxmin behavior is weeded out in favor of a population with a random distribution of the remaining three types.
Simulation results of the (discrete-time) replicator dynamics (Taylor and Jonker 1978) indeed show that random initial population configurations are attracted to states with only three player types: (reg, imp), (reg, prc), and (obj, prc). The relative proportions of these depend on the initial shares in the population. This variability fully disappears if we add a small mutation rate to the dynamics. Take a fixed, small mutation rate ε for the probability that a player's subjective utility or her subjective belief changes to another utility or belief. The probability that a player's subjective representation randomly mutates into a completely different representation with altogether different utility and belief would then be ε². With these assumptions about "component-wise mutations," numerical simulations of the (discrete-time) replicator mutator dynamics (Nowak 2006) show that, already for very small mutation rates, almost all initial population states converge to a single fixed point in which the majority of players have regret-based utility. For instance, with a small ε, almost all initial populations are attracted to a final distribution with proportions:
| (reg, imp) | (obj, imp) | (reg, prc) | (obj, prc) |
|---|---|---|---|
| .289 | .021 | .398 | .289 |
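This attractor can be probed with a small sketch of the replicator mutator dynamics under component-wise mutation. The payoff matrix below is taken from table 1; the mutation rate ε = .01 is an illustrative choice of mine, not a value from the text:

```python
# types: 0 = (reg, imp), 1 = (obj, imp), 2 = (reg, prc), 3 = (obj, prc)
F = [[6.663, 6.662, 6.663, 6.663],   # meta-game payoffs from table 1
     [6.486, 6.484, 6.486, 6.486],
     [6.663, 6.662, 6.663, 6.663],
     [6.663, 6.662, 6.663, 6.663]]

def hamming(i, j):
    # number of components (utility: i % 2, belief: i // 2) in which types i, j differ
    return int(i % 2 != j % 2) + int(i // 2 != j // 2)

def mutator_step(F, x, eps):
    n = len(x)
    fit = [sum(F[i][j] * x[j] for j in range(n)) for i in range(n)]
    avg = sum(x[i] * fit[i] for i in range(n))
    # offspring of type j mutate component-wise into type i
    return [sum((1 - eps) ** (2 - hamming(i, j)) * eps ** hamming(i, j) * x[j] * fit[j]
                for j in range(n)) / avg
            for i in range(n)]

x = [0.25] * 4
for _ in range(5000):
    x = mutator_step(F, x, eps=0.01)
```

From a uniform start, the shares settle with (reg, prc) largest, (reg, imp) and (obj, prc) exactly equal (by the symmetry of the payoffs and the mutation structure), and (obj, imp) nearly extinct, matching the qualitative pattern of the reported distribution.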
What this suggests is that, if biological evolution selects behavior-generating mechanisms, not behavior as such, behaviorally equivalent mechanisms need not be treated equally. If mutation probabilities are a function of individual components, certain components of such behavior-generating mechanisms can be more strongly favored by a process of random mutation and selection. This is exactly the case with regret-based preferences. Since regret-based preferences do much better in combination with imprecise beliefs than veridical preferences do, the proportion of expected regret minimizers, (reg, prc), in the attracting state is substantially higher than that of expected utility maximizers, (obj, prc), even though these types are behaviorally equivalent.
5.2. Analytical Results
Results based on the single meta-game in table 1 are not fully general and possibly spoiled by random fluctuations in the sampling procedure. Fortunately, for the case of 2 × 2 symmetric games, the main result that maxmin types (obj, imp) are strictly dominated by regret minimizers can also be shown analytically under considerably general conditions.
Proposition 1 Let $\mathcal{G}$ be the class of $2 \times 2$ symmetric games generated by i.i.d. sampling of $a, b, c, d$ from a set of values with at least three elements in the support. Then, (reg, imp) strictly dominates (obj, imp) in the resulting meta-game.
Proof All proofs are in the appendix.
Corollary 1 Let $\mathcal{G}$ be as in proposition 1. If we only consider the imprecise belief types, (obj, imp) and (reg, imp), then the unique evolutionarily stable state is a monomorphic population of (reg, imp) players.
This result supports the main conceptual point that we wanted to make: objective preference representations are not necessarily favored by natural selection; objective preferences are outperformed by nonveridical regret preferences if agents have imprecise beliefs. It also tells us that the main conclusions drawn in the previous section on the basis of the approximated meta-game of table 1 hold more generally for arbitrary $2 \times 2$ symmetric games with i.i.d. sampled payoffs.
This result presupposes at least occasional imprecise beliefs. The assumed imprecise beliefs need not be maximally uncertain, however. Let the uncertainty held by a player be given by a convex compact set of probabilities over the coplayer's actions, $\Gamma_{[s,t]} = \{\mu \in \Delta(A) : s \le \mu(\mathrm{II}) \le t\}$, where $s$ is the lower probability and $t$ the upper probability of action II. We can then prove the following proposition, which is the analogue of proposition 1 for any possible (not necessarily maximal) degree of uncertainty $[s, t]$ with $0 \le s < t \le 1$. There is only one difference: we are now going to require i.i.d. draws of a continuous random variable with uniform distribution. This is due to the fact that, for arbitrarily small intervals $[s, t]$, objective players (obj, [s, t]) and regret players (reg, [s, t]) can behave as if holding a unique probability measure (a precise belief) if the underlying payoff space is not dense. The reason for this technical requirement will become clearer from the proof.
Proposition 2 Let $\mathcal{G}$ be the class of $2 \times 2$ symmetric games generated by i.i.d. draws of a continuous random variable with uniform distribution over an interval of values, and fix any imprecise belief $[s, t]$. Then the only evolutionarily stable state of a population with regret players (reg, [s, t]) and objective players (obj, [s, t]) is a monomorphic state of (reg, [s, t]) players.
This tells us that regret-based preferences can outperform objective preference representations when agents are also capable of learning or otherwise restricting their assumptions about the coplayer’s behavior as long as there is, at least on occasion, some imprecision in their beliefs. We will enlarge on the issue of belief formation after having covered some more relevant extensions in the next section.
6. Extensions
How do the basic results from the previous section carry over to richer models? Section 6.1 first introduces further conceptually interesting subjective representations that have been considered in the literature. Section 6.2 then addresses the case of symmetric two-player games with more than two acts. Finally, section 6.3 ends with a brief comparison to the case of solitary decision making.
6.1. More Preference Types
The space of possible preference types is enormous, and we have only compared regret and objective types so far. Let us now look at two other types of subjective preferences that have been investigated, especially in behavioral economics and in evolutionary game theory. A famous example is the altruistic preference (e.g., Becker 1976; Bester and Güth 1998), summoned to explain the possibility of altruistic behavior. At the other end of the spectrum lies the competitive preference. The two subjective utilities are defined as follows:
1. Altruistic utility: for all $a, a' \in A$, $\mathrm{alt}(a, a') = \pi(a, a') + \pi(a', a)$;
2. Competitive utility: for all $a, a' \in A$, $\mathrm{com}(a, a') = \pi(a, a') - \pi(a', a)$.
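Both utilities are simple transforms of the objective fitness matrix; a minimal sketch (the Prisoner's Dilemma payoffs used for illustration are my own, not from the text):

```python
def altruistic(pi):
    # alt(a, a') = pi(a, a') + pi(a', a): own fitness plus the coplayer's fitness
    n = len(pi)
    return [[pi[a][b] + pi[b][a] for b in range(n)] for a in range(n)]

def competitive(pi):
    # com(a, a') = pi(a, a') - pi(a', a): own fitness minus the coplayer's fitness
    n = len(pi)
    return [[pi[a][b] - pi[b][a] for b in range(n)] for a in range(n)]

# hypothetical Prisoner's Dilemma (rows/columns: cooperate, defect)
pd = [[3, 0],
      [5, 1]]
```

On this matrix the altruistic transform, [[6, 5], [5, 2]], makes mutual cooperation the uniquely best outcome, while the competitive transform, [[0, -5], [5, 0]], leaves defection dominant.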
Table 2 shows results of Monte Carlo simulations that approximate the expected fitness in the relevant meta-game with all the subjective representations considered so far. These results confirm basic intuitions about altruistic and competitive types: everybody would like to have an altruistic coplayer, and nobody would like to play against a competitive player. Perhaps more surprisingly, (alt, imp) turns out to be strictly dominated by (com, imp), while competitive types themselves do worse against every type except maxmin players (obj, imp) than any of the behaviorally equivalent types (reg, imp), (obj, prc), and (reg, prc) do. It is thus easy to see that the previous results still obtain for the larger meta-game in table 2: (reg, imp), (obj, prc), and (reg, prc) are still neutrally stable; simulation runs of the (discrete-time) replicator dynamics on the meta-game from table 2 end up in population states consisting of only these three types in variable proportions. In sum, the presence of other subjective representations, such as those based on altruistic or competitive utilities, does not undermine, but rather strengthens, our previous results.
Table 2. Average Evolutionary Fitness from Monte Carlo Simulations of 100,000 Symmetric 2 × 2 Games

| | (reg, imp) | (obj, imp) | (com, imp) | (alt, imp) | (reg, prc) | (obj, prc) | (com, prc) | (alt, prc) |
|---|---|---|---|---|---|---|---|---|
| (reg, imp) | 6.663 | 6.662 | 5.829 | 7.105 | 6.663 | 6.663 | 5.829 | 7.489 |
| (obj, imp) | 6.486 | 6.484 | 6.088 | 6.703 | 6.486 | 6.486 | 6.088 | 6.875 |
| (com, imp) | 6.323 | 6.758 | 5.496 | 6.977 | 6.323 | 6.323 | 5.496 | 7.149 |
| (alt, imp) | 5.949 | 5.722 | 5.326 | 6.396 | 5.949 | 5.949 | 5.326 | 6.568 |
| (reg, prc) | 6.663 | 6.662 | 5.829 | 7.105 | 6.663 | 6.663 | 5.829 | 7.489 |
| (obj, prc) | 6.663 | 6.662 | 5.829 | 7.105 | 6.663 | 6.663 | 5.829 | 7.489 |
| (com, prc) | 6.323 | 6.758 | 5.496 | 6.977 | 6.323 | 6.323 | 5.496 | 7.149 |
| (alt, prc) | 6.331 | 5.893 | 5.497 | 6.566 | 6.331 | 6.331 | 5.497 | 7.152 |
6.2. More Actions
Results from section 5 relied heavily on fact 2, which is no longer true when we look at arbitrary $n \times n$ games. Table 3 gives approximations of expected fitness in the class of $n \times n$ symmetric games. Concretely, the numbers in table 3 are averages of evolutionary payoffs obtained in 100,000 randomly sampled symmetric games, where each fitness game $G$ was sampled by first picking a number of acts $n \in \{2, \dots, 10\}$ uniformly at random and then filling the necessary $n \times n$ payoff matrix with i.i.d. sampled numbers, as before.
Table 3. Average Evolutionary Fitness for 100,000 Randomly Generated n × n Symmetric Games with n Randomly Drawn from {2, …, 10}

| | (reg, imp) | (obj, imp) | (com, imp) | (alt, imp) | (reg, prc) | (obj, prc) | (com, prc) | (alt, prc) |
|---|---|---|---|---|---|---|---|---|
| (reg, imp) | 6.567 | 6.570 | 5.650 | 6.992 | 6.564 | 6.564 | 5.593 | 7.409 |
| (obj, imp) | 6.476 | 6.483 | 5.896 | 6.818 | 6.484 | 6.484 | 5.850 | 7.124 |
| (com, imp) | 6.468 | 6.647 | 5.512 | 7.169 | 6.578 | 6.578 | 5.577 | 7.354 |
| (alt, imp) | 5.968 | 5.923 | 5.363 | 6.685 | 5.975 | 5.975 | 5.086 | 6.973 |
| (reg, prc) | 6.908 | 6.918 | 5.988 | 7.456 | 6.929 | 6.929 | 5.934 | 7.783 |
| (obj, prc) | 6.908 | 6.918 | 5.988 | 7.456 | 6.929 | 6.929 | 5.934 | 7.783 |
| (com, prc) | 6.529 | 6.680 | 5.445 | 7.276 | 6.542 | 6.542 | 5.521 | 7.440 |
| (alt, prc) | 6.450 | 6.337 | 5.772 | 6.978 | 6.457 | 6.457 | 5.479 | 7.500 |
The most important result is that the regret-minimizing type (reg, imp) is strictly dominated by (reg, prc) and by (obj, prc) in the meta-game from table 3. This means that while simple regret minimization can thrive in some evolutionary contexts, there are also contexts where it is demonstrably worse off. While this may be bad news for regret-minimizing types (reg, imp), it is not the case that regret types as such are weeded out by selection. Since, by fact 1, (reg, prc) and (obj, prc) are behaviorally equivalent in general, it remains true that selection based on meta-games constructed from these $n \times n$ games will still not eradicate regret preferences.
On the other hand, there are plenty of ways in which the basic insights from propositions 1 and 2 can make for situations in which evolution would favor regret types, even in $n \times n$ games. If, for example, the belief of a player is not a trait that biological evolution has any bite on, but rather something that the particular choice situation supplies exogenously (possibly because of the different amounts of information available in different choice situations), then regret-based preferences can again drive out veridical preferences altogether. For example, suppose that only preference representations compete and that agents' beliefs are exogenously given, in such a way that both players hold maximally imprecise beliefs with probability $p$ and precise (Bayesian) uniform beliefs otherwise. This transforms the meta-game from table 3 into a simpler $4 \times 4$ meta-game in which the payoff obtained by a subjective preference is the weighted average over the payoffs of the subjective representations including that preference in table 3. Fixing a small value of $p$ for illustration, we get the meta-game in table 4. The only evolutionarily stable state of this meta-game is again a monomorphic population of regret types. Accordingly, all our simulation runs of the (discrete-time) replicator dynamics converge to monomorphic regret-type populations. The reason why regret-based utilities prosper is that they have a substantial fitness advantage when paired with imprecise beliefs (propositions 1 and 2). If unmeasurable uncertainty is exogenously given as something that happens to agents because of the information available in some choice situations, and even if that happens only very infrequently (i.e., for rather low $p$), regret preferences will outperform objective preferences, as well as competitive and altruistic preferences.
Table 4. Meta-game for the Evolutionary Competition between Subjective Utilities When Beliefs Are Exogenously Given (see Main Text)

| | reg | obj | com | alt |
|---|---|---|---|---|
| reg | 6.926 | 6.926 | 5.942 | 7.757 |
| obj | 6.924 | 6.924 | 5.948 | 7.751 |
| com | 6.566 | 6.570 | 5.481 | 7.434 |
| alt | 6.463 | 6.461 | 5.478 | 7.469 |
6.3. Solitary Decisions
To see how different choice mechanisms fare in evolutionary competition based on solitary decision making, we approximated, much in the spirit of meta-games, the average accumulated fitness obtained in randomly generated solitary decision problems. For our purposes, a decision problem consists of a set of states of the world W, a set of acts A, and a payoff function $U \colon W \times A \to \mathbb{R}$. We generate arbitrary decision problems by selecting, uniformly at random, the numbers of states and acts, and then filling the payoff table, so to speak, with i.i.d. samples for each pair $\langle w, a \rangle \in W \times A$. Unlike with two-player games, we also need to sample the actual state of the world, which we selected uniformly at random from the states available in the current decision problem. Accordingly, the fitness of choice mechanism c in decision problem D is given by:

$$F(c, D) = \sum_{w \in W} P(w)\, U(w, c(D)),$$

with $P(w) = 1/|W|$ for all w. As subjective representations, we considered the original cast of four from table 1, since altruistic and competitive types are meaningless in solitary decision situations. As before, the relevant fitness measure, defined in equation (3), was approximated by Monte Carlo simulations, the results of which are given in table 5.
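The sampling procedure just described can be sketched as follows (illustrative code; the ranges for the numbers of states and acts, the payoff support, and first-index tie-breaking are assumptions for illustration, not the article's settings):

```python
# Monte Carlo approximation of expected fitness in random solitary
# decision problems. Illustrative sketch; parameter ranges are assumed.
import random

def choose(mechanism, payoffs):
    """Index of the act chosen by `mechanism` (ties broken by first index).

    payoffs[a][w] is the payoff of act a in state w.
    (obj, imp): maximin; (reg, imp): minimax regret;
    (reg, prc) and (obj, prc): expected payoff under precise uniform
    beliefs (behaviorally equivalent, cf. fact 1).
    """
    acts = range(len(payoffs))
    if mechanism == ("obj", "imp"):
        return max(acts, key=lambda a: min(payoffs[a]))
    if mechanism == ("reg", "imp"):
        n_w = len(payoffs[0])
        best = [max(payoffs[a][w] for a in acts) for w in range(n_w)]
        return min(acts, key=lambda a: max(best[w] - payoffs[a][w]
                                           for w in range(n_w)))
    return max(acts, key=lambda a: sum(payoffs[a]))

def simulate(n_runs=100_000, seed=0):
    rng = random.Random(seed)
    mechanisms = [("reg", "imp"), ("obj", "imp"), ("reg", "prc"), ("obj", "prc")]
    totals = {m: 0.0 for m in mechanisms}
    for _ in range(n_runs):
        n_w, n_a = rng.randint(2, 5), rng.randint(2, 5)
        payoffs = [[rng.uniform(0, 10) for _ in range(n_w)] for _ in range(n_a)]
        w_star = rng.randrange(n_w)          # actual state, drawn uniformly
        for m in mechanisms:
            totals[m] += payoffs[choose(m, payoffs)][w_star]
    return {m: totals[m] / n_runs for m in mechanisms}
```

Since (reg, prc) and (obj, prc) take the same branch of `choose`, their approximated fitness values coincide exactly, mirroring fact 1.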

Facts 1 and 2 still apply: (reg, prc) and (obj, prc) are behaviorally equivalent in general, and (reg, imp) is behaviorally equivalent to the former two in decision problems with two states and two acts. This shows in the results of table 5: the averages for (reg, prc) and (obj, prc) are identical. But since we also included decision problems with more acts and more states, the average for regret minimizers (reg, imp) is not identical to that of (reg, prc) and (obj, prc). It is, in fact, lower, but again not as low as that of (obj, imp).
Table 5. Expected Fitness of Choice Mechanisms Approximated from 100,000 Simulated Solitary Decision Problems (see Main Text)

(reg, imp) | (obj, imp) | (reg, prc) | (obj, prc) |
---|---|---|---|
6.318 | 6.237 | 6.661 | 6.661 |
This means that every relevant result we have seen about game situations is also borne out for solitary decisions. Evolutionary selection based on objective fitness will not select against regret preferences, as these are indistinguishable from veridical preferences when paired with precise beliefs. But when paired with imprecise beliefs, regret-based utilities outperform objective utilities. Consequently, if there is a chance, however small, that agents fall back on imprecise beliefs, evolution will actually positively select for nonveridical regret-based preferences.
6.4. Sophisticated Beliefs
Since one of our main purposes was to illustrate the usefulness of a meta-game approach by the case study of objective and regret preferences, we have partially neglected an important and interesting issue, namely, the evolution of ways of forming beliefs about coplayers' behavior or the actual state of the world. For reasons of space we must, unfortunately, leave a deeper exploration of belief type evolution to another occasion. Two remarks are in order nonetheless. First, belief type evolution can be studied without conceptual hurdles in the meta-game framework, so there is no principled argument against the main methodological contribution of this article. Second, our results regarding the comparison between regret and objective types remain informative even if we allow agents to learn or reason strategically.Footnote 8 This is because we know from fact 1 that regret and objective preferences turn out to be behaviorally equivalent when paired with precise probabilistic beliefs (given an identical decision rule). This holds no matter what the content of that belief is. So, if learning, reasoning, or statistical knowledge about a recurrent situation can be brought to bear, this will not make evolution select against regret-based preferences. If, on the other hand, agents resort to imprecise beliefs at least occasionally (e.g., when they are unaware of the coplayer or her utilities, or when strategic reasoning cannot reduce all uncertainty about the coplayer's choice), then regret-based preferences can be favored by natural selection over objective preferences.
7. Conclusion
The assumption that players and decision makers maximize their (subjective) utility is central throughout the economic literature, and the maximization of actual (objective) payoffs is often justified by appeal to evolutionary arguments and natural selection. In contrast to this standard view, we showed that there are player types whose subjective utilities differ from the actual evolutionary payoffs and that nevertheless outperform types whose subjective utilities coincide with the evolutionary payoffs. The claim is not that regret preferences are the best on the market, but rather that utilities that perfectly mirror evolutionary fitness can be outclassed by subjective utilities that differ from objective fitness. While the literature on the evolution of preferences has focused on fixed games, we have adopted a more general approach here. We suggested that attention to "meta-games" is crucial, because what may be a good subjective representation in one type of game (e.g., cooperative preferences in the Prisoner's Dilemma) need not be generally beneficial.
Appendix: Proofs

The proof of proposition 1 relies on a partition of the class of symmetric $2 \times 2$ fitness games, and on some lemmas. For brevity, let us denote the regret minimizer (reg, imp) by R and the maximinimizer (obj, imp) by M. Following equation (1), let $F_{\mathcal{X}}(X, Y)$ denote the expected payoff of choice mechanism X against choice mechanism Y on the possibly restricted class of fitness games $\mathcal{X}$.
Proof of Proposition 1
By definition of strict dominance, we have to show that in the class of symmetric $2 \times 2$ games with payoffs sampled from a set of i.i.d. values with at least three elements in the support, it holds that:

(i) $F(R, R) > F(M, R)$;

(ii) $F(R, M) > F(M, M)$.

To show this we use the following partition of the class of symmetric $2 \times 2$ games, based on payoffs parametrized as follows:

 | I | II |
---|---|---|
I | a | b |
II | c | d |

1. Coordination games $\mathcal{C}$: $a > c$ and $d > b$;
2. Anticoordination games $\mathcal{A}$: $a < c$ and $d < b$;
3. Strong dominance games $\mathcal{S}$: either ($a > c$ and $d < b$) or ($a < c$ and $d > b$);
4. Weak dominance games $\mathcal{W}$: either $a = c$ or $d = b$, but not both;
5. Boring games $\mathcal{B}$: $a = c$ and $d = b$.
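As a sanity check, these five classes partition the space of $2 \times 2$ payoff profiles, since the classification depends only on how a compares to c and how d compares to b. A small sketch (our code; the inequality conditions are the standard definitions of the five classes):

```python
# Classify a symmetric 2x2 game with payoffs (a, b, c, d) into the five
# classes. Illustrative sketch; function and label names are ours.
from itertools import product

def classify(a, b, c, d):
    if a > c and d > b:
        return "coordination"
    if a < c and d < b:
        return "anticoordination"
    if (a > c and d < b) or (a < c and d > b):
        return "strong dominance"
    if (a == c) != (d == b):
        return "weak dominance"
    return "boring"  # a == c and d == b

# Every payoff profile over a three-element support falls into exactly
# one class, so the five classes form a partition.
counts = {}
for game in product((0, 1, 2), repeat=4):
    label = classify(*game)
    counts[label] = counts.get(label, 0) + 1
```

Because the branches of `classify` are mutually exclusive and jointly exhaustive over the signs of $a - c$ and $d - b$, every sampled game receives exactly one label.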
Before proving the lemmas, it is convenient to fix some notation. Let us call x, y, z the three elements in the support, and without loss of generality suppose that $x < y < z$. We denote by C a coordination game in $\mathcal{C}$ with payoffs $a_C$, $b_C$, $c_C$, and $d_C$; similarly for games $A \in \mathcal{A}$, $S \in \mathcal{S}$, $W \in \mathcal{W}$, and $B \in \mathcal{B}$. Let us denote by $I_{RC}$ the event that an R-player plays action I in the game C; and similarly for action II, for player M, and for games A, S, W, and B. We first consider the case of i.i.d. sampling with finite support.
Lemma 1 R and M perform equally well in $\mathcal{S}$ and in $\mathcal{B}$.

Proof By definition of regret minimization and maxmin it is easy to check that whenever a game has a strongly dominant action $a^*$, then $a^*$ is both the maxmin action and the regret-minimizing action. Then, for all the games in $\mathcal{S}$, R chooses action a if and only if M chooses action a. Consequently, R and M always perform equally (well) in $\mathcal{S}$. In the case of $\mathcal{B}$ it is trivial to see that all the players perform equally. QED
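The observation behind lemma 1 can be verified by brute-force enumeration: in every $2 \times 2$ game with a strongly dominant action, that action is both the unique maxmin act and the unique regret-minimizing act. An illustrative sketch (our code, over an assumed three-element support):

```python
# Brute-force check: whenever one action strongly dominates the other in
# a 2x2 game, it is both the unique maxmin act and the unique
# regret-minimizing act. Illustrative sketch.
from itertools import product

def maxmin_acts(payoffs):
    mins = [min(row) for row in payoffs]
    return {i for i, m in enumerate(mins) if m == max(mins)}

def regret_acts(payoffs):
    n_w = len(payoffs[0])
    col_max = [max(row[w] for row in payoffs) for w in range(n_w)]
    regrets = [max(col_max[w] - row[w] for w in range(n_w)) for row in payoffs]
    return {i for i, r in enumerate(regrets) if r == min(regrets)}

def check_strong_dominance(support=(0, 1, 2)):
    for a, b, c, d in product(support, repeat=4):
        payoffs = [[a, b], [c, d]]
        if a > c and b > d:      # action I strongly dominates
            assert maxmin_acts(payoffs) == regret_acts(payoffs) == {0}
        elif a < c and b < d:    # action II strongly dominates
            assert maxmin_acts(payoffs) == regret_acts(payoffs) == {1}
    return True
```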
Lemma 2 In $\mathcal{W}$, R strictly dominates M.

Proof Assume without loss of generality that $d = b$ and that $a > c$, so that action I weakly dominates action II. There are two cases that we have to check: (i) $b > c$ and (ii) $b \le c$. In the first case it is easy to see that R and M perform equally: act I is the choice of both R and M. In case (ii) instead, we have that I is the regret-minimizing action, whereas both actions have the same minimal payoff, and M plays $.5 I \oplus .5 II$, since both I and II maximize the minimal payoff. Consider now a population of R and M playing games from the class $\mathcal{W}$. Whenever (i) is the case, R and M perform equally well. But suppose (ii) is the case. Then, against a coplayer who plays I, R obtains $a$, whereas M obtains only $(a + c)/2 < a$; and against a coplayer who plays $.5 I \oplus .5 II$, R obtains $(a + b)/2$, whereas M obtains $(a + b + c + d)/4 < (a + b)/2$. Hence, we have that in general $F_{\mathcal{W}}(R, R) > F_{\mathcal{W}}(M, R)$, and $F_{\mathcal{W}}(R, M) > F_{\mathcal{W}}(M, M)$. QED
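Lemma 2 can likewise be checked numerically: over all weak dominance games on a three-element support, the regret minimizer earns at least as much as the maximinimizer against either coplayer, and strictly more on aggregate. An illustrative sketch (our code; indifferent players are modeled as randomizing uniformly over their optimal acts):

```python
# Numerical check of lemma 2 on weak dominance games. Illustrative
# sketch; mixed choices are uniform over optimal acts.
from itertools import product

def maxmin_mix(payoffs):
    mins = [min(row) for row in payoffs]
    opt = [i for i, m in enumerate(mins) if m == max(mins)]
    return [1 / len(opt) if i in opt else 0.0 for i in range(len(payoffs))]

def regret_mix(payoffs):
    col_max = [max(row[w] for row in payoffs) for w in range(len(payoffs[0]))]
    regrets = [max(col_max[w] - row[w] for w in range(len(row))) for row in payoffs]
    opt = [i for i, r in enumerate(regrets) if r == min(regrets)]
    return [1 / len(opt) if i in opt else 0.0 for i in range(len(payoffs))]

def expected(x, payoffs, y):
    """Expected payoff of a row player mixing x against a coplayer mixing y."""
    return sum(x[i] * payoffs[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

def lemma2_check(support=(0, 1, 2)):
    f_rr = f_mr = f_rm = f_mm = 0.0
    for a, b, c, d in product(support, repeat=4):
        if (a == c) != (d == b):              # weak dominance game
            pay = [[a, b], [c, d]]
            r, m = regret_mix(pay), maxmin_mix(pay)
            f_rr += expected(r, pay, r)
            f_mr += expected(m, pay, r)
            f_rm += expected(r, pay, m)
            f_mm += expected(m, pay, m)
    return f_rr > f_mr and f_rm > f_mm
```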
Since it is not difficult to see that both (R, R) and (M, M) are strict Nash equilibria in $\mathcal{C}$, and that (R, R) and (M, M) are not Nash equilibria in $\mathcal{A}$, the main part of the proof will be to show that R strictly dominates M in the class $\mathcal{C} \cup \mathcal{A}$, that is:

(i′) $F_{\mathcal{C} \cup \mathcal{A}}(R, R) > F_{\mathcal{C} \cup \mathcal{A}}(M, R)$;

(ii′) $F_{\mathcal{C} \cup \mathcal{A}}(R, M) > F_{\mathcal{C} \cup \mathcal{A}}(M, M)$.

This part requires some more lemmas, but first we introduce the following bijective function ɸ between coordination and anticoordination games.

Definition 3 (ɸ) The permutation $(a, b, c, d) \mapsto (c, d, a, b)$ defines a bijective function $\phi \colon \mathcal{C} \to \mathcal{A}$ that, for each coordination game $C \in \mathcal{C}$ with payoffs $(a_C, b_C, c_C, d_C)$, gives the anticoordination game $\phi(C) \in \mathcal{A}$ with payoffs $(c_C, d_C, a_C, b_C)$. Essentially, ɸ swaps rows in the payoff matrix.
Lemma 4 Occurrence probability of C equals that of ɸ(C): $P(C) = P(\phi(C))$.

Proof By definition, each game $C \in \mathcal{C}$ is such that $a_C > c_C$ and $d_C > b_C$, and each game $\phi(C) \in \mathcal{A}$ is such that $c_{\phi(C)} > a_{\phi(C)}$ and $b_{\phi(C)} > d_{\phi(C)}$. Given that a, b, c, d are i.i.d. random variables and that a sequence of i.i.d. random variables is exchangeable, it is clear that the probability of $(a_C, b_C, c_C, d_C)$ equals the probability of $(c_C, d_C, a_C, b_C)$. Hence, $P(C) = P(\phi(C))$. QED
Lemma 5 Let P(E) be the probability of event E; for example, $P(I_{RC})$ is the probability that a random R-player plays act I in coordination game C, which is either 0, .5, or 1. It then holds that:

• $P(I_{RC}) = P(II_{R\phi(C)})$, and $P(II_{RC}) = P(I_{R\phi(C)})$;
• $P(I_{MC}) = P(II_{M\phi(C)})$, and $P(II_{MC}) = P(I_{M\phi(C)})$.
Proof It is easy to check that if $a - c > d - b$, an R-player plays action I in C; that if $a - c < d - b$, R plays II; and that if $a - c = d - b$, an R-player is indifferent between I and II in C, and so randomizes with $.5 I \oplus .5 II$. Similarly, if $b - d > c - a$, an R-player plays action I in A; if $b - d < c - a$, R plays II; and if $b - d = c - a$, an R-player is indifferent between I and II in A, and randomizes with $.5 I \oplus .5 II$. Consequently, if $P(I_{RC}) = 1$, then $a - c > d - b$, and by definition of ɸ we have $P(II_{R\phi(C)}) = 1$. Likewise, if $P(II_{RC}) = 1$, then $P(I_{R\phi(C)}) = 1$; and if $P(I_{RC}) = .5$, then $P(I_{R\phi(C)}) = P(II_{R\phi(C)}) = .5$.

In the same way, in coordination games we have that if $b > c$, an M-player plays I; if $b < c$, an M-player plays II; and if $b = c$, M is indifferent between I and II and plays $.5 I \oplus .5 II$. In anticoordination games instead, if $a > d$, M plays I; if $a < d$, M plays II; if $a = d$, M plays $.5 I \oplus .5 II$. By definition of ɸ: $P(II_{M\phi(C)}) = 1$ if $P(I_{MC}) = 1$; $P(I_{M\phi(C)}) = 1$ if $P(II_{MC}) = 1$; and $P(I_{M\phi(C)}) = P(II_{M\phi(C)}) = .5$ if $P(I_{MC}) = .5$. QED
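The symmetry exploited in lemma 5 is in fact fully general: swapping the rows of any $2 \times 2$ payoff matrix, as ɸ does, swaps the probabilities with which both R and M choose actions I and II. An illustrative sketch (our code; indifferent players randomize uniformly):

```python
# Check of the behavioral symmetry behind lemma 5: swapping the rows of
# the payoff matrix (as phi does) swaps the choice probabilities of
# actions I and II, for both the regret minimizer and the maximinimizer.
from itertools import product

def maxmin_mix(payoffs):
    # uniform randomization over the acts maximizing the minimal payoff
    mins = [min(row) for row in payoffs]
    opt = [i for i, m in enumerate(mins) if m == max(mins)]
    return [1 / len(opt) if i in opt else 0.0 for i in range(len(payoffs))]

def regret_mix(payoffs):
    # uniform randomization over the acts minimizing maximal regret
    col_max = [max(row[w] for row in payoffs) for w in range(len(payoffs[0]))]
    regrets = [max(col_max[w] - row[w] for w in range(len(row))) for row in payoffs]
    opt = [i for i, r in enumerate(regrets) if r == min(regrets)]
    return [1 / len(opt) if i in opt else 0.0 for i in range(len(payoffs))]

def lemma5_check(support=(0, 1, 2)):
    for a, b, c, d in product(support, repeat=4):
        for mix in (regret_mix, maxmin_mix):
            p = mix([[a, b], [c, d]])
            q = mix([[c, d], [a, b]])   # phi: swap the rows
            assert p[0] == q[1] and p[1] == q[0]
    return True
```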
Lemma 6 For any coordination game C with $a \ge d$, it holds that:

• $P(I_{RC}) \ge P(I_{MC})$;
• $P(II_{RC}) \le P(II_{MC})$;
• if $a = d$, then $P(I_{RC}) = P(I_{MC})$ and $P(II_{RC}) = P(II_{MC})$.

Proof The event $I_{RC}$, that R plays action I in C, has positive probability exactly if $a - c \ge d - b$: if $a - c > d - b$, R plays I, and if $a - c = d - b$, R plays $.5 I \oplus .5 II$. Similarly, the event $I_{MC}$ has positive probability exactly if $b \ge c$: if $b > c$, M plays I, and if $b = c$, M plays $.5 I \oplus .5 II$. Then, $I_{RC}$ implies that $a - c \ge d - b$, and $I_{MC}$ implies that $b \ge c$. Moreover, on the assumption that $a \ge d$, it is easy to check that $b \ge c$ implies $a - c \ge d - b$. Hence, in any C with $a \ge d$ it holds that $I_{MC}$ implies $I_{RC}$, that is, $P(I_{RC}) \ge P(I_{MC})$. Instead, it is possible that $b < c$, $a - c > d - b$, and $a \ge d$ hold simultaneously, so that $P(I_{RC}) = 1 > 0 = P(I_{MC})$. By a symmetric argument it can be shown that $P(II_{RC}) \le P(II_{MC})$ too. Finally, when $a = d$ it holds that: $a - c > d - b$ iff $b > c$; $a - c < d - b$ iff $b < c$; and $a - c = d - b$ iff $b = c$. Hence, $P(I_{RC}) = P(I_{MC})$ and $P(II_{RC}) = P(II_{MC})$. QED
We are now ready to prove that . With notation like
denoting the probability that a random R-player plays I and another R-player plays I as well in game C, rewrite the inequality as:

By lemma 4 and lemma 5, we can express everything in terms of C only:

This simplifies to:

Now let us split $\mathcal{C}$ into the subclass where $a \ge d$ and the subclass where $a < d$, and consider the former first. Notice that, by lemma 6, the case $a = d$ is irrelevant for discriminating between R and M. If $a > d$, by lemma 6 we can eliminate the cases where R plays II and M plays I:

We now distinguish between two cases: (1) and (2)
. Notice that
if and only if case (1) obtains, and that
and (1) imply IIMC. Then, from (1) we have:Footnote 9

Since we have assumed , the last inequality is not satisfied. We have instead:

This means that where and where (1) is the case, R and M are equally fit. This changes when we turn to (2). In that case, since
by lemma 6, we have that
. Moreover, when
,
implies
(see lemma 6). Consequently, when M plays either I or
, R always plays I. Hence, whenever
and (2) obtain, it also holds that
. In this case we can simplify to:

We know that IRC implies that . Since we have assumed that
, we have that
. Hence, the inequality

is satisfied. So, when , R strictly dominates M. Symmetrically, from
and by distinguishing between the two cases (1) and (2) as before, in the end we get:
(1)
; and
(2)
.
Hence, we can conclude that R strictly dominates M in the class .
It remains to be shown that . As before, spell this out as:

Similarly to the above derivation, we consider first, and we now distinguish between (1)
, (2)
, and (3)
. Notice that either (1) or (2), together with
, implies IRC. Then we obtain:Footnote 10
(1)
(2)
;
(3)
.
When , the derivation proceeds symmetrically and we get:
(1)
(2)
;
(3)
.
Finally, we can conclude that .
When we have i.i.d. sampling with continuous support, games in $\mathcal{W}$ and $\mathcal{B}$ never occur, since ties among payoffs have probability zero, and the proof for the remaining cases reduces to the proof of proposition 2 for $\mathcal{C}$ and $\mathcal{A}$.
Proof of Proposition 2
As shown in figure A1a, given a game (a, b, c, d), action I corresponds to the line , while action II corresponds to the line
. The slope of action I is then (
), and the slope of action II is (
). Action I is steeper than action II if
, and action II is steeper than action I if the reverse of the last inequality holds. Given a belief
, let us define

Next, type R is indifferent between the two acts if

and prefers I over II if

For succinctness, let us abbreviate . Whenever
and
, it is the case that
, so that M prefers I over II. Indeed, when
, we have that
. When we enlarge the interval Γ by moving s to the left of
and t to the right of
by the same extent, such that
, we get that
, since a′ moved by

while d′ moved by

Consequently, in a game where action II is steeper than action I, the only possible way in which the two types can differ is M playing I and R playing II, since we will never observe M playing II and R playing I in such games. By a similar argument, when action I is steeper than action II, the only possible way in which the two types can differ is M playing II and R playing I. Hence, whenever one of the actions is steeper than the other, the two types can differ in only one of the two possible ways. Finally, when action I and action II are equally steep, one type strictly prefers an action if and only if the other type does too. We say that a game is relevant if the two types play different actions.

Figure A1. Examples of coordination game C and corresponding anticoordination game ψ(C).
Since the two cases are symmetric, consider the case . Suppose that c and d have been drawn such that
. For
to hold, the game has to be a coordination game, otherwise we would have
and
, so
.
If the game is a coordination game, the two types choose differently if and only if

and

Consider all the points . Each of these points can be expressed as a linear combination:

for . By simple algebra, for each point in
and
, the point

expresses the expected value of action II for

that is, it is the y-value of the line corresponding to action II when

Then, given a two-dimensional point $(x_0, y_0)$, the sheaf of lines passing through that point is defined by all the equations

for . (The vertical line
is excluded from the sheaf, but it is not relevant for the proof.) Consequently, given the two-dimensional point

the sheaf of lines passing through that point is defined by the set of equations, for :

If, for each equation in the set, we define

then each equation corresponds to a possible game
 | I | II |
---|---|---|
I | $a^\diamond$ | $b^\diamond$ |
II | c | d |
such that

By algebraic computations, the condition is equivalent to

Moreover, among the coordination games such that , the relevant ones are those that also satisfy
, otherwise type M would not (strictly) prefer I over II. If we rewrite a′, b′, c′, d′ as in equation (4), then the inequality
reduces to

By symmetric arguments, whenever , the only possible relevant games for the same interval [s, t] are anticoordination games for which

and such that , and that
. Similarly, for
, these correspond to all games

such that

and

Consider now the following bijective function between coordination and anticoordination games, that, for
, associates the coordination game

with the anticoordination game

Essentially, ψ changes c to d, m to −m, and k to . In particular, note that ψ is a bijection that, for a fixed interval [s, t], sends relevant coordination games to relevant anticoordination games. Figure A1 gives a graphical example of the bijection.
We can then pair these two games and consider the average fitness in {C, ψ(C)} of (reg, [s, t]) against (reg, [s, t]), and then compare it to the fitness of (obj, [s, t]) against (reg, [s, t]); as before, we denote (reg, [s, t]) by R and (obj, [s, t]) by M henceforth. In the pair of relevant games C and ψ(C), R strictly dominates M if

and

Consider the first inequality. Since both C and ψ(C) are relevant with respect to the interval [s, t], it implies that , and
. Therefore, the first inequality is equivalent to

which can be spelled out as

After some computations, the previous inequality boils down to

which we know is the case, since we have seen that the condition is equivalent to
.
Finally, from the previous argument it follows that, for any given interval [s, t], if we consider the set of all relevant coordination games and the set of all relevant anticoordination games, then it holds that

Let us now check that the second inequality for R to strictly dominate M also holds. In {C, ψ(C)}, that is equivalent to

which amounts to . As before, it then follows that

Therefore, R strictly dominates M.