1. INTRODUCTION
Consider a sequence of outcomes of descending value, A > B > C > . . . > Z. The continuity axiom of Expected Utility Theory implies, roughly, that for any certain outcome, there will be a lottery with a probability p of a gain and a probability (1−p) of a loss, which is at least as good, however small the gain and however great the loss, if p is sufficiently large. According to Larry Temkin (Reference Temkin, Egonsson, Josefsson, Petersson and Rønnow-Rasmusen2001), many people – including Temkin himself – are inclined to deny the continuity axiom in certain ‘extreme’ cases, i.e. cases of triplets of outcomes A, B and Z, where A and B differ little in value, but B and Z differ greatly. But, Temkin argues, contrary to what many people think, rejection of the axiom of continuity in such cases implies ‘deep and irresolvable difficulties for expected utility theory’ (Reference Temkin, Egonsson, Josefsson, Petersson and Rønnow-Rasmusen2001, p. 95). More precisely, he attempts to show that ‘if one rejects the principle of continuity in some ‘extreme’ cases, then one must reject continuity even in the ‘easy’ cases for which it seems most plausible [i.e. the cases where the loss is small], reject the axiom of transitivity, or reject the principle of substitution of equivalence’ (Reference Temkin, Egonsson, Josefsson, Petersson and Rønnow-Rasmusen2001, p. 107, Temkin's emphasis). For if we assume continuity for ‘easy’ cases, we can derive continuity for the ‘extreme’ case by applying the axiom of substitution and the axiom of transitivity: the rejection of continuity for ‘extreme’ cases therefore renders the triad of continuity in ‘easy’ cases, the axiom of substitution and the axiom of transitivity inconsistent.
There are several problems with Temkin's argument for this alleged inconsistency. First, as Gustaf Arrhenius and Wlodek Rabinowicz (2005) have pointed out, Temkin's proof of the inconsistency is itself inconsistent, because it implicitly assumes the continuity axiom and thus already assumes what is to be proved. They prove an alternative theorem, in which continuity for ‘easy’ cases is replaced by continuity for triplets of adjacent outcomes in a sequence of descending value A > B > C > . . . > Z; together with transitivity and the substitution axiom, this implies that some compound lottery on the outcomes in the sequence that involves a risk of the catastrophic outcome Z is equally as good as B. Hence, they do not strictly prove the inconsistency Temkin is after. However, I shall show that Arrhenius and Rabinowicz's result can in fact be strengthened so that continuity for triplets of adjacent outcomes in the sequence, together with transitivity and the substitution axiom, implies continuity for the extreme case.
Secondly, as for the implications, Arrhenius and Rabinowicz rightly point out that the substitution axiom is not a matter of logic, as Temkin suggests, but rather a substantive axiom that is not entirely uncontroversial. More importantly, Temkin conducts his discussion using the terms ‘easy’ case and ‘extreme’ case, which refer to what he calls the ‘significance’ of the value difference between outcomes; but the theorem is stated purely in terms of a betterness relation defined on lotteries over a sequence of basic outcomes of descending value – it says nothing about the value difference between these outcomes.
Hence, it can only be concluded that, given the axiom of substitution and the axiom of transitivity, rejection of continuity for the triplet A, B and Z is inconsistent with continuity for triplets of adjacent outcomes in the sequence. If this result is applied to a sequence where each triplet of adjacent outcomes is an ‘easy’ case, while at the same time the choice between a lottery between A and Z and B with certainty is an ‘extreme’ case, it seems that Temkin will have what he is after: the rejection of continuity for some ‘extreme’ cases leads to a trilemma in which either continuity for ‘easy’ cases, transitivity or the substitution axiom has to be rejected.
However, I shall argue that Temkin's trilemma never gets off the ground. This is because Temkin appears to appeal to two mutually inconsistent conceptions of the aggregation of value. Once these are clearly separated, it will transpire, in connection with each of them, that one of the principles to be rejected does not appear plausible. Hence, there is nothing surprising or challenging about the result; it is merely a corollary to Expected Utility Theory.
2. THE FRAMEWORK
Let ‘≥’ denote ‘– is at least as good (all things considered) as –’,Footnote 1 defined on the set of all probability distributions on a finite sequence of outcomes A > B > C > . . . > Z, where ‘>’ (– is better than –) is defined in the usual way.Footnote 2 The probability distribution (or lottery) that gives A with probability p and B with probability (1−p) is denoted by (A, p, B). More generally, a probability distribution on the sequence of outcomes assigns a probability greater than 0 to each outcome in a subset of the sequence, where these probabilities sum up to 1. I shall use italicized capital letters to denote probability distributions (or lotteries) in general (be they certain outcomes or simple or compound lotteries) and non-italicized capital letters to denote members of the sequence of basic outcomes when the focus is solely on outcomes.
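To fix ideas, here is a minimal sketch in Python of how lotteries over such a finite outcome sequence can be represented, including the reduction of a compound lottery to a simple one by the laws of probability (the reduction assumption mentioned below). The outcome labels and probabilities are illustrative only.

```python
from fractions import Fraction

# A lottery is a list of (probability, prize) pairs, where a prize is either a
# basic outcome (a string) or another lottery; the probabilities sum to 1.

def reduce_lottery(lottery):
    """Reduce a (possibly compound) lottery to a simple probability
    distribution over basic outcomes, by the laws of probability."""
    dist = {}
    for prob, prize in lottery:
        if isinstance(prize, list):                    # the prize is itself a lottery
            for outcome, q in reduce_lottery(prize).items():
                dist[outcome] = dist.get(outcome, Fraction(0)) + prob * q
        else:                                          # the prize is a basic outcome
            dist[prize] = dist.get(prize, Fraction(0)) + prob
    return dist

def binary(a, p, b):
    """The lottery (a, p, b): a with probability p, b with probability 1 - p."""
    p = Fraction(p)
    return [(p, a), (1 - p, b)]

# Example: (A, 9/10, (B, 1/2, Z)) reduces to A: 9/10, B: 1/20, Z: 1/20.
print(reduce_lottery(binary("A", Fraction(9, 10), binary("B", Fraction(1, 2), "Z"))))
```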
The axiom of transitivity of Expected Utility Theory says:
(T) For all lotteries A, B, C: if A ≥ B and B ≥ C, then A ≥ C.
The axiom of substitution says:Footnote 3
(S) For all lotteries A, B and C: if A ≥ B and p is a probability from the open interval ]0;1[, then (A, p, C) ≥ (B, p, C).
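To make the mixing operation in (S) concrete, here is a small check under an illustrative expected-utility valuation (which anticipates the representation result stated below and satisfies the axioms by construction); the snippet only exhibits what (S) asserts, namely that mixing both sides with the same lottery C at the same probability p preserves the ranking. The outcomes and utility numbers are my own illustrative choices.

```python
import random

# Illustrative expected-utility valuation over three basic outcomes.
u = {"A": 3.0, "B": 2.0, "C": 1.0}

def eu(lottery):
    """Expected utility of a lottery given as {outcome: probability}."""
    return sum(prob * u[outcome] for outcome, prob in lottery.items())

def mix(X, p, Y):
    """The compound lottery (X, p, Y): X with probability p, Y with 1 - p."""
    out = {}
    for outcome, prob in X.items():
        out[outcome] = out.get(outcome, 0.0) + p * prob
    for outcome, prob in Y.items():
        out[outcome] = out.get(outcome, 0.0) + (1 - p) * prob
    return out

def random_lottery():
    """A random probability distribution over the three basic outcomes."""
    weights = [random.random() for _ in u]
    total = sum(weights)
    return {outcome: w / total for outcome, w in zip(u, weights)}

random.seed(0)
for _ in range(1000):
    A, B, C = random_lottery(), random_lottery(), random_lottery()
    p = random.random()
    if eu(A) >= eu(B):                               # antecedent of (S)
        assert eu(mix(A, p, C)) >= eu(mix(B, p, C))  # consequent of (S)
print("(S) confirmed on all sampled cases under the illustrative valuation")
```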
Temkin does not mention the axiom of completeness:
(C) For any two lotteries A and B, either A ≥ B or B ≥ A,
which is also a necessary axiom of Expected Utility Theory; however, the result I shall present below (Observation 2) does not depend on this axiom.Footnote 4 Another assumption Temkin implicitly relies upon is the axiom of reduction, which states that if a compound lottery is reduced to a simple lottery by the laws of probability, these two lotteries are equally good.Footnote 5 Finally, the continuity axiom says:
For any three lotteries A > B > C, there exist probabilities p and q, both from the open interval ]0;1[, such that (A, p, C) ≥ B ≥ (A, q, C).
In this paper, I shall only be concerned with the component stating (A, p, C) ≥ B.
Suppose the probability distribution X assigns the probabilities pX(A) to outcome A, pX(B) to B, and so on; and that the probability distribution Y assigns probabilities pY(A) to outcome A, pY(B) to B, and so on – where some of these probabilities can be 0. From the axioms stated above follows the conclusion of Expected Utility Theory:
There exists a real-valued function u defined on the sequence of outcomes, such that
(1) X ≥ Y if and only if pX(A)u(A) + pX(B)u(B) + . . . + pX(Z)u(Z) ≥ pY(A)u(A) + pY(B)u(B) + . . . + pY(Z)u(Z)
(2) Moreover, a function v is another representation of ≥ in this sense if and only if there are constants a > 0 and b such that v = au + b (i.e. v is a linear transform of u).
In other words, given the axioms, it is possible to assign real numbers (utilities) to each of the outcomes in the sequence in such a way that a probability distribution X is at least as good as another Y if and only if the expected utility of X is at least as great as the expected utility of Y; and a utility function represents the betterness relation in this expected utility format if and only if it is a linear transformation of any other utility function representing the betterness relation in that format.
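A minimal sketch of claims (1) and (2), with made-up utility numbers purely for illustration: compare two distributions by expected utility, and check that the comparison survives a positive linear transformation of the utility function but not, in general, a non-linear one.

```python
from fractions import Fraction

# Illustrative utilities for a short outcome sequence A > B > C > Z;
# the numbers are made up, purely to make claims (1) and (2) concrete.
u = {"A": Fraction(10), "B": Fraction(9), "C": Fraction(8), "Z": Fraction(0)}

def expected_utility(dist, util):
    """Expected utility of a probability distribution over basic outcomes."""
    return sum(p * util[outcome] for outcome, p in dist.items())

X = {"A": Fraction(1, 2), "C": Fraction(1, 2)}   # the lottery (A, 1/2, C)
Y = {"B": Fraction(1)}                           # B for certain

# Claim (1): X is at least as good as Y iff EU(X) >= EU(Y).
print(expected_utility(X, u) >= expected_utility(Y, u))        # True: 9 >= 9

# Claim (2): the comparison is invariant under v = a*u + b with a > 0 ...
v = {o: 3 * x + 7 for o, x in u.items()}
assert (expected_utility(X, v) >= expected_utility(Y, v)) == \
       (expected_utility(X, u) >= expected_utility(Y, u))

# ... but not, in general, under a non-linear rescaling such as a square root.
w = {o: float(x) ** 0.5 for o, x in u.items()}
print(expected_utility(X, w) >= expected_utility(Y, w))        # False: ~2.995 < 3
```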
3. TEMKIN'S CLAIMS
Temkin describes an ‘easy’ case for the triplet A, B and C as
Outcomes B and C differ little in value and outcomes A and B differ little (or greatly).
He describes an ‘extreme’ case for the triplet as
Outcomes B and C differ greatly in value, but outcomes A and B differ little.
To illustrate, he gives the following example:
A: having $ 1,000,001 per year throughout a long life
B: having $ 1,000,000 per year throughout a long life
C: having $ 999,999 per year throughout a long life
. . .
Z: having $ 0 per year throughout a long life
In connection with A, B and C here, continuity implies that there exists some p such that (A, p, C) ≥ B. Since the loss involved in getting C instead of B is ‘relatively insignificant’, Temkin suggests that ‘most people’ would readily accept the lottery, given a sufficiently high p.
This is an ‘easy’ case. By contrast, A, B and Z represent an ‘extreme’ case, Temkin suggests. What we stand to gain if we get A rather than B is insignificant, while what we stand to lose if we get Z rather than B is very significant. According to Temkin, many people would deny that there exists a p < 1, such that (A, p, Z) ≥ B, and hence deny continuity between A, B and Z.
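For orientation: if one did accept the expected-utility format and, purely for illustration, took utility to be proportional to annual dollars (an assumption Temkin does not make), the threshold probability that the theory would demand in this extreme case can be computed. Its extremity helps explain why intuitions balk here.

```python
from fractions import Fraction

# Purely illustrative assumption: utility proportional to dollars per year.
# (Temkin's example fixes no utility scale; this only makes the claim concrete.)
u = {"A": Fraction(1_000_001), "B": Fraction(1_000_000), "Z": Fraction(0)}

def lottery_value(p, better, worse):
    """Expected utility of (better, p, worse) under the illustrative scale."""
    return p * u[better] + (1 - p) * u[worse]

# Smallest p with (A, p, Z) >= B under expected utility:
p_star = (u["B"] - u["Z"]) / (u["A"] - u["Z"])
print(p_star)                                              # 1000000/1000001
assert lottery_value(p_star, "A", "Z") == u["B"]
assert lottery_value(Fraction(1, 2), "A", "Z") < u["B"]    # a fair coin is not enough
```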
Temkin's point here is that it cannot be a requirement of rationality, when faced with a very good outcome that is certain, to run a risk of a serious loss, however improbable that is, for the sake of a small gain, however probable. He stresses that the example is not a real-life choice. In real life no outcomes are certain. ‘Certainty’ in Expected Utility Theory is a theoretical construct, a limiting case. Also, the example is a once-in-a-lifetime choice; but in real life we face such choices repeatedly and have to decide on a strategy for repeated choices.
An extreme case like this nevertheless poses a threat to Expected Utility Theory. Yet, Temkin says, ‘many would deny that the example poses a seriousFootnote 6 threat to expected utility theory. They might readily accept that continuity fails in ‘extreme’ cases [. . .], yet they might insist that continuity holds for the vast majority of cases. Minimally, they might urge that continuity holds for the so-called ‘easy’ cases [. . .] and hence that expected utility works fine for at least those cases’ (p. 99).
Temkin now argues that examples of ‘extreme’ cases ‘pose a much greater challenge to Expected Utility Theory than is normally recognised.’Footnote 7 If we deny continuity for extreme cases, a contradiction can be demonstrated between continuity even for ‘easy’ cases and two of the other axioms of Expected Utility Theory, namely: the substitution axiom and the transitivity of ‘– is better than –’. In other words, Temkin claims that (T), (S), continuity for ‘easy’ cases and the rejection of continuity for the ‘extreme’ case are mutually inconsistent.Footnote 8
Temkin wants to prove that, if we assume continuity for ‘easy’ cases, then, given the axiom of substitution and the axiom of transitivity, continuity follows for the ‘extreme’ cases as well. The problem is that he does not present a genuine proof of this claim. Considering the example above, he says that if we compare B for certain with a lottery between A and C ‘the gap between the certain alternative and the worst risky alternative is small’. Hence, ‘most people would readily accept the requirement of continuity’, such that, for some p, they are indifferent between B and (A, p, C). ‘That is’, Temkin continues, ‘there must be some p such that VB = pVA + (1−p)VC’ (p. 99), and he means by this, apparently, that the value of B equals the expected value of the lottery (A, p, C). But the latter does not follow from the isolated continuity requirement that there is some p for which (A, p, C) ≥ B. Temkin's value representation will only follow if we assume all the axioms of Expected Utility Theory and let the value function V be a linear transformation of utility. But then we would already have assumed continuity for all cases, which is what Temkin wanted to prove.
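For comparison, the step Temkin needs is only available once the full expected-utility representation is in place. Under that representation, indifference between B and (A, p, C) pins down p by a routine rearrangement:

VB = pVA + (1−p)VC if and only if p = (VB − VC)/(VA − VC), provided VA > VC.

The bare continuity claim, by contrast, only asserts that some suitable p exists; it fixes no such equation.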
4. THE ARRHENIUS–RABINOWICZ THEOREM AND A STRONGER RESULT
The idea behind Temkin's proof was to proceed on the assumption that there is equivalence between C and, for some q, the lottery (B, q, D). But then, since B ≈ (A, p, C), we could use (S) to substitute (B, q, D) for C here and obtain B ≈ (A, p, (B, q, D)). We then obtain the result that B for certain is equivalent to a lottery between A and a lottery involving some risk of D. If we reiterate this procedure, we shall eventually reach the result that B for certain is equivalent to a lottery between A and a compound lottery involving some risk of Z. Arrhenius and Rabinowicz (2005) use this idea in their proof. However, they assume continuity only between adjacent triplets of outcomes in a sequence of descending outcomes; they do not assume anything about the size of value differences (2005, p. 183):
Observation 1
Consider any descending outcome sequence, A1 > A2 > . . . > An, where n ≥ 3. Continuity for adjacent outcomes in the sequence, together with (S)Footnote 9 and (T) for ≈, entails that A2 is equally as good as some compound lottery on the outcomes in the sequence that involves a risk of ending up with An.
Arrhenius and Rabinowicz provide both an instructive proof for the case of a five-member sequence and an elegant proof, by mathematical induction, for the general case of n members. What they prove is not, strictly speaking, that continuity for adjacent triplets of outcomes implies continuity in the ‘extreme’ case, as intended by Temkin. However, it is possible to obtain this result by elaborating their proof by mathematical induction:
Observation 2
Consider any descending outcome sequence, A1 > A2 > . . . > An, with n ≥ 3. Continuity for adjacent outcomes in the sequence, together with (S) and (T), entails that there is some probability p from the open interval ]0;1[ such that (A1, p, An) is at least as good as A2.
Proof of Observation 2 by mathematical induction
Basis of the induction: n = 3: There exists p3 from the open interval ]0;1[ such that (A1, p3, A3) ≥ A2 (from continuity of adjacent outcomes).
It shall be established that if the observation holds for n, then it also holds for n + 1.
(1) Hypothesis of the induction: (A1, pn, An) ≥ A2, where pn belongs to the open interval ]0;1[
(2) There exists q from the open interval ]0;1[ such that (An−1, q, An+1) ≈ An (from continuity of adjacent outcomes)
(3) A1 ≥ An−1 (from A1 > A2 > . . . > An)
(4) (A1, q, An+1) ≥ (An−1, q, An+1) (from (3) and (S))
(5) (A1, q, An+1) ≥ An (from (2), (4) and (T))
(6) (A1, pn, (A1, q, An+1)) ≥ (A1, pn, An) (from (5) and (S))
(7) (A1, pn, (A1, q, An+1)) ≥ A2 (from (6), (1) and (T))
(8) (A1, pn + q − pnq, An+1) ≥ A2 (reduction of (7))
(9) There exists pn+1 = pn + q − pnq belonging to the open interval ]0;1[ such that (A1, pn+1, An+1) ≥ A2 (from (8))
QED
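Step (8) rests on the reduction axiom: in the compound lottery (A1, pn, (A1, q, An+1)), A1 is obtained either directly, with probability pn, or via the inner lottery, with probability (1−pn)q, which sums to pn + q − pnq. A minimal sketch (with illustrative probabilities only) checking this bookkeeping and that the composed probability stays strictly between 0 and 1:

```python
from fractions import Fraction
from itertools import product

def reduced_probability_of_A1(p_n, q):
    """Total probability of A1 when (A1, p_n, (A1, q, A_{n+1})) is reduced:
    A1 directly with probability p_n, or via the inner lottery with
    probability (1 - p_n) * q."""
    return p_n + (1 - p_n) * q

# Check, on a grid of probabilities strictly between 0 and 1, that this equals
# the form used in step (8) and that p_{n+1} stays in the open interval ]0;1[.
grid = [Fraction(k, 10) for k in range(1, 10)]
for p_n, q in product(grid, grid):
    p_next = reduced_probability_of_A1(p_n, q)
    assert p_next == p_n + q - p_n * q
    assert 0 < p_next < 1
print("reduction identity and bounds verified on the grid")
```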
5. DISCUSSION
Does this result give Temkin what he wants? Not immediately, because the theorem is about any descending outcome sequence, whereas Temkin talks about a gradually descending sequence in which every element is only insignificantly better than the immediately succeeding one. It is the insignificant difference in value between three adjacent outcomes that makes continuity between them an ‘easy’ case. But a descending sequence need not be made up of ‘easy’ cases; for instance, the triplet A, B, Z is assumed not to be an ‘easy’ case.
In fact, it follows directly from Observation 2 that, if we assume discontinuity for the ‘extreme’ case, then at least one triplet of adjacent outcomes will also be discontinuous, given (S) and (T), and hence at least one triplet of adjacent outcomes will not be an ‘easy’ case. In other words, consider a sequence of descending outcomes A1 > A2 > . . . > An, with n ≥ 3, where any lottery involving A1 and An is discontinuous with A2, i.e. for all p in the open interval ]0;1[, A2 > (A1, p, An). It follows from Observation 2 that, given (S) and (T), there is some triplet (Am−1, Am, Am+1) in the sequence of triplets of adjacent outcomes (A1, A2, A3), (A2, A3, A4), . . ., (An−2, An−1, An) for which there is no p such that (Am−1, p, Am+1) ≥ Am.
Given the framework of Expected Utility Theory, this is not surprising. Within Expected Utility Theory, continuity plays the role of an Archimedean axiom.Footnote 10 Consider Temkin's imagined sequence of outcomes, A > B > . . . > Z, and assume that the axioms apart from continuity, i.e. completeness (C), transitivity (T) and the substitution axiom (S), all hold. Suppose we set the utility of A, u(A), to 1, and the utility of Z, u(Z), to 0. We can then define the utility of any other outcome N, u(N), by u(N) = p, where p is a number from the open interval ]0;1[ given by (A, p, Z) ≈ N. It is precisely the job of the continuity axiom to ensure that such a p exists. Hence, p measures the fraction (u(N)−u(Z))/(u(A)−u(Z)), and (1−p) measures the fraction (u(A)−u(N))/(u(A)−u(Z)). Since p is a number from the open interval ]0;1[, any utility difference u(N)−u(Z) (and similarly any difference u(A)−u(N)) is a finite fraction of the difference u(A)−u(Z).
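The arithmetic behind these fractions is a routine consequence of the expected-utility format, stated here only to make the step explicit: if (A, p, Z) ≈ N, then

u(N) = pu(A) + (1−p)u(Z),

and rearranging gives

p = (u(N) − u(Z))/(u(A) − u(Z)) and 1 − p = (u(A) − u(N))/(u(A) − u(Z)).

With u(A) = 1 and u(Z) = 0, this reduces to u(N) = p.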
If we deny that there is a p from the open interval ]0;1[ such that (A, p, Z) ≥ B, i.e. deny continuity in this case, it will follow that u(A)−u(B) cannot be measured as a finite fraction of the difference u(A)−u(Z). And as we have just seen, it follows from Observation 2 that this infinite value difference will arise between at least one pair of adjacent outcomes, which will then represent the borderline between acceptable and unacceptable outcomes.Footnote 11
Although Temkin accepts this conclusion, he nevertheless maintains that, in his example, any triplet of adjacent outcomes represents an ‘easy’ case – after all, the difference between adjacent outcomes is only $1 – and hence that transitivity, the axiom of substitution and the rejection of continuity for an ‘extreme’ case force us to reject continuity for a plausible ‘easy’ case.
Temkin leaves the impression that the value difference between B and Z, although significant and representing a discontinuity, is still finite, because it can be bridged by a finite number of steps involving only small value differences: each dollar represents a finite value difference, and we get from $1,000,000 to $0 in a finite number of steps.Footnote 12
At the same time, he appears to imply that no chance of a gain from B to A, however probable, can ever outweigh a risk of a loss from B to Z, however improbable. These intuitions violate the additive structure of Expected Utility Theory, in which the value of a lottery is measured by the sum of the values of its outcomes, each weighted by its probability. If the value difference between B and Z is finite, then regardless of how small u(A)−u(B) is, p(u(A)−u(B)) can always be made larger than (1−p)(u(B)−u(Z)).
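Spelled out, in the notation already introduced: under the additive format, (A, p, Z) is at least as good as B precisely when the probability-weighted gain outweighs the probability-weighted loss, i.e. when

p(u(A) − u(B)) ≥ (1 − p)(u(B) − u(Z)), i.e. when p ≥ (u(B) − u(Z))/((u(A) − u(B)) + (u(B) − u(Z))).

As long as both differences are finite real numbers and u(A) > u(B), the right-hand side is strictly less than 1, so a suitable p < 1 always exists.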
As has been shown in Jensen (2008), we could accept the value stipulations envisaged by Temkin in a context where value is not additive. This would imply that the value of (A, p, Z) cannot be represented as a linear combination of the values of the outcomes A and Z, i.e. as pu(A) + (1−p)u(Z). Rather, as p approaches 1, the value of (A, p, Z) will approach an upper bound which is lower than the value of B (and thus also lower than the value of A); see Figure 1. Hence, in this case, discontinuity for the extreme case does not represent an infinite value difference.
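To illustrate the kind of bounded, non-additive valuation at issue, here is a minimal sketch. The functional form and the numbers are my own illustrative choices, not a reconstruction of Jensen's (2008) proposal; the only point is to exhibit a valuation of (A, p, Z) that increases with p yet stays below the value of B for every p < 1.

```python
# Illustrative non-additive valuation of the lottery (A, p, Z).  The numbers
# and the functional form are toy choices, used only to exhibit the bounded
# behaviour described in the text.
U_A, U_B, U_Z = 1.0, 0.9, 0.0     # values of the outcomes, with A > B > Z
CAP = 0.8                         # upper bound on the lottery's value, below U_B

def value_of_lottery(p):
    """Value of (A, p, Z) for a genuine lottery, i.e. 0 <= p < 1: it rises
    with p but approaches CAP rather than U_A as p tends to 1."""
    if not 0 <= p < 1:
        raise ValueError("p must lie in [0, 1) for a genuine lottery")
    return U_Z + (CAP - U_Z) * p

for p in (0.5, 0.9, 0.99, 0.999999):
    assert value_of_lottery(p) < U_B   # B for certain remains strictly better
print("for every sampled p < 1, B is better than (A, p, Z) under this valuation")
```

Under such a valuation the lottery's value is plainly not the probability-weighted sum pu(A) + (1−p)u(Z), which is just the departure from the additive format discussed next.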
On this conception, Temkin would be in a position to defend the rejection of continuity for an ‘extreme’ case and at the same time insist on continuity for all adjacent triplets of outcomes in ‘easy’ cases. However, the cost would be a clear violation of the axiom of substitution, which is necessary for the additive structure of Expected Utility Theory. This can be seen in the proof of Observation 2. Here, substitutions (together with rearrangements and transitivity) are used to ‘extend’ continuity between A1, A2 and An to continuity between A1, A2 and An+1 with the help of continuity between An and its adjacent outcomes. But if there is discontinuity for the ‘extreme’ case, this reiterated extension procedure will have to be rejected for some n at step (4).Footnote 13
Interestingly, Temkin hints at non-additive aggregation in his defence of the rejection of continuity for ‘extreme’ cases (p. 105). However, he overlooks the fact that this would violate the substitution axiom. So much for the substitution axiom as a matter of logic.
6. CONCLUSION
Rejection of the axiom of continuity clearly has consequences for Expected Utility Theory. Either the substitution axiom, and thereby the additive framework of Expected Utility Theory, is accepted. The consequence is then an infinite value difference between acceptable and unacceptable outcomes. But this consequence is not necessarily ‘deep’ and it is definitely not ‘irresolvable’. There are non-Archimedean versions of Expected Utility Theory. One approach is to use non-standard analysis (i.e. infinitesimal numbers);Footnote 14 another is to operate with two-dimensional utility.Footnote 15 Or Temkin's value stipulations are accepted. Then the consequence is that the substitution axiom will appear implausible; it has to be rejected in order to accommodate the implicitly assumed non-additive form of aggregation. But then the framework of Expected Utility Theory is disregarded from the outset.
Since I can see no reason to adopt the continuity axiom other than that it is a technically necessary condition for representation by a real-valued expected utility function, I agree with Temkin that continuity cannot be a requirement of rationality. But then its abandonment cannot have serious implications for practical reasoning. I also believe that considerations concerning aggregation might provide a reason to drop the substitution axiom, though I do not wish to take a stand on that issue here. But I do want to conclude that we can discard the continuity axiom without being confronted by the serious trilemma set up by Temkin. The impression that this cannot be done arises from Temkin's confusion of additive and non-additive aggregation.