1. Introduction
Lewis proposed the Principal Principle in his seminal work (1980). He argued that when a subjectivist believes in objective chance, the credences he assigns to events and the credences he assigns to possible values of the objective chances of these events cannot be arbitrary if the subjectivist is reasonable: these credences (apart from forming a probability measure) have to satisfy a relation, namely, the relation that Lewis calls the Principal Principle. Accordingly, in Lewis’s work, credence is assigned to at least two kinds of propositions: those expressing real events (e.g., the coin fell heads; we will call them “chancy events”) and those expressing that the objective chances of chancy events are equal to certain values (e.g., the chance of heads is, say, 35%). Recently, Rédei and Gyenis (2013) have proposed at various forums, such as the PSA Biennial Meeting in 2014, that the Principal Principle makes sense only if whatever credence function is defined on the chancy events can be extended in a consistent manner to events concerning the values of the objective chances of those chancy events. They presented their claim using the language of Bayesianism, but this is essentially what they proposed.
In this work I show that the extendibility Rédei and Gyenis require essentially always holds. Before doing so, however, I present an argument disputing the claim of Rédei and Gyenis that this extendibility is indeed necessary for the Principal Principle to be meaningful.
2. Credences of Events, Credences of Objective Chances
In this section we turn to Lewis (1980) for guidance and inspiration. We examine one of the examples he presented to motivate the Principal Principle, and we make some general observations. The example and the observations will guide us throughout the current work: On the one hand, they indicate how to construct the embedding Rédei and Gyenis require for their consistency. On the other hand, they help us understand the meaning of the Principal Principle, why it is a reasonable requirement, and why it would be a reasonable requirement even if Rédei and Gyenis’s consistency notion happened to fail to hold.
2.1. An Example by Lewis
Lewis’s (1980, 266) example is the following:
Suppose you are not sure that the coin is fair. You divide your belief among three alternative hypotheses about the chance of heads, as follows.
• You believe to degree 27% that the chance of heads is 50%.
• You believe to degree 22% that the chance of heads is 35%.
• You believe to degree 51% that the chance of heads is 80%.
Then to what degree should you believe that the coin falls heads? Answer. (27% × 50%) + (22% × 35%) + (51% × 80%); that is, 62%. Your degree of belief that the coin falls heads, conditionally on any one of the hypotheses about the chance of heads, should equal your unconditional degree of belief if you were sure of that hypothesis. That in turn should equal the chance of heads according to the hypothesis: 50% for the first hypothesis, 35% for the second, and 80% for the third. Given your degrees of belief that the coin falls heads, conditionally on the hypotheses, we need only apply the standard multiplicative and additive principles to obtain our answer.
In this example, Lewis states that given an agent who believes in objective chance, and given certain degrees of belief (credences; 27%, 22%, and 51% in this case) that the agent assigns to the possible objective chances (50%, 35%, and 80% in this case) of a certain chancy event (heads in this case), the reasonable credence that the agent has to assign to the event itself is already determined (62% in this case).
More precisely, in the example a coin is tossed, and there are two possible chancy outcomes: H (heads) and T (tails). The credence the subjectivist ends up associating to H is C(H) = 62%. However, this assignment of credence is not arbitrary: it is a derived credence. That is, the subjectivist believes in objective chance, and there is a primary credence associated to the possible objective chances of H, and the credence of H is derived from this primary credence:
• The subjectivist believes in the objective chance ch(H) of H.
• He thinks there are three possibilities for the objective chance of H: ch(H) = 50%, ch(H) = 35%, and ch(H) = 80%.
• He assigns the following credences to the possible objective chances: C(ch(H) = 50%) = 27%, C(ch(H) = 35%) = 22%, and C(ch(H) = 80%) = 51%.
• C(H) then is computed using the following formula:
(1) C(H) = 50% · C(ch(H) = 50%) + 35% · C(ch(H) = 35%) + 80% · C(ch(H) = 80%) = 62%.
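Lewis’s arithmetic in formula (1) can be replicated in a few lines of Python (a minimal sketch; the dictionary name is mine):

```python
# Credence assigned to each hypothesized chance of heads (Lewis's example):
# key = hypothesized objective chance r, value = credence C(ch(H) = r).
credences = {0.50: 0.27, 0.35: 0.22, 0.80: 0.51}

# Formula (1): C(H) = sum over r of r * C(ch(H) = r).
c_heads = sum(r * c for r, c in credences.items())
print(round(c_heads, 2))  # 0.62
```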
Note that in the situation above, it is implicitly assumed that assigning credences to values of the chances of certain events makes sense. In fact, we should not be surprised. Lewis wrote about the subjectivist’s guide to objective chance; that is, objective chance is part of the world, and credence can be assigned to possible values of it. The algebra of events is generated by at least five events concerning the objective world: two events for the possible chancy outcomes H and T and three more events for the possible values of the objective chance of H (and hence of T).
Lewis assumes (and Pettigrew [2012] showed that this should indeed be assumed) that credence and chance are probability measures abiding by the laws of probability theory. Accordingly, assigning credences to statements about the chances of events only makes sense if those statements signify events of an underlying event algebra. That underlying event algebra also contains the actual chancy events, because credences are assigned to them as well. Lewis’s example indicates that a credence function defined on the possible values of the objective chances of chancy events and also on the possible chancy events themselves is not an arbitrary probability measure: it has to satisfy a consistency condition (i.e., eq. [1] in the example) on top of being a probability measure.
2.2. General Observations
Where does equation (1) come from? If A is some event of the real world, ch_t is the chance function over such events at some time t, and C is the credence function, the general idea that is expressed by the above example can be written as

C(A) = Σ_r r · C(ch_t(A) = r),

where the sum runs through all r values ch_t(A) can possibly take. Clearly, in case of a continuum, the summation has to be replaced by an integral. If we throw in Lewis’s admissible evidence too (think of it as facts from the past), then the above equation takes the following form:

(2) C(A | E) = Σ_r r · C(ch_t(A) = r | E).

Note that if C satisfies the laws of probability, then the following statement always holds:

(3) C(A | E) = Σ_r C(A | ch_t(A) = r ∧ E) · C(ch_t(A) = r | E),

as long as Σ_r C(ch_t(A) = r | E) = 1, that is, as long as r runs through all possible values of ch_t(A). The Principal Principle requires that

(4) C(A | ch_t(A) = r ∧ E) = r

hold for all admissible events E. Lewis (1980) noted that equation (3) together with the Principal Principle (4) immediately implies equation (2).
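The implication Lewis noted can be checked on a small finite model. The sketch below (Python; the model and the names are illustrative, not from the text) builds a joint credence over chance hypotheses and outcomes in which the Principal Principle (4) holds by construction, and verifies that equation (2) then comes out true:

```python
# Agent's credences over the chance hypotheses ch(H) = r.
prior = {0.50: 0.27, 0.35: 0.22, 0.80: 0.51}

# Joint credence over (hypothesis, outcome), built so that the
# Principal Principle (4) holds: C(H | ch(H) = r) = r.
joint = {}
for r, c in prior.items():
    joint[(r, "H")] = c * r
    joint[(r, "T")] = c * (1 - r)

def cred(event):
    """Credence of an event, given as a predicate on (r, outcome)."""
    return sum(p for (r, o), p in joint.items() if event(r, o))

# Principal Principle (4), with trivial admissible evidence:
for r0 in prior:
    pp = cred(lambda r, o: r == r0 and o == "H") / cred(lambda r, o: r == r0)
    assert abs(pp - r0) < 1e-12

# Equation (2): C(H) = sum over r of r * C(ch(H) = r).
lhs = cred(lambda r, o: o == "H")
rhs = sum(r0 * cred(lambda r, o: r == r0) for r0 in prior)
assert abs(lhs - rhs) < 1e-12
print(round(lhs, 2))  # 0.62
```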
The meaning of the Principal Principle is of course that a reasonable agent should choose the credence such that if the objective chance of a chancy event A is r according to the evidence collected by the agent, then the credence of A should be the same r unless some further inadmissible evidence E overwrites this. For example, if the agent has evidence that a coin is biased such that the chance of H is 0.8, then he should choose the credence of H to be 0.8, unless he has some further evidence that overwrites this choice, for example, when the toss already happened and he knows the result. Without further such evidence, for example, just before the coin is tossed, the reasonable credence to be assigned to H is 0.8.
Suppose now the agent learned that ch_t(B) = q and updated his credence function accordingly to C_new:

C_new(X) = C(X | ch_t(B) = q)

for all X. That is, the agent added ch_t(B) = q to the set of evidences. Is it true that this new credence function automatically satisfies the Principal Principle for any ch_t(A) = r condition, or are additional assumptions needed?

Just from assuming that C satisfies the Principal Principle without admissible E events, it does not follow that C_new also satisfies the Principal Principle. However, it does follow from the Principal Principle with admissible E events included, as long as events of the form ch_t(B) = q are admissible. And indeed, Lewis insists that such events must be admissible. Then, applying the Principal Principle to E ∧ (ch_t(B) = q) instead of E, we have

C_new(A | ch_t(A) = r ∧ E) = C(A | ch_t(A) = r ∧ E ∧ ch_t(B) = q) = r.
Now consider this: remember, equation (2) says, for any admissible evidence E, C(A | E) = Σ_r r · C(ch_t(A) = r | E), where the sum runs through all r values that ch_t(A) can possibly take. We have seen that this follows from the Principal Principle if the credence function obeys the laws of probability. The other direction is also true: the assumption that the credence C satisfies the laws of probability and that events of the form ch_t(A) = r are admissible, together with (2), implies the Principal Principle. To see this, consider that

(5) C(A | ch_t(A) = q ∧ E) = Σ_r r · C(ch_t(A) = r | ch_t(A) = q ∧ E),

where we applied equation (2) not for E but for (ch_t(A) = q) ∧ E. But

C(ch_t(A) = r | ch_t(A) = q ∧ E) = 1 if r = q, and C(ch_t(A) = r | ch_t(A) = q ∧ E) = 0 if r ≠ q,

for any r, no matter what E is, because C satisfies the laws of probability and the events ch_t(A) = r for distinct r are mutually exclusive. With this, equation (5) becomes

(6) C(A | ch_t(A) = q ∧ E) = q.
Hence, given that the laws of probability are satisfied and that events of the form ch_t(A) = r are admissible, the Principal Principle and the validity of equation (2) for all admissible E are in fact equivalent. This is an important observation, because as (2) implies the Principal Principle, it gives us a hint as to how to construct in the sections below the extension Rédei and Gyenis require for their consistency notion: namely, if the credence of each chancy event A is computed from the credences assigned to the possible values of the objective chance of A using equation (2), then the credence function on the chances and on the chancy events satisfies the Principal Principle.
3. Consistency Questions of Rédei and Gyenis
Rédei and Gyenis (2013) raised the question of what it really means to conditionalize on the event that the chance of something is a certain value. They posed their question in a more abstract and mathematically well-defined way than how we have proceeded so far. It is essentially as follows.
“Chance” in their terminology is replaced by a probability measure p_obj over an algebra S representing the chancy events (corresponding to H and T in the example). They drop time t to make the discussion simpler, as it is not necessary for presenting their complaints. Credence is replaced by a probability measure p_subj over some algebra S′. They observe that it is necessary that S′ include S, as we want to associate credence at least to the real events. Furthermore, S′ has to contain events of the form A_r, meaning the event that “the objective chance of A equals r.” I intentionally do not follow their notation for these events, because I want to be able to write A_p, meaning “the objective chance of A equals p(A),” where p is a probability measure on S. Rédei and Gyenis define the abstract Principal Principle:

Definition 1. The subjective probabilities p_subj are related to the objective probabilities p_obj as

(7) p_subj(A | A_{p_obj}) = p_obj(A)

as long as the conditioning makes sense.Footnote 1
Remark 1. Unfortunately, Rédei and Gyenis wrote their Principal Principle in the form p_subj(A | A_r) = r, which is confusing, as it is not correct (it is not the Principal Principle, and in fact it should not be required) when r is not the objective probability p_obj(A). It is clear, however, from their discussions of their formula that they only consider the case when r = p_obj(A). Hence, I wrote it in the form of equation (7) to avoid this confusion. In our treatment, p_obj is just a measure over S; it may actually be different from the real objective chance, although Rédei and Gyenis are not concerned with this situation. (In fact, the Principal Principle as Lewis stated it has nothing to say about the real objective chance; it only considers evidence about the objective chance.)
Rédei and Gyenis have another notion too, the stable abstract Principal Principle, which also requires that given any B ∈ S, for all A ∈ S,

p_subj(A | A_{p_obj} ∧ B_{p_obj}) = p_obj(A)

as long as the conditional probabilities make sense. In other words, learning the objective probability of B should not destroy the Principal Principle for A.
Rédei and Gyenis also define their strong consistency condition:

Definition 2. The abstract Principal Principle is defined to be strongly consistent if the following hold: Given any probability space (X, S, p_obj) and another probability measure p_subj on S, there exists a probability space (X′, S′, p′_subj) and a Boolean algebra embedding h of S into S′ such that for every A ∈ S, p′_subj(h(A)) = p_subj(A), and there exists an A_{p_obj} ∈ S′ with the property p′_subj(h(A) | A_{p_obj}) = p_obj(A), and if A, B ∈ S and A ≠ B, then A_{p_obj} ≠ B_{p_obj}.

Clearly, A_{p_obj} would be the event “the objective probability of A is p_obj(A).”
Rédei and Gyenis, then, have two claims:
• In order for the (abstract) Principal Principle to be meaningful, it is necessary that strong consistency hold, and strong consistency is tacitly assumed.
• It is not clear whether strong consistency holds.
I strongly disagree with the first claim. Lewis’s (1980) work is titled “A Subjectivist’s Guide to Objective Chance.” That is, chance is viewed as being objective, and credence can be associated to statements about objective chance. When the objective world contains both the events and their chances, credence is not an arbitrary probability measure, but the Principal Principle limits what a reasonable credence function may look like. Coming back to Lewis’s example we cited at the beginning, given the credences assigned to the various biases, it does not make sense to assign 50% credence to heads; it has to be 62%. In other words, the Principal Principle is itself a consistency condition for the case when we assign credences to statements about objective chance. That is, there is only one algebra S′; for a subjectivist who believes in objective chance, statements about objective chance are in S′ to start with, and the Principal Principle is the consistency condition that we have to require for a credence function on S′.
This view that I proposed in the previous paragraph is, I believe, also supported by the usual Bayesian analysis. The Bayesian prior, which corresponds to credence, is given not on the real outcomes such as heads and tails but instead on the set of probability measures (described by some parameter) over the real outcomes. In this set of probability measures, of course, events expressing that the probability of a chancy event A is r can be represented by sets of the form {p : p(A) = r}. That is, in Bayesian analysis as well, statements about the values of these probabilities are part of the event space from the beginning. Prior probabilities of the chancy events are computed from the prior measure over the probabilities similarly to formula (1), just as Lewis did his computation.
However, if we insist that S denotes only the chancy events and that it should be extended, then from the above, it is clear how to do it: we take all probability measures on S and define an event algebra on it that is fine enough to include all sets of the form A_r = {p : p(A) = r}, where A ∈ S and r ∈ [0, 1]. Let us call this event algebra over the probabilities T (this is the algebra of chances). We can define S′ = S × T (i.e., the algebra generated by elements of the form A × t, where A ∈ S and t ∈ T). But even in such a setup, an agent would not start from a credence function on S that he would want to extend to S′. Instead, he would start fixing a credence function on the chances—or, with Bayesian language, he would first fix a Bayesian prior ν on T—and would obtain the prior probability p_subj on S using the usual formula corresponding to (1):

p_subj(A) = Σ_r r · ν(A_r),

which is just the same as the expected value of the function p ↦ p(A) with respect to the probability ν.Footnote 2 Furthermore, the joint probabilities on S′ would also be the usual:Footnote 3

p_subj(A × t) = Σ_r r · ν(A_r ∩ t).
That is exactly what happens in Lewis’s example about the coin tosses: in that case S can be taken to have two atomic elements, heads and tails, and T to have three atomic elements with nonzero credence (Bayesian prior):
• one putting 50% chance on heads, with credence 27%;
• one putting 35% chance on heads, with credence 22%;
• one putting 80% chance on heads, with credence 51%.
Then the credence can be extended to S′, giving 62% for heads and 38% for tails. It is not the 62% and the 38% that an agent postulates first and then finds a measure on T (which would correspond to the suggestion of Rédei and Gyenis), but the other way around.
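For concreteness, this extension can be written out for the coin case. In the sketch below (Python; the variable names are mine), the three chance measures carrying nonzero Bayesian prior (call the prior ν) play the role of the atoms of the algebra of chances, and both the derived credence of the chancy events and the joint probabilities are computed from ν:

```python
from fractions import Fraction as F

# The three chance measures on S = {H, T} carrying nonzero prior,
# paired with their Bayesian prior weight nu (Lewis's example).
measures = [
    ({"H": F(1, 2),  "T": F(1, 2)},  F(27, 100)),
    ({"H": F(7, 20), "T": F(13, 20)}, F(22, 100)),
    ({"H": F(4, 5),  "T": F(1, 5)},  F(51, 100)),
]

def p_subj(A):
    """Derived credence of a chancy event: p_subj(A) = sum_r r * nu(A_r)."""
    return sum(p[A] * nu for p, nu in measures)

def p_joint(A, t):
    """Joint credence of A x t, with t a predicate selecting chance measures."""
    return sum(p[A] * nu for p, nu in measures if t(p))

print(p_subj("H"), p_subj("T"))  # 31/50 19/50, i.e., 62% and 38%
print(p_joint("H", lambda p: p["H"] == F(4, 5)))  # 51/125
```

Exact fractions are used so the 62%/38% split comes out without rounding noise.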
About the second point Rédei and Gyenis made, I do agree. It is not obvious, but still it turns out to be essentially true, which is the topic of the rest of this discussion.
4. Strong Consistency of the (Abstract) Principal Principle
Even if strong consistency does not seem to us necessary in general, in certain cases it may be interesting to fix credences on the set of objective events and then try to extend them to a larger algebra that also includes events like A_r. Such an extension was, for example, considered by Diaconis and Zabell (1982, sec. 2.1) without the strong consistency condition.
Here we prove the following strong consistency theorem:
Theorem. Given any probability space (X, S, p_obj) and another probability measure p_subj on S, if there is an a > 0 such that for all A ∈ S, a · p_obj(A) < p_subj(A) or p_obj(A) = 0 (i.e., we can “pull” a · p_obj under p_subj), then strong consistency holds: there exists a probability space (X′, S′, p′_subj) and a Boolean algebra embedding h of S into S′ such that

i) For every A ∈ S, there exists an A_{p_obj} ∈ S′ with the property
(8) p′_subj(h(A) | A_{p_obj}) = p_obj(A).

ii) If A, B ∈ S and A ≠ B, then A_{p_obj} ≠ B_{p_obj}.

iii) The probability space (X′, S′, p′_subj) is an extension of the probability space (X, S, p_subj) with respect to h; that is, we have
(9) p′_subj(h(A)) = p_subj(A) for all A ∈ S.

iv) p′_subj is stable: for all A, B ∈ S, we have
(10) p′_subj(h(A) | A_{p_obj} ∧ B_{p_obj}) = p_obj(A).
Proof. Consider the set of all probability measures on (X, S): denote it by M(S) (this is the set of all possible chances):

M(S) = { p : p is a probability measure on (X, S) }.

For an A ∈ S and an r ∈ [0, 1], let us define

A_r = { p ∈ M(S) : p(A) = r }.

Take a fine enough σ-algebra T on M(S) such that for all A ∈ S and r ∈ [0, 1], A_r ∈ T.
I claim that it is possible to find a measure ν (credence of chances) on T such that ν(A_{p_obj(A)}) > 0 for any A ∈ S (in order to be able to take conditionals), and for all A ∈ S,

p_subj(A) = Σ_r r · ν(A_r),

where the summation is taken over all values r for which ν(A_r) ≠ 0 (it could also be defined as an integral over M(S); see n. 2). For a p ∈ M(S), let δ_p denote the Dirac measure concentrated on p. That is, for any A ∈ S and r ∈ [0, 1], δ_p(A_r) = 1 if and only if p(A) = r, and δ_p(A_r) = 0 if and only if p(A) ≠ r. We prove that ν can simply be taken to be of the form

ν = a · δ_{p_obj} + b · δ_{p*},
where a and b are conveniently chosen constants and p* is a probability measure on S: we assumed that there is an a > 0 such that for all A ∈ S, a · p_obj(A) < p_subj(A) or p_obj(A) = 0. Clearly, the measure p_subj − a · p_obj is a positive measure on S, but as (p_subj − a · p_obj)(X) = 1 − a (and applying the assumption to A = X gives a < 1), it is smaller than a probability measure. Let b = 1 − a, and let p* = (p_subj − a · p_obj)/b. As p_subj − a · p_obj ≥ 0 and b > 0, p* is a positive measure on S. As p*(X) = (1 − a)/b = 1, this is a probability measure on S and hence an element of M(S). Let

ν = a · δ_{p_obj} + b · δ_{p*}.

Since a + b = 1, this is also a probability measure. For any A ∈ S,

Σ_r r · ν(A_r) = a · p_obj(A) + b · p*(A) = p_subj(A),

where we used the fact that b · p* = p_subj − a · p_obj and that δ_p(A_r) = 1 if and only if p(A) = r. We also have that for all A ∈ S,

ν(A_{p_obj(A)}) ≥ a · δ_{p_obj}(A_{p_obj(A)}) = a > 0.

Here, note that ν(A_r) ≠ 0 for at most two different r’s (r = p_obj(A) and r = p*(A)), so there is no problem using the summation sign.
Let (X′, S′) = (X × M(S), σ(S × T)). Here, σ(S × T) is the σ-algebra generated by elements of the form A × t, where A ∈ S and t ∈ T. Let the measure p′_subj on S′ be generated by

p′_subj(A × t) = Σ_r r · ν(A_r ∩ t)

for all A ∈ S and t ∈ T. Again, because of the way we defined ν, for any A, there are at most two values of r (p_obj(A) and p*(A)) such that the summands in the above summation are nonzero, so it is not incorrect to keep using the summation instead of an integral.
For all A ∈ S, let

h(A) = A × M(S)

and

A_{p_obj} = X × A_{p_obj(A)},

where A_{p_obj(A)} = { p ∈ M(S) : p(A) = p_obj(A) }; h is then a Boolean algebra embedding of S into S′. Clearly,

p′_subj(h(A)) = Σ_r r · ν(A_r ∩ M(S)) = Σ_r r · ν(A_r)

by the choice of p′_subj. Then, for iii, we have

p′_subj(h(A)) = Σ_r r · ν(A_r) = p_subj(A),

as we have seen.
To see iv, fix A, B ∈ S, and for simplicity let t = A_{p_obj(A)} ∩ B_{p_obj(B)}. By applying the above definitions, and as ν(t) ≥ a · δ_{p_obj}(t) = a > 0,

p′_subj(h(A) | A_{p_obj} ∧ B_{p_obj}) = (Σ_r r · ν(A_r ∩ t)) / ν(t) = p_obj(A),

because the numerator is only nonzero when r = p_obj(A), in which case ν(A_r ∩ t) in the numerator and ν(t) in the denominator are equal. This gives us iv. For the special case when there is only one conditioning event, namely, A_{p_obj}, we obtain the proof of i.
Finally, consider ii. In the construction so far, for A ≠ B, A_{p_obj} ≠ B_{p_obj} is actually not ensured. But it is easy to modify the construction so that this condition is ensured as well. Consider the disjoint union M(S) ⊔ S in place of M(S), set ν({A}) = 0 for each added point A ∈ S, and define A_{p_obj} = X × (A_{p_obj(A)} ∪ {A}). This will not interfere with the probabilities, but including the extra point A in A_{p_obj} ensures that A ≠ B implies A_{p_obj} ≠ B_{p_obj}. QED
Remark 2. I left the proof of the satisfiability of ii to the end, because I think that this condition, although easily satisfiable, should not be required. Set A_{p_obj} corresponds to the event “the probability of A is p_obj(A),” while B_{p_obj} corresponds to the event “the probability of B is p_obj(B).” There is no reason for these propositions to necessarily denote different events in the probability space. For example, when a coin is tossed, “the probability of heads is 1/3” and “the probability of tails is 2/3” can reasonably correspond to the same set in a Kolmogorovian model, while heads and tails are different events.
Remark 3. Note, in the special case when p_subj = p_obj, then p* = p_obj and hence ν = δ_{p_obj}, no matter what a we chose.
If S (and hence M(S)) is finite, and if p_obj is absolutely continuous with respect to p_subj (i.e., for any A ∈ S, p_subj(A) = 0 implies p_obj(A) = 0), then for any

0 < a < min { p_subj(A) / p_obj(A) : A ∈ S, p_obj(A) ≠ 0 }

we have that for all A ∈ S, a · p_obj(A) < p_subj(A) or p_obj(A) = 0, and the conclusions of the theorem hold. Hence, we have

Corollary. If S is finite, and if p_obj is absolutely continuous with respect to p_subj, then strong consistency holds.
Remark 4. Rédei and Gyenis also define a consistency property for “debugged” versions of the Principal Principle. I do not consider that case here for two reasons. One is that the handling of admissible evidence would raise a whole array of new issues. The other is that as argued in this article, strong consistency is not needed for the Principal Principle to make sense, and that is true even if we throw in admissible evidence, no matter whether we consider the original version or debugged versions.
Remark 5. I wrote that strong consistency “essentially” holds. I put it this way because of the above absolute continuity condition. Note, the absolute continuity of p_obj with respect to p_subj is a necessary condition, because a zero p_subj(A) cannot be updated by new knowledge to a nonzero probability. In the finite case, absolute continuity is sufficient to pull a · p_obj below p_subj; in the infinite case it is not sufficient. I in fact believe that the absolute continuity condition is sufficient to prove strong consistency even in the infinite case (without pulling the whole of a · p_obj below p_subj) but with a more complex construction for ν. Proving that remains for future work.