Unfolding the Grammar of Bayesian Confirmation: Likelihood and Antilikelihood Principles

Roberto Festa; Gustavo Cevolani

doi:10.1086/688935

Published online by Cambridge University Press: 01 January 2022

Abstract

We explore the grammar of Bayesian confirmation by focusing on some likelihood principles, including the Weak Law of Likelihood. We show that none of the likelihood principles proposed so far is satisfied by all incremental measures of confirmation, and we argue that some of these measures indeed obey new, prima facie strange, antilikelihood principles. To prove this, we introduce a new measure that violates the Weak Law of Likelihood while satisfying a strong antilikelihood condition. We conclude by hinting at some relevant links between the likelihood principles considered here and other properties of Bayesian confirmation recently explored in the literature.

Type: Research Article
Information: Philosophy of Science, Volume 84, Issue 1, January 2017, pp. 56-81

DOI: https://doi.org/10.1086/688935
Copyright: Copyright © The Philosophy of Science Association

1. Introduction

A central problem of formal epistemology and philosophy of science is explicating what it means for a piece of evidence E to confirm a hypothesis H (see, e.g., Crupi 2015). Prominent accounts of confirmation define the degree to which E confirms or supports H in terms of the probabilistic relations between E and H. Accordingly, given a probability distribution p, several different measures of confirmation (or inductive support) C(H, E) can be defined: two well-known examples are the difference measure D(H, E) = p(H∣E) − p(H) proposed by Carnap (1950/1962) and the ratio measure R(H, E) = p(H∣E)/p(H) introduced by Keynes (1921). It is customary to say that C(H, E) is a Bayesian measure (of confirmation) when the probabilities occurring in its definition are epistemic probabilities, that is, express the degrees of belief of some (ideal) inquirer in the relevant propositions. For instance, D and R are usually construed as Bayesian measures, with the initial probability p(H) expressing the degree of belief in the truth of H of an inquirer who lacks any relevant (empirical) evidence, and the final probability p(H∣E) the inquirer’s degree of belief in H once evidence E is taken into account.

In this article, we focus on a specific class of Bayesian measures of confirmation, that is, so-called incremental measures. Such measures (like D and R above) are supposed to express how much learning E increases the probability of H. In other words, if C is an incremental measure, then C(H, E) expresses the probability increment occurring in the shift from p(H) to p(H∣E). Over the last 15 years, confirmation theorists have been exploring a plethora of incremental measures, grounded in significantly different intuitions concerning confirmation. The problem of assessing the relative adequacy of such measures has recently attracted increasing attention among philosophers of science (e.g., Festa 1999, 2012; Fitelson 1999, 2007; Kuipers 2000; Zalabardo 2009; Crupi, Festa, and Buttasi 2010; Iranzo and Martínez de Lejarza 2012; Glass and McCartney 2014; Roche 2014, 2015a; Roche and Shogenji 2014; Crupi 2015). An important motivating issue for this line of inquiry is what Fitelson (1999) has called the problem of measure sensitivity: roughly, the fact that the soundness of many philosophical and methodological arguments surrounding the notion of confirmation crucially depends on the specific measure adopted to explicate this notion (see also Festa 1999; Brössel 2013). The assumption underlying, more or less explicitly, the ongoing discussion is that an adequate incremental measure should exhibit, so to speak, an appropriate “grammar” (as Crupi et al. [2010] put it), that is, a set of properties that formally express plausible intuitions about confirmation.

The grammar of incremental confirmation is the main topic of our article, too. In particular, we will consider some “likelihood principles” recently proposed in the literature as essential properties of any adequate incremental measure. A likelihood principle requires certain relationships to hold between C(H, E) and the likelihoods of H and ¬H with respect to E, that is, between C(H, E) and the probabilities p(E∣H) and p(E∣¬H) or, at least, between C(H, E) and one of those values. Such principles play a key role in current discussions of Bayesian confirmation, with “likelihoodist” theorists arguing that they are necessary, or even sufficient, to adequately define incremental measures.¹ In this article, we aim to contribute to the ongoing discussion in this area by presenting some new results concerning the different likelihood principles currently on the market. More precisely, our main results are as follows: First, we offer a systematic survey of different likelihood principles (some of them new), characterize their content, and study the logical relations between them, which have so far often remained obscure or unnoticed in the literature. Second, we show that none of the likelihood principles proposed so far is satisfied by all incremental measures; in particular, the so-called Weak Law of Likelihood, which plays a prominent role in recent analyses of Bayesian confirmation, is violated by some incremental measures that are grounded in appealing core intuitions. Third, we argue that the above-mentioned likelihood principles have to be supplemented with new ones, including some prima facie very strange principles that we call antilikelihood principles; in fact, quite surprisingly, some intuitively appealing incremental measures satisfy at least one such antilikelihood principle. The upshot of our discussion is that some purportedly basic properties of confirmation are not as fundamental as they are widely believed to be, since some incremental measures violate them. This implies that the grammar of Bayesian confirmation is richer, and the notion of incremental measure more flexible, than previously thought.

Our discussion will proceed as follows. In section 2 we outline the grammar of Bayesian confirmation, stating the basic properties defining incremental measures. Section 3 introduces some (old and new) likelihood principles and explores their logical and conceptual relations. In section 4, we present a new measure of confirmation, which violates some widely accepted likelihoodist intuitions and satisfies a surprising antilikelihood principle. Section 5 concludes the article by discussing some prospective implications of our results for ongoing work on the grammar of Bayesian confirmation. The proofs of all theorems appear in the appendix.

2. Incremental Measures of Bayesian Confirmation

Here we summarize some more or less well-known properties of incremental measures of Bayesian confirmation. First, we introduce the qualitative notion of confirmation (sec. 2.1); then, we formally define the concept of incremental measure (sec. 2.2). We also introduce a distinction between “universal” properties of confirmation (characterizing all incremental measures) and “structural” properties (isolating specific classes of such measures), which will play a central role in the rest of the article.

Two points are worth noting before we start. First, in the literature the term “incremental confirmation” is usually employed in a rather loose way, so that different scholars end up attaching quite different meanings to it. Our definition of incremental confirmation will amount to a rigorous explication of a widely shared use of this term: roughly, an incremental measure is a function of p(H∣E) and p(H) that increases when p(H∣E) increases. Absent a fully general consensus on the meaning of “incremental confirmation,” however, our definition will exclude some confirmation measures that are occasionally labeled “incremental” in the literature.²

Second, in order to ensure mathematical definiteness, we will focus on hypotheses H and pieces of evidence E with nonextreme probability values, that is, such that 0 < p(H), p(E) < 1.³ The only exception will be the occasional reference to tautological (logically true) evidence; as far as notation is concerned, the tautology will be denoted, as usual, by ⊤.

2.1. Qualitative Confirmation

The starting point of the search for appropriate measures of confirmation is the notion of qualitative confirmation. The guiding intuition is that the specific confirmatory relation occurring between H and E depends on whether and how the initial probability of H is changed by learning E.⁴ This qualitative notion of confirmation is usually defined as follows:

Qualitative Confirmation. For any H and E,

E confirms H if and only if p(H∣E) > p(H)	(confirmation in narrow sense);
E is neutral for H if and only if p(H∣E) = p(H)	(neutrality);
E disconfirms H if and only if p(H∣E) < p(H)	(disconfirmation).

The above threefold classification of the confirmatory relations between H and E should be construed, to use the statistical jargon, as a qualitative ordinal variable.⁵ This means that the “intensity” of confirmation decreases in the shift from confirmation (in the narrow sense) to neutrality and from neutrality to disconfirmation. In other words, Qualitative Confirmation expresses an ordinal ranking (i.e., not a mere trichotomy) of different confirmation relations, which can then provide a foundation for an appropriate quantitative notion of confirmation. Many confirmation theorists would presumably agree that any such notion should be compatible with Qualitative Confirmation in the sense that it should extend the partial ordering just defined. We propose to formalize this requirement in terms of the following principle:

Compatibility (with qualitative confirmation). For any H1, H2 and E1, E2:

i) If E1 confirms H1 and E2 is neutral for H2, then $C (H 1, E 1) > C (H 2, E 2)$ ;
ii) If E1 is neutral for H1 and E2 disconfirms H2, then $C (H 1, E 1) > C (H 2, E 2)$ ;
iii) If E1 is neutral for H1 and E2 is neutral for H2, then $C (H 1, E 1) = C (H 2, E 2)$ .

As we will see in a moment, all incremental measures are indeed compatible with Qualitative Confirmation in the sense just specified; however, not all measures satisfying Compatibility need to be incremental.⁶ Accordingly, Compatibility is too weak a requirement to isolate incremental measures; other principles are needed, which will now be discussed.
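To make the three clauses of Compatibility concrete, the following minimal sketch checks them for the difference measure D on a handful of hand-picked probability pairs. The function names and the sample values are illustrative assumptions, not part of the article, and a finite check of this kind illustrates the principle rather than proving it.

```python
# Illustrative sketch: function names and sample values are assumptions,
# not taken from the article.

def qualitative(final: float, initial: float, tol: float = 1e-12) -> str:
    """Classify the relation between E and H from p(H|E) and p(H)."""
    if final > initial + tol:
        return "confirms"
    if final < initial - tol:
        return "disconfirms"
    return "neutral"

def D(final: float, initial: float) -> float:
    """Difference measure D(H, E) = p(H|E) - p(H)."""
    return final - initial

# Pairs (p(H|E), p(H)) chosen by hand for illustration.
confirming = (0.8, 0.5)
neutral_a = (0.3, 0.3)
neutral_b = (0.7, 0.7)
disconfirming = (0.2, 0.6)

# Clause (i): confirmation scores strictly above neutrality.
print(qualitative(*confirming), D(*confirming) > D(*neutral_a))
# Clause (ii): neutrality scores strictly above disconfirmation.
print(qualitative(*disconfirming), D(*neutral_a) > D(*disconfirming))
# Clause (iii): all neutral pairs receive the same value.
print(qualitative(*neutral_b), D(*neutral_a) == D(*neutral_b))
```

The same check can be run with the ratio measure R by replacing the body of `D` with `final / initial`; the neutral value is then 1 rather than 0.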

2.2. The Grammar of Incremental Measures of Confirmation

An incremental measure of confirmation C(H, E) may be informally described as a function of p(H∣E) and p(H) that increases when p(H∣E) increases. The grammar of incremental measures is currently being thoroughly investigated by various authors (see esp. Festa 1999, 2012; Fitelson 1999, 2007; Crupi et al. 2010; Brössel 2013; Hájek and Joyce 2013; Roche 2014; Roche and Shogenji 2014; Crupi 2015). Following Crupi (2015), we formally characterize incremental measures by means of the following basic principles, or axioms:

P1. Formality. There exists a function f such that, for any H and E, $C (H, E) = f [p (H ∣ E), p (H), p (E)]$ .
P2. Tautological evidence. For any H1 and H2, $C (H 1, ⊤) = C (H 2, ⊤)$ .
P3. Final probability. For any H, E1, E2, if $p (H ∣ E 1) ⪌ p (H ∣ E 2)$ , then $C (H, E 1) ⪌ C (H, E 2)$ .

Some remarks about the intuitive meaning of P1–P3 will be useful.⁷ P1 requires that C(H, E) depends only on p(H∣E), p(H), and p(E). Since this triple of probabilities entirely determines the probability distribution p over the algebra generated by H and E, P1 amounts to requiring that C(H, E) depends only on this distribution. P2 requires that all hypotheses receive the same degree of confirmation by the tautological evidence ⊤. Finally, P3 requires that the confirmation of a hypothesis is a strictly increasing function of its final probability.⁸

By definition, all incremental measures satisfy the basic principles P1–P3 and all the other principles that can be logically derived from them (some examples follow in a moment). Such (basic and derived) principles isolate the properties characterizing all incremental measures and are thus labeled “universal” principles. Yet, certain incremental measures may exhibit further interesting properties, which are specified by corresponding “structural” principles. Such principles are logically independent of the basic principles (i.e., they are neither entailed by nor incompatible with P1–P3) and characterize specific (classes of) incremental measures. Two examples of universal principles and one of a structural principle are given below.

Our first example of a universal principle is Compatibility. Indeed, one can check that the basic principles P1–P3 jointly entail Compatibility, that is, that any incremental measure is compatible with Qualitative Confirmation. Our second example is the following consequence of the basic principles, actually of P3 alone (cf. Crupi et al. 2010, theorem 3):

(IFPD) Initial and Final Probability Dependence. There exists a function g such that, for any H and E, $C (H, E) = g [p (H ∣ E), p (H)]$ .

The above principle requires that any incremental measure is uniquely determined by the initial and final probability of H, that is, that C(H, E) depends only on p(H∣E) and p(H). Note that IFPD is stronger than the basic principle P1 since the latter allows for the possibility that C(H, E) depends not only on p(H∣E) and p(H) but also on p(E).

It is worth noting that IFPD does not say anything on how C(H, E) should depend on the initial probability p(H). The reader familiar with traditional incremental measures like D and R might suspect that the relations between C(H, E) and p(H) should be ruled by the following condition:

(IP) Initial probability. For any H1, H2, and E if $p (H 1 ∣ E) = p (H 2 ∣ E)$ and $p (H 1) ⪋ p (H 2)$ , then $C (H 1, E) ⪌ C (H 2, E)$ .

Condition IP, which is, so to speak, the counterpart of principle P3 for final probability, requires that the confirmation of a hypothesis is a strictly decreasing function of its initial probability. As is easy to check, most traditional incremental measures satisfy IP. For this reason, one might feel that IP is a universal principle of incremental confirmation. That this is not the case is shown by the fact that there are incremental measures that satisfy all the basic principles P1–P3 but still violate IP. One example is the measure p(H)[p(H∣E) − p(H)], introduced by Crupi et al. (2010, 89, theorem 1). Thus, IP provides our first example of a structural principle of Bayesian confirmation.⁹
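A small worked example may help to see how the measure p(H)[p(H∣E) − p(H)] can violate IP. The numerical values below are illustrative assumptions (they are jointly coherent, for instance with p(E) = 0.2); they are not taken from the article.

```latex
% Illustrative values (assumed): equal final probabilities, different priors.
\begin{align*}
p(H_1 \mid E) &= p(H_2 \mid E) = 0.9, \qquad p(H_1) = 0.2 < p(H_2) = 0.4, \\
C(H_1, E) &= p(H_1)\,[\,p(H_1 \mid E) - p(H_1)\,] = 0.2 \times 0.7 = 0.14, \\
C(H_2, E) &= p(H_2)\,[\,p(H_2 \mid E) - p(H_2)\,] = 0.4 \times 0.5 = 0.20 .
\end{align*}
```

IP would require C(H1, E) > C(H2, E), since H1 has the lower initial probability; here the ordering is reversed, so IP fails for this pair.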

3. Strong and Weak Likelihood Principles for Bayesian Confirmation

As said in the introduction, likelihood principles require that certain relationships hold between C(H, E) and (one or both of) the likelihoods p(E∣H) and p(E∣¬H). In what follows, we consider one universal and five structural principles of this kind (throughout this section, unless stated otherwise, “principle(s)” will always mean “likelihood principle(s)”). To begin with, it is worth noting how the basic principle P1 can be reformulated, so to speak, in likelihoodist terms. To recall, P1 states that C(H, E) only depends on the probability distribution over the algebra generated by H and E. Such a distribution can be determined by different triples of probability values, [p(H∣E), p(H), p(E)] being only one of them. Any such triple can be used in P1 to make C(H, E) dependent on the relevant probabilities. For our purposes, the following logically equivalent reformulation of P1 will be useful:

(LF) Likelihood form. There exists a function k such that, for any H and E, $C (H, E) = k [p (E ∣ H), p (E ∣ \neg H), p (E)]$ .

LF says that C(H, E) can be expressed in “evidential terms,” that is, as a function of the initial probability p(E) of evidence E and of its conditional probabilities p(E∣H) and p(E∣¬H). The label for this principle is justified by noting that p(E) can be construed as the likelihood p(E∣⊤) of a tautological hypothesis. Accordingly, LF says that C(H, E) depends only on the three likelihoods p(E∣H), p(E∣¬H), and p(E).
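One way to see how the likelihood triple relates to the triple used in P1 is through the law of total probability and Bayes' theorem. The derivation below is a sketch offered for illustration; it assumes that the two likelihoods differ, so the neutral case, where they coincide, is not covered by these formulas.

```latex
\begin{align*}
p(E) &= p(E \mid H)\,p(H) + p(E \mid \neg H)\,[1 - p(H)] \\
\Longrightarrow\quad
p(H) &= \frac{p(E) - p(E \mid \neg H)}{p(E \mid H) - p(E \mid \neg H)}
  \qquad \text{(assuming } p(E \mid H) \neq p(E \mid \neg H)\text{)}, \\
p(H \mid E) &= \frac{p(E \mid H)\,p(H)}{p(E)} .
\end{align*}
```

Under that assumption, any function of [p(H∣E), p(H), p(E)] can be rewritten as a function of [p(E∣H), p(E∣¬H), p(E)].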

Given the reformulation of P1 as LF, a new likelihood principle can be immediately stated as follows:

(EL) Equal Likelihoods. For any E, H1, H2, if $p (E ∣ H 1) = p (E ∣ H 2)$ and $p (E ∣ \neg H 1) = p (E ∣ \neg H 2)$ , then $C (H 1, E) = C (H 2, E)$ .

Not surprisingly, given LF one can immediately check that:

Theorem 1. EL is a universal principle.

Admittedly, EL is an extremely weak principle, indeed a nearly trivial one. Still, we submit that it is the only universal likelihood principle of incremental confirmation. Our conjecture is supported by the fact that, as we will see in a moment, all likelihood principles considered so far in the literature are indeed structural principles.

Confirmation theorists have been discussing several likelihood principles apart from LF and EL (which, by the way, have both remained often unnoticed or implicit in the literature). Below we consider five such principles and illustrate their intuitive meaning. Afterward, we point out the logical relations occurring among them. Finally, we show that each of these principles is, indeed, a structural principle (i.e., that it is satisfied by some incremental measures and violated by others).

The best known likelihood principle is probably the so-called Law of Likelihood (LL), which is stated as follows:¹⁰

(LL) If $p (E ∣ H 1) ⪌ p (E ∣ H 2)$ , then $C (H 1, E) ⪌ C (H 2, E)$ .

This “law” expresses the intuition that E confirms H1 more than H2 just when H1 is better than H2 in predicting E (i.e., when E is more probable given H1 than given H2). Given LF, LL amounts to saying that C(H, E) does not depend on p(E∣¬H) and is a strictly increasing function of p(E∣H).
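As an illustration of the pattern LL describes, the ratio measure R from section 1 can be rewritten in likelihood terms by Bayes' theorem. This short derivation is a sketch added for illustration, not the article's own proof.

```latex
\begin{align*}
R(H, E) = \frac{p(H \mid E)}{p(H)}
        = \frac{p(E \mid H)\,p(H)}{p(E)\,p(H)}
        = \frac{p(E \mid H)}{p(E)} .
\end{align*}
```

For a fixed piece of evidence E, R is therefore strictly increasing in p(E∣H) and does not depend on p(E∣¬H).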

The above principle naturally suggests the following one, which, to the best of our knowledge, has never been considered in the literature:

(N) If $p (E ∣ \neg H 1) ⪋ p (E ∣ \neg H 2)$ , then $C (H 1, E) ⪌ C (H 2, E)$ .

The above principle can be seen as the “negative” counterpart of LL (hence the label). In fact, it is formally similar to LL in that it makes confirmation dependent only on one likelihood value; however, this is the likelihood of the negation of the relevant hypotheses, not of the hypotheses themselves as in the case of LL. More precisely, N expresses the intuition that E confirms H1 more than H2 just when ¬H1 is worse than ¬H2 in predicting E, that is, when E is less probable given ¬H1 than given ¬H2. Given LF, N amounts to saying that C(H, E) does not depend on p(E∣H) and that it is a strictly decreasing function of p(E∣¬H).

As anticipated in section 1, an important principle of Bayesian confirmation is the so-called Weak Law of Likelihood (WLL). Among the several versions of WLL introduced in the literature, we focus on the following version:¹¹

(WLL) If $p (E ∣ H 1) > p (E ∣ H 2)$ and $p (E ∣ \neg H 1) < p (E ∣ \neg H 2)$ , then $C (H 1, E) > C (H 2, E)$ .

The antecedent of WLL says that H1 predicts E “uniformly better” than H2 (this terminology is borrowed from Joyce [2008], sec. 3). This means that E is both more probable assuming that H1 is true than assuming that H2 is true and less probable assuming that H1 is false than assuming that H2 is false. Thus, WLL says that if H1 predicts E uniformly better than H2, then E confirms H1 more than H2. Given LF, WLL means that C(H, E) increases when the likelihood of H increases and the likelihood of ¬H decreases. Two structural likelihood principles closely related to WLL are as follows:

(WLL-L) If $p (E ∣ H 1) > p (E ∣ H 2)$ and $p (E ∣ \neg H 1) = p (E ∣ \neg H 2)$ , then $C (H 1, E) > C (H 2, E)$ ;

(WLL-N) If $p (E ∣ H 1) = p (E ∣ H 2)$ and $p (E ∣ \neg H 1) < p (E ∣ \neg H 2)$ , then $C (H 1, E) > C (H 2, E)$ .

WLL-L expresses the intuition that if H1 is better than H2 in predicting E and ¬H1 is as good as ¬H2 in predicting E, then E confirms H1 more than H2. Analogously, WLL-N says that if H1 is as good as H2 in predicting E and ¬H1 is worse than ¬H2 in predicting E, then E confirms H1 more than H2. Given LF, WLL-L says that, for fixed values of $p (E ∣ \neg H)$ , $C (H, E)$ is a strictly increasing function of p(E∣H), while WLL-N states that, for fixed values of $p (E ∣ H)$ , $C (H, E)$ is a strictly decreasing function of p(E∣¬H).

The five principles just introduced illustrate different ways in which C(H, E) may depend on one or both of the likelihoods. It is easy to check that such principles are genuinely different from each other, in the sense that no one of them is logically equivalent to another. However, they are by no means logically independent. The following theorem clarifies the relevant logical relations among the five principles (see also fig. 1, where these relations are graphically presented):

Theorem 2. The following logical relations hold:

i) LL entails WLL.
ii) N entails WLL.
iii) WLL-L & WLL-N entails WLL.
iv) LL and N are incompatible.
v) LL entails WLL-L.
vi) N entails WLL-N.

Theorem 2 maps the logical space of a fairly representative family of likelihood principles. As anticipated, no one of them is universal: each is structural, meaning that it is satisfied by some incremental measures and violated by others. To prove this, one only needs to consider two further measures apart from the difference and the ratio measures, D and R, introduced in section 1. The first is the “odds counterpart” of D, where the initial and final odds of H are defined as usual as $o (H) = p (H) / [1 - p (H)]$ and $o (H ∣ E) = p (H ∣ E) / [1 - p (H ∣ E)]$ :¹²

Odds difference. $OD (H, E) = o (H ∣ E) - o (H)$ .

The second, well-known measure was originally introduced by Gaifman (1979, 120):

Gaifman. $G (H, E) = [1 - p (H)] / [1 - p (H ∣ E)]$ .

Then one can check that all likelihood principles LL, N, WLL, WLL-L, and WLL-N are structural:

Theorem 3. Each of the likelihood principles LL, N, WLL, WLL-L, and WLL-N is satisfied by at least one of the incremental measures R, OD, and G and is violated by at least one of them.
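The following minimal sketch evaluates R, G, and OD on one pair of hypotheses that satisfies the antecedent of WLL-L. The helper names and the numerical values are illustrative assumptions; the proofs of the theorems are in the article's appendix, and a single instance can exhibit a violation but cannot establish that a principle holds in general.

```python
# Hypothetical helper names; x = p(E|H), y = p(E|not-H), e = p(E).

def prior(x: float, y: float, e: float) -> float:
    """p(H) recovered from the likelihood triple (requires x != y)."""
    return (e - y) / (x - y)

def posterior(x: float, y: float, e: float) -> float:
    """p(H|E) by Bayes' theorem."""
    return x * prior(x, y, e) / e

def odds(p: float) -> float:
    return p / (1.0 - p)

def R(x: float, y: float, e: float) -> float:
    """Ratio measure p(H|E) / p(H)."""
    return posterior(x, y, e) / prior(x, y, e)

def G(x: float, y: float, e: float) -> float:
    """Gaifman measure [1 - p(H)] / [1 - p(H|E)]."""
    return (1.0 - prior(x, y, e)) / (1.0 - posterior(x, y, e))

def OD(x: float, y: float, e: float) -> float:
    """Odds difference o(H|E) - o(H)."""
    return odds(posterior(x, y, e)) - odds(prior(x, y, e))

# Antecedent of WLL-L: p(E|H1) > p(E|H2) and p(E|not-H1) = p(E|not-H2).
e, y = 0.5, 0.25
h1 = (0.9, y, e)
h2 = (0.6, y, e)

for name, measure in (("R", R), ("G", G), ("OD", OD)):
    c1, c2 = measure(*h1), measure(*h2)
    print(f"{name}: C(H1,E) = {c1:.4f}, C(H2,E) = {c2:.4f}")
```

On these inputs R ranks H1 above H2, G assigns both hypotheses the same value up to rounding, and OD ranks H2 above H1, so G and OD each fail the consequent of WLL-L for this pair.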

Theorem 3 is not entirely surprising, since, for instance, LL is well known to be violated by many incremental measures, like D. Still, some other results above are probably much less expected. Let us consider, for instance, the fact that WLL is a structural principle (similar considerations apply to WLL-L and WLL-N). As many scholars have remarked, all traditional measures of confirmation in the literature do satisfy WLL (e.g., Fitelson 2007; Joyce 2008; Crupi et al. 2010; Roche and Shogenji 2014). Indeed, WLL appears to be such an undemanding principle that it has been often regarded as a very minimal adequacy condition for Bayesian confirmation.¹³ In view of this, the fact that WLL is not a universal principle has significant implications for the grammar of Bayesian confirmation; in the next section, we explore some of them.

Figure 1. Logical relations among the five likelihood principles discussed in this article. Arrows represent logical entailment; dashed lines, logical incompatibility (⊥); dotted lines, logical independence; and solid lines, conjunction (&).

4. Antilikelihood Principles for Bayesian Confirmation

In the previous section we argued that no likelihood principle—with the exception of the very weak principle EL—holds universally, that is, for all incremental measures as defined in section 2.2. Up to this point, however, the reader may still think that this finding is not dramatically interesting for the grammar of Bayesian confirmation. After all, the fact that WLL is not a universal principle may depend on the specific properties of “strange” measures like OD. Below, we show that, in contrast with this idea, there are intuitively well-motivated incremental measures that not only violate WLL (and other traditional likelihood principles) but indeed satisfy some “antilikelihood principles” that express intuitions diametrically opposed to the ones underlying WLL.

4.1. Confirmation as Reduction of Improbability

At the beginning of this article, we presented the central intuition concerning Bayesian confirmation by saying that evidence E confirms hypothesis H when E increases the initial probability of H. Such intuition, however, may be also expressed as follows: E confirms H when E decreases the initial improbability of H. In this way, confirmation is construed not as increment of probability but as reduction of improbability. Accordingly, there are two ways of thinking about an incremental measure C(H, E). The first is the familiar one for which C(H, E) is a function of p(H) and p(H∣E) that increases when p(H∣E) increases. The second is as follows: given suitable definitions of the initial and final improbability of H—denoted by imp(H) and imp(H∣E), respectively—an incremental measure C(H, E) is a function of imp(H) and imp(H∣E) that increases when imp(H∣E) decreases.

The notion of confirmation as reduction of improbability may appear as just a logically equivalent, but more cumbersome, rendition of the traditional one. Still, we submit, there are at least two reasons to consider the reduction-of-improbability intuition as a useful complement of the increment-of-probability intuition. First, the reduction-of-improbability intuition provides a new interpretation of some old incremental measures, where the new interpretation appears, in some cases, more appealing than that provided by the increment-of-probability intuition. Second, the reduction-of-improbability intuition is heuristically fruitful, in the sense that it immediately suggests new incremental measures that would be hardly thought of on the basis of the increment-of-probability intuition.

In what follows, we use the term “improbability” as a neutral one, to denote any measure that decreases when p(H) increases. To mention but three examples, the initial improbability of H may be defined as follows:

$\mathrm{imp}_{1} (H) = 1 - p (H) = p (\neg H)$ ;

$\mathrm{imp}_{2} (H) = \frac{1}{p (H)}$ ;

$\mathrm{imp}_{3} (H) = \frac{1}{o (H)} = o (\neg H)$ .

The final improbability of H, imp(H∣E), is defined in the same way, simply by substituting p(H∣E) and o(H∣E) for p(H) and o(H), respectively, in the definitions above.¹⁴

On the basis of such definitions, nearly all traditional incremental measures of confirmation can be taken as benchmarks to define new measures in terms of reduction of improbability. For instance, D and R suggest, respectively, the following triples of improbability difference measures ID(H, E) and of improbability ratio measures IR(H, E):

$\mathrm{ID}_{1} (H, E) = \mathrm{imp}_{1} (H) - \mathrm{imp}_{1} (H ∣ E)$ ;

$\mathrm{ID}_{2} (H, E) = \mathrm{imp}_{2} (H) - \mathrm{imp}_{2} (H ∣ E)$ ;

$\mathrm{ID}_{3} (H, E) = \mathrm{imp}_{3} (H) - \mathrm{imp}_{3} (H ∣ E)$ ;

$\mathrm{IR}_{1} (H, E) = \frac{\mathrm{imp}_{1} (H)}{\mathrm{imp}_{1} (H ∣ E)}$ ;

$\mathrm{IR}_{2} (H, E) = \frac{\mathrm{imp}_{2} (H)}{\mathrm{imp}_{2} (H ∣ E)}$ ;

$\mathrm{IR}_{3} (H, E) = \frac{\mathrm{imp}_{3} (H)}{\mathrm{imp}_{3} (H ∣ E)}$ .

It is not difficult to check that three of the above measures (i.e., ID₁, IR₁, and IR₂) are indeed identical to measures that we already know from the foregoing sections. Moreover, a fourth measure, IR₃, turns out to be the same as the very well-known odds ratio measure of confirmation OR (famously advocated by Good 1950). Summing up:

Theorem 4. For any H and E:

$\mathrm{ID}_{1} (H, E) = D (H, E)$ ;

$\mathrm{IR}_{1} (H, E) = G (H, E)$ ;

$\mathrm{IR}_{2} (H, E) = R (H, E)$ ;

$\mathrm{IR}_{3} (H, E) = \frac{o (H ∣ E)}{o (H)} = OR (H, E)$ .
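The four identities of Theorem 4 can be spot-checked numerically. The sketch below is only an illustration on a few assumed sample values; the dictionary `IMP`, the function names, and the samples are hypothetical choices, and a numerical check is not a substitute for the proof in the appendix.

```python
import math

def odds(p: float) -> float:
    return p / (1.0 - p)

# The three improbability notions imp_1, imp_2, imp_3 defined in the text.
IMP = {
    1: lambda p: 1.0 - p,
    2: lambda p: 1.0 / p,
    3: lambda p: 1.0 / odds(p),
}

def ID(k: int, final: float, initial: float) -> float:
    """Improbability difference: imp_k(H) - imp_k(H|E)."""
    return IMP[k](initial) - IMP[k](final)

def IR(k: int, final: float, initial: float) -> float:
    """Improbability ratio: imp_k(H) / imp_k(H|E)."""
    return IMP[k](initial) / IMP[k](final)

def D(final: float, initial: float) -> float:
    return final - initial

def G(final: float, initial: float) -> float:
    return (1.0 - initial) / (1.0 - final)

def R(final: float, initial: float) -> float:
    return final / initial

def OR(final: float, initial: float) -> float:
    return odds(final) / odds(initial)

def theorem4_holds(final: float, initial: float) -> bool:
    """Check the four identities at one point, up to floating-point error."""
    return (
        math.isclose(ID(1, final, initial), D(final, initial))
        and math.isclose(IR(1, final, initial), G(final, initial))
        and math.isclose(IR(2, final, initial), R(final, initial))
        and math.isclose(IR(3, final, initial), OR(final, initial))
    )

# Assumed sample pairs (p(H|E), p(H)) with nonextreme, non-neutral values.
samples = [(0.8, 0.5), (0.2, 0.6), (0.35, 0.3), (0.9, 0.1)]
all_hold = all(theorem4_holds(f, i) for f, i in samples)
print(all_hold)
```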

Thus, the reduction-of-improbability intuition leads, on the one hand, to a new interpretation of the four traditional incremental measures D, G, R, and OR. On the other hand, it also suggests two new measures, ID₂ and ID₃, which, given the definitions of imp₂(H) and imp₃(H), can be expressed as follows:

$\mathrm{ID}_{2} (H, E) = \frac{1}{p (H)} - \frac{1}{p (H ∣ E)}$ ;

$\mathrm{ID}_{3} (H, E) = \frac{1}{o (H)} - \frac{1}{o (H ∣ E)} = o (\neg H) - o (\neg H ∣ E)$ .

Note that both ID₂ and ID₃ express confirmation as the difference between the initial and final improbability of H, where improbability is defined as the inverse of either the relevant probabilities (for ID₂) or the corresponding odds (for ID₃). As we will see in a moment, measures ID₂ and ID₃ are genuinely new incremental measures of confirmation and display some interesting properties, especially as far as likelihood principles are concerned.¹⁵

4.2. The Normalized Difference Measure

To begin with, it is easy to check that the two measures ID₂ and ID₃ introduced in the previous section are indeed incremental measures (i.e., satisfy P1–P3 from sec. 2). Also, it is not difficult to see that they can be reformulated as follows:

$\mathrm{ID}_{2} (H, E) = \frac{p (H ∣ E) - p (H)}{p (H ∣ E) p (H)}$ ;

$\mathrm{ID}_{3} (H, E) = \frac{o (H ∣ E) - o (H)}{o (H ∣ E) o (H)}$ .

The above reformulation shows how both ID₂ and ID₃ can be construed, more traditionally, as increment-of-probability measures. In fact, ID₂ is the probability difference D normalized by the factor 1/[p(H∣E)p(H)], while ID₃ is the normalization of the odds difference OD by the factor 1/[o(H∣E)o(H)].

What is perhaps less obvious to note is that:

Theorem 5. For any H and E, ID₂(H, E) = ID₃(H, E).
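One short way to see why the two measures coincide, offered here as an illustrative sketch rather than as the article's own proof (which is in the appendix), uses the fact that the inverse odds equal the inverse probability minus one:

```latex
\begin{align*}
\frac{1}{o(H)} &= \frac{1 - p(H)}{p(H)} = \frac{1}{p(H)} - 1,
\qquad
\frac{1}{o(H \mid E)} = \frac{1}{p(H \mid E)} - 1, \\
\mathrm{ID}_{3}(H, E) &= \left[\frac{1}{p(H)} - 1\right] - \left[\frac{1}{p(H \mid E)} - 1\right]
 = \frac{1}{p(H)} - \frac{1}{p(H \mid E)} = \mathrm{ID}_{2}(H, E) .
\end{align*}
```

The constant terms cancel in the difference, which is why the same cancellation does not occur for the corresponding ratio measures IR₂ and IR₃.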

Thus, not only do ID₂ and ID₃ have the same form: they are indeed the same measure, which is invariant if expressed either as the normalized difference of the relevant probabilities or as the normalized difference of the corresponding odds. For this reason, we refer to this new measure as the “normalized (probability or odds) difference” measure, in symbols ND(H, E).

The ND measure has a number of interesting properties. For instance, consider the following, prima facie highly counterintuitive, principle:

(RWLL-N) If $p (E ∣ H 1) = p (E ∣ H 2)$ and $p (E ∣ \neg H 1) > p (E ∣ \neg H 2)$ , then $C (H 1, E) > C (H 2, E)$ .

Note that this principle is, so to speak, the reversal of principle WLL-N from section 3, in the sense that it is obtained from WLL-N by reversing the inequality sign in the second part of the antecedent (hence the label, which stands for “reversed WLL-N”).

Accordingly, RWLL-N says that if H1 is as good as H2 in predicting E and ¬H1 is better than ¬H2 in predicting E, then E confirms H1 more than H2. Given LF, RWLL-N amounts to saying that, for fixed values of p(E∣H), C(H, E) is an increasing function of p(E∣¬H). For this reason, RWLL-N can be called an antilikelihood principle of Bayesian confirmation.

Of course, RWLL-N is a structural principle, since it is violated by many measures of confirmation, including all the traditional ones. That this is the case can be seen by observing that:

Theorem 6. RWLL-N is incompatible with WLL-N.

Thus, all measures satisfying WLL-N will violate RWLL-N. Still, one can prove that:

Theorem 7. ND satisfies RWLL-N.

It immediately follows, due to the above illustrated incompatibility between RWLL-N and WLL-N, that:

Theorem 8. ND violates WLL-N.

Moreover, one can prove that ND also violates the Weak Law of Likelihood:

Theorem 9. ND violates WLL.

To sum up, we have shown that there is an intuitively motivated measure of confirmation as reduction of improbability, namely, ND, that violates WLL (and hence LL and N) and satisfies a strong antilikelihood principle, namely, RWLL-N.

5. Concluding Remarks

We have presented and discussed some more or less well-known likelihood principles for Bayesian confirmation. Given a very plausible characterization of incremental measures of Bayesian confirmation, all those principles (with the exception of EL) turned out to be independent of the basic conditions defining such measures. This implies that, as we have proven, each of those principles is satisfied by some incremental measures and violated by others. In turn, this means that none of the likelihood principles considered here can be taken as isolating a fundamental property of Bayesian confirmation. This is particularly true for the Weak Law of Likelihood, which is often regarded as a crucial ingredient of a Bayesian treatment of confirmation. To further show that this is not the case, we focused on the normalized difference measure ND, which “strongly” violates WLL and other likelihood conditions by satisfying what we called an antilikelihood principle of confirmation. Moreover, we showed that measure ND is both intuitively motivated (by construing confirmation as reduction of improbability) and formally equivalent to a normalization of such a traditional measure as D. For these reasons, it seems to us, little doubt exists that ND is a perfectly acceptable measure of confirmation.

One way to resist this conclusion would be as follows.¹⁶ Investigating the grammar of confirmation should aim at identifying intuitively plausible conditions on incremental measures. Now, one may argue, WLL clearly is one such condition, while RWLL-N is not. Accordingly, all measures that, like OD, fail to meet WLL should be rejected as inadequate explications of confirmation, and this is the case, a fortiori, for measures that, like ND, strongly violate WLL by meeting RWLL-N.

Here, we can just sketch two answers to this kind of objection. First, as a decades-long discussion on competing confirmation measures has shown, in this context intuitions are at least sometimes unreliable and typically insufficient as a guide to the choice of specific measures or conditions. For this reason, in the current article we aimed at a general characterization of (anti)likelihood principles, without committing ourselves from the beginning to one or more of them. That intuitively assessing the plausibility of different principles (and hence measures) may be problematic is shown by the following example. Which one is the most plausible likelihood principle among WLL, WLL-L, and WLL-N? Different answers to this question will exclude as inadequate different measures; for instance, G has to be excluded if one prefers WLL-L over WLL-N, while R has to go if the preference is reversed (see the appendix for relevant proofs). Still, given that all three conditions above are structural (nonuniversal) principles, it is far from obvious on which grounds the choice should be made. In short, it seems to us that, in light of theorem 3, the plausibility of WLL (or of any other structural principle) can hardly be taken for granted without providing some independent justification in its favor. (In this connection, it may be instructive to compare the task of justifying a universal principle like EL with that of motivating a structural one like WLL.)

Second, reasons to doubt the indisputableness of the intuitions underlying WLL and related principles are also suggested by current discussion of other adequacy requirements for confirmation measures apart from likelihood principles. As an example, let us consider the so-called Matthew principles for Bayesian confirmation studied by Festa (2012), Roche (2014), and Festa and Cevolani (2015). Two such principles read as follows:¹⁷

(MP) If $p (E ∣ H 1) = p (E ∣ H 2) > p (E)$ and $p (H 1) > p (H 2)$ , then $C (H 1, E) > C (H 2, E)$ ;

(RMP) If $p (E ∣ H 1) = p (E ∣ H 2) > p (E)$ and $p (H 1) > p (H 2)$ , then $C (H 1, E) < C (H 2, E)$ .

Both of the above principles concern two hypotheses H1 and H2 having different initial probability and such that H1 and H2 are “equally successful” in predicting E, in the sense that the likelihood of H1 and H2 on E is the same, and both hypotheses make the initial probability of E increase. Under these conditions, principle MP prescribes that an initially more probable hypothesis is more confirmed by its successes than a less probable one, while, according to RMP, an initially more probable hypothesis has to be less confirmed by its successes than a less probable one. Note that this latter condition has a distinctive Popperian flavor, leading one to prefer, in terms of confirmation, improbable (and, in this sense, highly informative) hypotheses over more probable ones.¹⁸

Not surprisingly, one can prove that different incremental measures satisfy MP while violating RMP and vice versa (see Festa [2012, sec. 3] and Festa and Cevolani [2015] for a systematic analysis). For instance, as far as measure ND is concerned, we prove in the appendix that it can be expressed in the following equivalent form:

ND (H, E) = \frac{p (E ∣ H) - p (E)}{p (H) p (E ∣ H)} .

From this formulation, it is immediately clear that, for fixed values of p(E) and p(E∣H) with p(E∣H) > p(E), ND decreases as p(H) increases. It follows that, if p(E∣H1) = p(E∣H2) > p(E), an initially more probable hypothesis is less confirmed by E than an initially less probable one. In other words, ND meets RMP and violates MP (cf. Festa 2012, theorem 2.i; Roche 2014, theorem 2*.a).
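As a small illustration of this point, the likelihood form of ND can be evaluated for two hypotheses that predict E equally well but differ in initial probability. This is a sketch under stated assumptions: the function name `nd` and all numerical values are hypothetical, chosen so that the implied values of p(E∣¬H) remain between 0 and 1.

```python
def nd(p_h: float, p_e: float, p_e_given_h: float) -> float:
    """ND(H, E) = [p(E|H) - p(E)] / [p(H) p(E|H)]."""
    return (p_e_given_h - p_e) / (p_h * p_e_given_h)


# Shared values for both hypotheses: p(E) = 0.4 and p(E|H1) = p(E|H2) = 0.6 > p(E).
p_e, p_e_given_h = 0.4, 0.6

nd_more_probable = nd(0.5, p_e, p_e_given_h)  # hypothetical p(H1) = 0.5
nd_less_probable = nd(0.3, p_e, p_e_given_h)  # hypothetical p(H2) = 0.3

# RMP requires the initially more probable hypothesis to be less confirmed.
print(nd_more_probable < nd_less_probable)
```

With these sample values the initially less probable hypothesis receives the higher ND score, in line with RMP.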

In sum, on the one hand measure ND meets both RWLL-N and RMP, and on the other hand it violates WLL, WLL-N, and MP. This fact alone suggests that intimate relations may obtain between (anti)likelihood principles and Matthew principles. Elsewhere (Festa and Cevolani 2015), we have shown that this is in fact the case and that some of the above principles are provably equivalent. Such equivalence results imply, in particular, that it is impossible to satisfy the intuitions underlying the weak laws of likelihood and, at the same time, a Popperian preference for improbable hypotheses as embodied in principle RMP.

In turn, this means that any argument in favor of RMP will immediately translate into an argument against WLL and other likelihood principles. Such an argument would thus provide an indirect justification of antilikelihood principles like RWLL-N. Indeed, a defense of RMP as a plausible principle of confirmation has been recently put forward by Festa (2012, sec. 3.3.2). If convincing, this argument would prove the intuitive plausibility of selected antilikelihood principles. In the end, it may well be that, as Roche (2015b, sec. 3.2 n. 17; italics added) puts it, “there is a sense of support [i.e., confirmation] on which any adequate support measure [i.e., confirmation measure] should fail to meet WLL.” Further research is needed to assess these preliminary results, which promise to open the way to a fruitful investigation of the deeper structure of Bayesian confirmation.

Appendix Proofs

We prove all the results presented in the article, except for those already proven elsewhere, for which we provided the relevant references in the text.

Theorem 1

EL is a universal principle. According to LF, C(H, E) can be expressed as a function of $p (E ∣ H)$, $p (E ∣ \neg H)$, and p(E). It immediately follows that, for any E, if $p (E ∣ H 1) = p (E ∣ H 2)$ and $p (E ∣ \neg H 1) = p (E ∣ \neg H 2)$, then $C (H 1, E) = C (H 2, E)$. In other words, LF entails EL, which is then a universal principle given that LF is equivalent to P1, which is a basic principle.

Theorem 2

For what concerns the logical relationships among the likelihood principles LL, N, WLL, WLL-L, and WLL-N introduced in section 3, we need to check the following cases.

LL implies WLL. Assume that WLL is false. This means that there are H1, H2, and E such that $p (E ∣ H 1) > p (E ∣ H 2)$ and $p (E ∣ \neg H 1) < p (E ∣ \neg H 2)$ while $C (H 1, E) \leq C (H 2, E)$, so that LL is violated. By contraposition, if LL holds, then also WLL holds.

N implies WLL. Assume that WLL is false. This means that there are H1, H2, and E such that $p (E ∣ H 1) > p (E ∣ H 2)$ and $p (E ∣ \neg H 1) < p (E ∣ \neg H 2)$ while $C (H 1, E) \leq C (H 2, E)$, so that N is violated. By contraposition, if N holds, then also WLL holds.

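The conjunction of WLL-L and WLL-N implies WLL. We prove that if WLL-L, WLL-N, and the antecedent of WLL (i.e., $p (E ∣ H 1) > p (E ∣ H 2)$ and $p (E ∣ \neg H 1) < p (E ∣ \neg H 2)$) hold, then the consequent of WLL also holds. Consider a hypothesis H3 such that $p (E ∣ H 3) = p (E ∣ H 2)$ and $p (E ∣ \neg H 3) = p (E ∣ \neg H 1)$. Since $p (E ∣ H 1) > p (E ∣ H 2)$, we have $p (E ∣ H 1) > p (E ∣ H 3)$ and $p (E ∣ \neg H 1) = p (E ∣ \neg H 3)$, which, by WLL-L, imply $C (H 1, E) > C (H 3, E)$. Moreover, we have both $p (E ∣ H 2) = p (E ∣ H 3)$ and $p (E ∣ \neg H 3) < p (E ∣ \neg H 2)$, which, by WLL-N, imply $C (H 3, E) > C (H 2, E)$. Finally, from $C (H 1, E) > C (H 3, E)$ and $C (H 3, E) > C (H 2, E)$ we conclude, by transitivity, that $C (H 1, E) > C (H 2, E)$, which is the consequent of WLL. Thus, WLL-L and WLL-N together imply WLL.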

LL and N are incompatible. We prove that the conjunction of LL and N leads to a contradiction. It is sufficient to consider two hypotheses H1 and H2 such that $p (E ∣ H 1) > p (E ∣ H 2)$ and $p (E ∣ \neg H 1) > p (E ∣ \neg H 2)$. Then it follows both that $C (H 1, E) > C (H 2, E)$ from LL and that $C (H 1, E) < C (H 2, E)$ from N, which is impossible.

LL implies WLL-L. Assume that WLL-L is false. This means that there are H1, H2, and E such that $p (E ∣ H 1) > p (E ∣ H 2)$ and $p (E ∣ \neg H 1) = p (E ∣ \neg H 2)$ while $C (H 1, E) \leq C (H 2, E)$, so that LL is violated. By contraposition, if LL holds, then also WLL-L holds.

N implies WLL-N. Assume that WLL-N is false. This means that there are H1, H2, and E such that $p (E ∣ H 1) = p (E ∣ H 2)$ and $p (E ∣ \neg H 1) < p (E ∣ \neg H 2)$ while $C (H 1, E) \leq C (H 2, E)$, so that N is violated. By contraposition, if N holds, then also WLL-N holds.

Theorem 3

We need to prove that all likelihood principles LL, N, WLL, WLL-L, and WLL-N are structural, that is, that each of them is satisfied by (at least) an incremental measure and violated by some other one. To this purpose, it is useful to consider the relevant measures expressed in “likelihood form,” that is, as functions of p(E∣H), p(E∣¬H), and p(E) only. In this way, it becomes immediately evident, for each of the five principles, which measures satisfy or violate it. The following lemma is easily proven:

Lemma 1. For any H and E:

R (H, E) = \frac{p (E ∣ H)}{p (E)};

G (H, E) = \frac{p (E)}{p (E ∣ \neg H)} .

Proof. As far as R is concerned: R(H, E) is defined as p(H∣E)/p(H), which, by Bayes’s theorem, is equal to $[p (H) p (E ∣ H) / p (H)] / p (E) = p (E ∣ H) / p (E)$ . For G, note that $G (H, E) = [1 - p (H)] / [1 - p (H ∣ E)]$ amounts to $p (\neg H) / p (\neg H ∣ E)$ , which, again by Bayes’s theorem, is equal to $p (\neg H) / [p (\neg H) p (E ∣ \neg H) / p (E)] = 1 / [p (E ∣ \neg H) / p (E)] = p (E) / p (E ∣ \neg H)$ . QED
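The two likelihood forms of lemma 1 can be illustrated on a concrete case. The following is a minimal sketch, assuming a hypothetical joint distribution over H and E (the four cell values are invented for illustration and are not taken from the article); it compares each measure as defined in terms of p(H) and p(H∣E) with its likelihood form.

```python
import math

# Hypothetical joint distribution over (H, E); the four cells sum to 1.
p_h_and_e, p_h_and_not_e = 0.2, 0.1
p_not_h_and_e, p_not_h_and_not_e = 0.3, 0.4

p_h = p_h_and_e + p_h_and_not_e            # p(H)
p_e = p_h_and_e + p_not_h_and_e            # p(E)
p_h_given_e = p_h_and_e / p_e              # p(H|E)
p_e_given_h = p_h_and_e / p_h              # p(E|H)
p_e_given_not_h = p_not_h_and_e / (1.0 - p_h)  # p(E|not-H)

# R: definition in terms of p(H|E) and p(H) versus its likelihood form.
r_definition = p_h_given_e / p_h
r_likelihood_form = p_e_given_h / p_e

# G: definition in terms of p(H|E) and p(H) versus its likelihood form.
g_definition = (1.0 - p_h) / (1.0 - p_h_given_e)
g_likelihood_form = p_e / p_e_given_not_h

print(math.isclose(r_definition, r_likelihood_form))
print(math.isclose(g_definition, g_likelihood_form))
```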

The two measures above are already sufficient to prove that four of our five principles (i.e., LL, N, WLL-L, and WLL-N) are structural. In fact, by inspection of the likelihood form of R and G above, one can easily check that:

LL is a structural principle. LL is satisfied by R and violated by G, since it is easy to see that, for any E, $p (E ∣ H 1) ⪌ p (E ∣ H 2)$ implies $R (H 1, E) ⪌ R (H 2, E)$ but not necessarily $G (H 1, E) ⪌ G (H 2, E)$ .

N is a structural principle. N is satisfied by G and violated by R, since it is easy to see that, for any E, $p (E ∣ \neg H 1) ⪋ p (E ∣ \neg H 2)$ implies $G (H 1, E) ⪌ G (H 2, E)$ but not necessarily $R (H 1, E) ⪌ R (H 2, E)$ .

WLL-L is a structural principle. WLL-L is satisfied by R and violated by G, since it is easy to see that, for any E, $p (E ∣ H 1) > p (E ∣ H 2)$ and $p (E ∣ \neg H 1) = p (E ∣ \neg H 2)$ implies $R (H 1, E) > R (H 2, E)$ but $G (H 1, E) = G (H 2, E)$ .

WLL-N is a structural principle. WLL-N is satisfied by G and violated by R, since it is easy to see that, for any E, $p (E ∣ H 1) = p (E ∣ H 2)$ and $p (E ∣ \neg H 1) < p (E ∣ \neg H 2)$ implies $G (H 1, E) > G (H 2, E)$ but $R (H 1, E) = R (H 2, E)$.

WLL is a structural principle. As far as WLL is concerned, it is satisfied by all the three measures above (and by many others, including D), as is easily seen again by inspection of their likelihood forms (see also Roche and Shogenji [2014, 119–20], for relevant discussion and proofs). The following counterexample shows, however, that measure OD violates WLL. Consider the following probability distribution over statements E, H1, H2: p(H1 & H2 & E) = 0.15, p(H1 & H2 & ¬E) = 0.05, p(H1 & ¬H2 & E) = 0.10, p(H1 & ¬H2 & ¬E) = 0.02, p(¬H1 & H2 & E) = 0.15, p(¬H1 & H2 & ¬E) = 0.18, p(¬H1 & ¬H2 & E) = 0.05, p(¬H1 & ¬H2 & ¬E) = 0.30. It can then be computed that p(E∣H1) = 0.78 > 0.57 = p(E∣H2) and p(E∣¬H1) = 0.29 < 0.32 = p(E∣¬H2) but OD(H1, E) = 0.78 < 0.87 = OD(H2, E), contrary to WLL, which would require OD(H1, E) > OD(H2, E).
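The figures in this counterexample can be recomputed directly from the joint distribution. The sketch below is illustrative only: the helper names (`prob`, `cond`, `od`) and the encoding of the distribution as a dictionary are assumptions made here, and OD is taken to be the odds difference o(H∣E) − o(H).

```python
NAMES = ("h1", "h2", "e")  # position of each truth value within a world

# Joint distribution from the counterexample, keyed by (H1, H2, E).
JOINT = {
    (True, True, True): 0.15, (True, True, False): 0.05,
    (True, False, True): 0.10, (True, False, False): 0.02,
    (False, True, True): 0.15, (False, True, False): 0.18,
    (False, False, True): 0.05, (False, False, False): 0.30,
}


def prob(**fixed: bool) -> float:
    """Sum the joint over all worlds matching the fixed truth values."""
    return sum(
        p for world, p in JOINT.items()
        if all(world[NAMES.index(name)] == value for name, value in fixed.items())
    )


def cond(target: dict, given: dict) -> float:
    """Conditional probability p(target | given)."""
    return prob(**target, **given) / prob(**given)


def odds(p: float) -> float:
    return p / (1.0 - p)


def od(hyp: str) -> float:
    """Odds difference OD(H, E) = o(H|E) - o(H)."""
    return odds(cond({hyp: True}, {"e": True})) - odds(prob(**{hyp: True}))


likelihoods = {h: cond({"e": True}, {h: True}) for h in ("h1", "h2")}
neg_likelihoods = {h: cond({"e": True}, {h: False}) for h in ("h1", "h2")}
od_values = {h: od(h) for h in ("h1", "h2")}

for h in ("h1", "h2"):
    print(h, round(likelihoods[h], 2), round(neg_likelihoods[h], 2),
          round(od_values[h], 2))
```

Run as written, the script should print values that round to those reported in the text, with the antecedent of WLL holding while OD ranks H2 above H1.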

This completes the proofs concerning the five likelihood principles of figure 1 from section 3. The following results concern the measures of confirmation as reduction of improbability from section 4.

Theorem 4

Given the definitions of the improbability measures imp₁(H), imp₂(H), and imp₃(H), the following equalities are easily derived:

{ID}_{1} (H, E) = {imp}_{1} (H) - {imp}_{1} (H ∣ E) = 1 - p (H) - [1 - p (H ∣ E)] = p (H ∣ E) - p (H) = D (H, E);

{ID}_{2} (H, E) = {imp}_{2} (H) - {imp}_{2} (H ∣ E) = \frac{1}{p (H)} - \frac{1}{p (H ∣ E)};

{ID}_{3} (H, E) = {imp}_{3} (H) - {imp}_{3} (H ∣ E) = \frac{1}{o (H)} - \frac{1}{o (H ∣ E)} = \frac{p (\neg H)}{p (H)} - \frac{p (\neg H ∣ E)}{p (H ∣ E)} = o (\neg H) - o (\neg H ∣ E);

{IR}_{1} (H, E) = \frac{{imp}_{1} (H)}{{imp}_{1} (H ∣ E)} = \frac{1 - p (H)}{1 - p (H ∣ E)} = G (H, E);

{IR}_{2} (H, E) = \frac{{imp}_{2} (H)}{{imp}_{2} (H ∣ E)} = \frac{1 / p (H)}{1 / p (H ∣ E)} = R (H, E);

{IR}_{3} (H, E) = \frac{{imp}_{3} (H)}{{imp}_{3} (H ∣ E)} = \frac{1 / o (H)}{1 / o (H ∣ E)} = \frac{o (H ∣ E)}{o (H)} = OR (H, E) .
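The pattern behind these equalities, namely building a confirmation measure as the difference or the ratio of initial and final improbability, can be sketched generically. The code below is an illustration under stated assumptions: the function names are hypothetical, and the prior and posterior values are arbitrary sample numbers.

```python
import math


def odds(p: float) -> float:
    return p / (1.0 - p)


# The three improbability measures used in the equalities above.
def imp1(p: float) -> float:
    return 1.0 - p


def imp2(p: float) -> float:
    return 1.0 / p


def imp3(p: float) -> float:
    return 1.0 / odds(p)


def reduction_by_difference(imp, prior: float, posterior: float) -> float:
    """ID-style measure: imp(H) - imp(H|E)."""
    return imp(prior) - imp(posterior)


def reduction_by_ratio(imp, prior: float, posterior: float) -> float:
    """IR-style measure: imp(H) / imp(H|E)."""
    return imp(prior) / imp(posterior)


prior, posterior = 0.3, 0.4  # hypothetical p(H) and p(H|E)

# Traditional measures written directly from their usual definitions.
d_measure = posterior - prior
g_measure = (1.0 - prior) / (1.0 - posterior)
r_measure = posterior / prior
or_measure = odds(posterior) / odds(prior)

print(math.isclose(reduction_by_difference(imp1, prior, posterior), d_measure))
print(math.isclose(reduction_by_ratio(imp1, prior, posterior), g_measure))
print(math.isclose(reduction_by_ratio(imp2, prior, posterior), r_measure))
print(math.isclose(reduction_by_ratio(imp3, prior, posterior), or_measure))
```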

Theorem 5

As for measures ID₂ and ID₃ above, it is immediate to check that

{ID}_{2} (H, E) = \frac{1}{p (H)} - \frac{1}{p (H ∣ E)} = \frac{p (H ∣ E) - p (H)}{p (H ∣ E) p (H)};

{ID}_{3} (H, E) = \frac{1}{o (H)} - \frac{1}{o (H ∣ E)} = \frac{o (H ∣ E) - o (H)}{o (H ∣ E) o (H)} .

Moreover, recalling that $o (H) = p (H) / [1 - p (H)]$ and $o (H ∣ E) = p (H ∣ E) / [1 - p (H ∣ E)]$, one can easily prove that ID₂ and ID₃ are the same measure. In fact, $o (H ∣ E) - o (H) = \{p (H ∣ E) [1 - p (H)] - p (H) [1 - p (H ∣ E)]\} / \{[1 - p (H)] [1 - p (H ∣ E)]\} = [p (H ∣ E) - p (H)] / \{[1 - p (H)] [1 - p (H ∣ E)]\}$. But $o (H ∣ E) o (H) = [p (H) p (H ∣ E)] / \{[1 - p (H)] [1 - p (H ∣ E)]\}$. Hence, ${ID}_{3} = [o (H ∣ E) - o (H)] / [o (H ∣ E) o (H)] = [p (H ∣ E) - p (H)] / [p (H) p (H ∣ E)] = {ID}_{2}$.

To see that ID₂ (or ID₃) is an incremental measure, it is sufficient to check that ${ID}_{2} (H, E) = 1 / p (H) - 1 / p (H ∣ E)$ satisfies P1 (by definition), P2 (since ID₂(H, ⊤) = 0 for any H), and P3 (since, for any given H, ID₂(H, E) increases as p(H∣E) increases). So ID₂ is an incremental measure, as well as ID₃.

Theorem 6

To prove that WLL-N and RWLL-N are incompatible, we show that their conjunction leads to a contradiction. It is sufficient to consider two hypotheses H1 and H2 such that $p (E ∣ H 1) = p (E ∣ H 2)$ and $p (E ∣ \neg H 1) < p (E ∣ \neg H 2)$. Then it follows both that $C (H 1, E) > C (H 2, E)$ from WLL-N and that $C (H 2, E) > C (H 1, E)$ from RWLL-N, which is impossible.

It remains to prove that measure ND meets RWLL-N and violates WLL-N and WLL; the following lemmas will be useful in the proof.

Lemma 2. For any H and E such that E is not neutral for H:

i) $p (H) = [p (E) - p (E ∣ \neg H)] / [p (E ∣ H) - p (E ∣ \neg H)]$ ;
ii) $p (H ∣ E) = {[p (E) - p (E ∣ \neg H)] / [p (E ∣ H) - p (E ∣ \neg H)]} \times [p (E ∣ H) / p (E)]$ .

Proof. (i) The proof starts from the “law of total probability” according to which $p (E) = p (H) p (E ∣ H) + p (\neg H) p (E ∣ \neg H)$ . It then follows that $p (E) = p (H) p (E ∣ H) + [1 - p (H)] p (E ∣ \neg H) = p (H) p (E ∣ H) + p (E ∣ \neg H) - p (H) p (E ∣ \neg H) = p (H) [p (E ∣ H) - p (E ∣ \neg H)] + p (E ∣ \neg H)$ . This latter equality implies $p (H) = [p (E) - p (E ∣ \neg H)] / [p (E ∣ H) - p (E ∣ \neg H)]$ . (ii) As far as p(H∣E) is concerned, from Bayes’s theorem we have $p (H ∣ E) = p (H) [p (E ∣ H) / p (E)]$ and from the equality just obtained above we immediately derive that $p (H ∣ E) = {[p (E) - p (E ∣ \neg H)] / [p (E ∣ H) - p (E ∣ \neg H)]} \times [p (E ∣ H) / p (E)]$ . QED
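Both parts of lemma 2 can be checked on a concrete case. The sketch below assumes a hypothetical joint distribution over H and E (the cell values are invented for illustration) and recovers p(H) and p(H∣E) from p(E), p(E∣H), and p(E∣¬H) alone.

```python
import math

# Hypothetical joint distribution over (H, E); the four cells sum to 1.
p_h_and_e, p_h_and_not_e = 0.2, 0.1
p_not_h_and_e, p_not_h_and_not_e = 0.3, 0.4

# Quantities read directly off the joint distribution.
p_h = p_h_and_e + p_h_and_not_e
p_e = p_h_and_e + p_not_h_and_e
p_h_given_e = p_h_and_e / p_e
p_e_given_h = p_h_and_e / p_h
p_e_given_not_h = p_not_h_and_e / (1.0 - p_h)

# Lemma 2(i): p(H) expressed through the likelihoods and p(E).
p_h_from_lemma = (p_e - p_e_given_not_h) / (p_e_given_h - p_e_given_not_h)

# Lemma 2(ii): p(H|E) obtained by multiplying by p(E|H) / p(E).
p_h_given_e_from_lemma = p_h_from_lemma * (p_e_given_h / p_e)

print(math.isclose(p_h, p_h_from_lemma))
print(math.isclose(p_h_given_e, p_h_given_e_from_lemma))
```

Note that the formula in (i) divides by p(E∣H) − p(E∣¬H), which is why the lemma is restricted to cases in which E is not neutral for H.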

Lemma 2 allows us to study how, for fixed values of p(E) and $p (E ∣ H)$ , $p (H) = [p (E) - p (E ∣ \neg H)] / [p (E ∣ H) - p (E ∣ \neg H)]$ varies with respect to $p (E ∣ \neg H)$ :

Lemma 3. For any H and E such that E is not neutral for H, for fixed values of p(E) and p(E∣H):

i) if E confirms H, p(H) decreases as p(E∣¬H) increases;
ii) if E disconfirms H, p(H) increases as p(E∣¬H) increases.

Proof. From lemma 2, we know that $p (H) = [p (E) - p (E ∣ \neg H)] / [p (E ∣ H) - p (E ∣ \neg H)]$ when E is not neutral for H. We rewrite p(H) as a function of variables $x = p (E)$ , $y = p (E ∣ H)$ , and $z = p (E ∣ \neg H)$ as follows:

p (H) = f (x, y, z) = \frac{x - z}{y - z} .

We then study how f varies as z increases. To this purpose, we calculate the partial derivative of f with respect to z by applying the basic (quotient and difference) rules of the calculus:

\begin{aligned} f_{z} \left(\frac{x - z}{y - z}\right) & = \frac{f_{z} (x - z) \times (y - z) - (x - z) \times f_{z} (y - z)}{{(y - z)}^{2}} \\ & = \frac{(f_{z} (x) - f_{z} (z)) \times (y - z) - (x - z) \times (f_{z} (y) - f_{z} (z))}{{(y - z)}^{2}} \\ & = \frac{(- 1) \times (y - z) - (x - z) \times (- 1)}{{(y - z)}^{2}} \\ & = \frac{x - y}{{(y - z)}^{2}} . \end{aligned}

Since the denominator of the above equation is always positive, the partial derivative of p(H) has the same sign as the numerator (x − y). Recalling that x = p(E) and y = p(E∣H), we thus obtain (i) if E confirms H, then $p (E ∣ H) = y > x = p (E)$ and hence the partial derivative of p(H) is negative; (ii) if E disconfirms H, then $p (E ∣ H) = y < x = p (E)$ and hence the partial derivative of p(H) is positive. In sum: for fixed values of p(E) and p(E∣H), p(H) decreases as p(E∣¬H) increases, if E confirms H, and p(H) increases as p(E∣¬H) increases, if E disconfirms H. QED
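The sign analysis of lemma 3 can be cross-checked by comparing the analytic derivative with a finite-difference approximation. This is a rough sketch: the function names, the step size, and the two sample points (one confirming, one disconfirming case) are hypothetical choices.

```python
import math


def prior_from_likelihoods(x: float, y: float, z: float) -> float:
    """p(H) = (x - z) / (y - z), with x = p(E), y = p(E|H), z = p(E|not-H)."""
    return (x - z) / (y - z)


def analytic_slope(x: float, y: float, z: float) -> float:
    """Partial derivative of p(H) with respect to z: (x - y) / (y - z)^2."""
    return (x - y) / (y - z) ** 2


def numeric_slope(x: float, y: float, z: float, step: float = 1e-6) -> float:
    """Central finite-difference approximation of the same derivative."""
    upper = prior_from_likelihoods(x, y, z + step)
    lower = prior_from_likelihoods(x, y, z - step)
    return (upper - lower) / (2.0 * step)


confirming = (0.5, 0.8, 0.2)      # p(E|H) > p(E): E confirms H
disconfirming = (0.5, 0.2, 0.7)   # p(E|H) < p(E): E disconfirms H

for label, point in (("confirming", confirming), ("disconfirming", disconfirming)):
    print(label, analytic_slope(*point), numeric_slope(*point))
```

In the confirming case the slope is negative and in the disconfirming case it is positive, matching clauses (i) and (ii) of the lemma.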

Coming now back to ND, we note that, for any H and E, ND(H, E) can be expressed as a function of p(E), p(E∣H), and p(H):

ND (H, E) = \frac{1}{p (H)} - \frac{1}{p (H ∣ E)} = \frac{1}{p (H)} - \frac{p (E)}{p (H) p (E ∣ H)} = \frac{p (E ∣ H) - p (E)}{p (H) p (E ∣ H)} .

Moreover:

Lemma 4. For fixed values of p(E) and p(E∣H), ND is an increasing function of p(E∣¬H).

Proof. Given lemma 3, we can distinguish two cases. If E confirms H, p(E∣H) − p(E) and hence ND(H, E) is positive; moreover, as p(E∣¬H) increases, p(H) decreases and hence ND(H, E) increases. If E disconfirms H, p(E∣H) − p(E) and hence ND(H, E) is negative; moreover, as p(E∣¬H) increases, p(H) increases, the absolute value of ND(H, E) decreases, and hence ND(H, E) increases. In sum, for fixed values of p(E) and p(E∣H), ND(H, E) is an increasing function of p(E∣¬H). QED
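Lemma 4 can be illustrated by sweeping p(E∣¬H) while holding p(E) and p(E∣H) fixed. The following is a sketch under stated assumptions: the function names and the sample grids are hypothetical, and p(H) is recovered through lemma 2(i).

```python
def prior_from_likelihoods(p_e: float, p_e_h: float, p_e_not_h: float) -> float:
    """Lemma 2(i): p(H) = [p(E) - p(E|not-H)] / [p(E|H) - p(E|not-H)]."""
    return (p_e - p_e_not_h) / (p_e_h - p_e_not_h)


def nd(p_e: float, p_e_h: float, p_e_not_h: float) -> float:
    """ND(H, E) = [p(E|H) - p(E)] / [p(H) p(E|H)], with p(H) from lemma 2(i)."""
    p_h = prior_from_likelihoods(p_e, p_e_h, p_e_not_h)
    return (p_e_h - p_e) / (p_h * p_e_h)


def is_strictly_increasing(values: list) -> bool:
    return all(a < b for a, b in zip(values, values[1:]))


# Confirming case: p(E) = 0.5, p(E|H) = 0.8, so p(E|not-H) must lie below p(E).
confirming_nd = [nd(0.5, 0.8, z) for z in (0.1, 0.2, 0.3, 0.4)]

# Disconfirming case: p(E) = 0.5, p(E|H) = 0.2, so p(E|not-H) must lie above p(E).
disconfirming_nd = [nd(0.5, 0.2, z) for z in (0.6, 0.7, 0.8, 0.9)]

print(is_strictly_increasing(confirming_nd))
print(is_strictly_increasing(disconfirming_nd))
```

In both sweeps ND grows with p(E∣¬H): it moves further above zero in the confirming case and closer to zero from below in the disconfirming case.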

Given the above results, we can then prove the following results about ND.

Theorem 7

ND meets RWLL-N. From lemma 4, we know that, for fixed values of p(E) and p(E∣H), ND increases as p(E∣¬H) increases. It follows that p(E∣H1) = p(E∣H2) and p(E∣¬H1) > p(E∣¬H2) imply ND(H1, E) > ND(H2, E), so that RWLL-N is satisfied.

Theorem 8

ND violates WLL-N. This follows immediately from theorems 6 and 7, since ND meets RWLL-N, which is incompatible with WLL-N.

Theorem 9

ND violates WLL. A counterexample will be sufficient to prove this. Consider the following probability distribution over statements E, H1, H2: p(H1 & H2 & E) = 0.03, p(H1 & H2 & ¬E) = 0.03, p(H1 & ¬H2 & E) = 0.10, p(H1 & ¬H2 & ¬E) = 0.15, p(¬H1 & H2 & E) = 0.01, p(¬H1 & H2 & ¬E) = 0.03, p(¬H1 & ¬H2 & E) = 0.25; p(¬H1 & ¬H2 & ¬E) = 0.40. It can then be computed that p(E∣H1) ≃ 0.42 > 0.4 = p(E∣H2) and p(E∣¬H1) ≃ 0.38 < 0.39 ≃ p(E∣¬H2), but ND(H1, E) ≃ 0.23 < 0.25 = ND(H2, E), contrary to WLL, which would require ND(H1, E) > ND(H2, E).
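As with the counterexample for OD, these figures can be recomputed from the joint distribution. The sketch below is illustrative only: the helper names and the dictionary encoding are assumptions made here, and ND is computed as 1/p(H) − 1/p(H∣E).

```python
NAMES = ("h1", "h2", "e")  # position of each truth value within a world

# Joint distribution from the counterexample, keyed by (H1, H2, E).
JOINT = {
    (True, True, True): 0.03, (True, True, False): 0.03,
    (True, False, True): 0.10, (True, False, False): 0.15,
    (False, True, True): 0.01, (False, True, False): 0.03,
    (False, False, True): 0.25, (False, False, False): 0.40,
}


def prob(**fixed: bool) -> float:
    """Sum the joint over all worlds matching the fixed truth values."""
    return sum(
        p for world, p in JOINT.items()
        if all(world[NAMES.index(name)] == value for name, value in fixed.items())
    )


def cond(target: dict, given: dict) -> float:
    """Conditional probability p(target | given)."""
    return prob(**target, **given) / prob(**given)


def nd(hyp: str) -> float:
    """Normalized difference ND(H, E) = 1/p(H) - 1/p(H|E)."""
    return 1.0 / prob(**{hyp: True}) - 1.0 / cond({hyp: True}, {"e": True})


likelihoods = {h: cond({"e": True}, {h: True}) for h in ("h1", "h2")}
neg_likelihoods = {h: cond({"e": True}, {h: False}) for h in ("h1", "h2")}
nd_values = {h: nd(h) for h in ("h1", "h2")}

for h in ("h1", "h2"):
    print(h, round(likelihoods[h], 2), round(neg_likelihoods[h], 2),
          round(nd_values[h], 2))
```

The antecedent of WLL holds for this distribution, yet ND assigns the higher value to H2, as the theorem states.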

Footnotes

†

We would like to thank Vincenzo Crupi, Theo Kuipers, and Luca Tambolo for helpful comments on a previous draft of this article. Financial support from the PRIN grant Models and Inferences in Science (20122T3PTZ), from the FIRB project Structures and Dynamics of Knowledge and Cognition (Turin unit: D11J12000470001), and from the University of Turin and the Compagnia San Paolo project Assessing Information Models: Exploring Theories and Applications of Optimal Information Search (D16D15000190005) is gratefully acknowledged.

1. See Edwards (1972) and Royall (1997) for classical statements of likelihoodism and Sober (1990, 2008), Fitelson (2007), and Joyce (2008, sec. 3) for relevant discussion.

2. We thank an anonymous reviewer for pressing us to make this point explicit. The reviewer also correctly notes that what we call “universal” properties of confirmation, i.e., the properties characterizing all incremental measures in our sense, are not necessarily satisfied by all measures called “incremental” in the literature.

3. This is a widespread restriction in the literature on incremental confirmation (e.g., Hájek and Joyce 2013, 119ff.; Crupi 2015, sec. 3.1). Concerning hypotheses with zero initial probability, one reason to ignore them is that, if p(H) is 0 then, for any E, also p(H∣E) is 0 (if it is defined at all). Hence, according to the most common understanding of incremental confirmation mentioned above, H would be equally confirmed by any evidence whatsoever. A number of scholars, including an anonymous reviewer, assume that many scientific hypotheses, such as universal generalizations and point estimates, have zero initial probability; if this is true, then ignoring the case p(H) = 0 is at least problematic. This well-known limitation of the standard notion of incremental confirmation has raised a great deal of discussion (see, e.g., Festa 1999; Kuipers 2000, 44ff.); an assessment of this issue is, however, beyond the scope of this article.

4. This is the central intuition underlying the “relevance” notion of confirmation, i.e., the view of confirmation as “increase in firmness” as distinguished from confirmation as “firmness” (in the terminology of Carnap 1950/1962, xvi; see also Salmon 1975). Following the main thread of contemporary discussions on Bayesian confirmation, we focus here on relevance confirmation and ignore confirmation as firmness (see, e.g., Crupi 2015, sec. 3).

5. According to Hildebrand, Laing, and Rosenthal (1977, 8) a qualitative ordinal variable is a kind of qualitative variable with “a set of mutually exclusive states [that] are ordered or ranked in terms of the alternative amounts or degrees of intensity that the states represent.”

6. Two examples are given below, see n. 8.

7. With minor differences, Formality, Tautological evidence, and Final probability appear as postulates P0, P3, and P1, respectively, of Crupi (2015, sec. 3). A slightly different axiomatic characterization of incremental measures appears in Crupi et al. (2010, sec. 7.3).

8. While quite undemanding, P1–P3 are already sufficient to exclude some well-known measures of Bayesian confirmation (cf. Crupi et al. 2010, 81; Roche 2014, n. 2). To mention but two examples, the measure p(E∣H) − p(E), defended by both Mortimer (1988) and Kuipers (2000, 50), and the measure p(E∣H) − p(E∣¬H) proposed by Nozick (1981, 252) are not incremental measures in our sense, since (as it is easy to check) they satisfy P1 and P2 but violate P3. But both these measures (as well as other nonincremental measures) do satisfy Compatibility and hence are, at least in this sense, adequate quantitative counterparts of the qualitative notion of confirmation.

9. Of course, one might very well assume IP as one of the basic (and hence universal) principles for incremental measures. This is the choice, e.g., of Crupi et al. (2010) and Festa (2012), where this principle appears, in a slightly different form, under the label Initial Probability Incrementality (IPI).

10. The label “law of likelihood” comes from Hacking (1965); further references are to be found in Crupi, Chater, and Tentori (2013, 193). For discussions of this principle, see, e.g., Fitelson (2007), Joyce (2008, sec. 3), and Crupi (2015, sec. 3.4).

11. See, e.g., Roche and Shogenji (2014, 119). In the literature, WLL often appears in slightly different, and stronger, forms than the one above (see, e.g., Fitelson 2007; Joyce 2008, sec. 3). Roche and Shogenji (2014) compare different forms of this principle; their discussion partially overlaps with ours below.

12. While not so well known as an incremental measure, OD has appeared before in the literature, e.g., in Festa (1999, 59), Joyce (2008, sec. 2, table 1), Crupi et al. (2010, 76), and Hájek and Joyce (2013, 122). Note that both OD and Gaifman’s measure G are undefined when p(H∣E) = 1; in this case, one can stipulate that their value is equal to +∞ (see, e.g., Brössel 2013, 382; Glass and McCartney 2014, 62 n. 4). Similar issues arise, with some of the measures introduced in the next sections, when not only p(H∣E) but also p(E∣H) and p(E∣¬H) assume extreme values (0 or 1). In order to guarantee mathematical definiteness, one can adopt an appropriate stipulation along the same lines as in the case mentioned above.

13. Considering slightly stronger variants of WLL, Fitelson (2007, 479), e.g., says that WLL is “a crucial common feature of all Bayesian conceptions of relational [i.e., relevance] confirmation,” and Joyce (2008, sec. 3) further argues that WLL must be “an integral part of any account of evidential relevance that deserves the title ‘Bayesian.’” In this connection, it is worth noting that incremental measures violating WLL have previously appeared in the literature, if only occasionally. For instance, one can check that the following continuum of incremental measures, $D^{x} (H, E) = p (H ∣ E)^{x} - p (H)^{x}$, with x greater than 1, leads to violations of WLL. Crupi et al. (2010, 91–92) and Roche and Shogenji (2014, 121) discuss, respectively, the special cases of D¹⁰ and D². As pointed out by an anonymous reviewer for this journal, WLL-N has been recently discussed (under the label of “criterion C5”) by Glass and McCartney (2014, sec. 3), who also note that WLL-N is violated by at least one incremental measure, thus being a structural principle in our sense.

14. See, e.g., Bar-Hillel and Carnap (1953, 149) and Popper (1959, apps. 7–9) for classical appearances of imp₁. Measure imp₂ is a natural alternative, and imp₃ is simply the counterpart of imp₂ in terms of odds. These measures of improbability, and other similar measures, have been considered in many fields of philosophy of science and statistics under various labels, such as “information,” “content,” “informative content,” “uncertainty,” and “logical strength”: for a useful survey, see Crupi and Tentori (2014).

15. Note that measures ID₂ and ID₃ are new in the sense that they are not reducible to other more traditional confirmation measures. Still, essentially identical measures have occasionally appeared in the most recent literature, e.g., in Festa (2012, 93) and Roche (2014, 95; 2015b, sec. 2).

16. We thank an anonymous referee for raising the objection and prompting us to discuss this point.

17. Labels MP and RMP, which stand respectively for “Matthew effect for positive confirmation [i.e., confirmation in narrow sense]” and “Reversed Matthew effect for positive confirmation” are adapted from Festa (2012, 95). As Kuipers (2000, 25) notes, principle MP “may be seen as a methodological version of the so-called Matthew effect, according to which the rich profit more than the poor” since “a more probable hypothesis profits more than a less probable one” from its successes, along the lines of the evangelical statement “For unto every one that hath shall be given, and he shall have abundance: but from him that hath not shall be taken away even that which he hath” (Gospel of Matthew 13:12). Note that Kuipers himself would not accept MP, since he favors the idea that, if H1 and H2 are equally successful with respect to E, they should be equally confirmed by E (a condition that Festa calls “Matthew independence for positive confirmation”).

18. In this connection, it may be interesting to note that, somehow paradoxically, Popper’s own corroboration measure meets MP while violating RMP (cf. Festa 2012, 97, theorem 1.ii; Roche 2014, 99, theorem 1*.b; for discussion, see Festa and Cevolani 2015).

References

Bar-Hillel, Yehoshua, and Carnap, Rudolf. 1953. “Semantic Information.” British Journal for the Philosophy of Science 4:147–57.CrossRef Google Scholar

Brössel, Peter. 2013. “The Problem of Measure Sensitivity Redux.” Philosophy of Science 80:378–97.CrossRef Google Scholar

Carnap, Rudolf. 1950/1950. Logical Foundations of Probability. Chicago: University of Chicago Press.Google Scholar

Crupi, Vincenzo. 2015. “Confirmation.” In Stanford Encyclopedia of Philosophy, ed. Zalta, Edward N. Stanford, CA: Stanford University. http://plato.stanford.edu/archives/sum2015/entries/confirmation/.

Crupi, Vincenzo, Chater, Nick, and Tentori, Katya. 2013. “New Axioms for Probability and Likelihood Ratio Measures.” British Journal for the Philosophy of Science 64:189–204.

Crupi, Vincenzo, Festa, Roberto, and Buttasi, Carlo. 2010. “Towards a Grammar of Bayesian Confirmation.” In Epistemology and Methodology of Science, ed. Suárez, M., Dorato, M., and Rédei, M., 73–93. Dordrecht: Springer.

Crupi, Vincenzo, and Tentori, Katya. 2014. “State of the Field: Measuring Information and Confirmation.” Studies in History and Philosophy of Science A 47:81–90.

Edwards, Anthony W. F. 1972. Likelihood. Cambridge: Cambridge University Press.

Festa, Roberto. 1999. “Bayesian Confirmation.” In Experience, Reality, and Scientific Explanation, ed. Galavotti, Maria Carla and Pagnini, Alessandro, 55–87. Dordrecht: Springer.

Festa, Roberto. 2012. “‘For Unto Every One That Hath Shall Be Given’: Matthew Properties for Incremental Confirmation.” Synthese 184:89–100.

Festa, Roberto, and Cevolani, Gustavo. 2015. “Matthew Properties for Incremental Confirmation and the Weak Law of Likelihood.” Unpublished manuscript, University of Trieste and University of Turin.

Fitelson, Branden. 1999. “The Plurality of Bayesian Measures of Confirmation and the Problem of Measure Sensitivity.” Philosophy of Science 66:S362–S378.

Fitelson, Branden. 2007. “Likelihoodism, Bayesianism, and Relational Confirmation.” Synthese 156:473–89.

Gaifman, Haim. 1979. “Subjective Probability, Natural Predicates and Hempel’s Ravens.” Erkenntnis 14:105–47.

Glass, David H., and McCartney, Mark. 2014. “A New Argument for the Likelihood Ratio Measure of Confirmation.” Acta Analytica 30:59–65.

Good, I. J. 1950. Probability and the Weighing of Evidence. London: Griffin.

Hacking, Ian. 1965. Logic of Statistical Inference. Cambridge: Cambridge University Press.

Hájek, Alan, and Joyce, James. 2013. “Confirmation.” In Routledge Companion to the Philosophy of Science, ed. Psillos, S. and Curd, M., 115–29. New York: Routledge.

Hildebrand, David K., Laing, James D., and Rosenthal, Howard. 1977. Analysis of Ordinal Data. Beverly Hills, CA: Sage.

Iranzo, Valeriano, and de Lejarza, Ignacio Martínez. 2012. “On Ratio Measures of Confirmation.” Journal for General Philosophy of Science 44:193–200.

Joyce, James. 2008. “Bayes’ Theorem.” In Stanford Encyclopedia of Philosophy, ed. Zalta, Edward N. Stanford, CA: Stanford University. http://plato.stanford.edu/archives/fall2008/entries/bayes-theorem/.

Keynes, John M. 1921. A Treatise on Probability. London: Macmillan.

Kuipers, Theo A. F. 2000. From Instrumentalism to Constructive Realism. Dordrecht: Kluwer.

Mortimer, Halina. 1988. The Logic of Induction. Chichester: Halsted.

Nozick, Robert. 1981. Philosophical Explanations. Cambridge, MA: Harvard University Press.

Popper, Karl R. 1959. The Logic of Scientific Discovery. London: Routledge.

Roche, William. 2014. “A Note on Confirmation and Matthew Properties.” Logic and Philosophy of Science 12:91–101.

Roche, William. 2015a. “Confirmation, Increase in Probability, and Partial Discrimination: A Reply to Zalabardo.” European Journal for Philosophy of Science 6:1–7.

Roche, William. 2015b. “Evidential Support, Transitivity, and Screening-Off.” Review of Symbolic Logic 8:785–806.

Roche, William, and Shogenji, Tomoji. 2014. “Dwindling Confirmation.” Philosophy of Science 81:114–37.

Royall, Richard. 1997. Statistical Evidence: A Likelihood Paradigm. London: Chapman & Hall.

Salmon, Wesley C. 1975. “Confirmation and Relevance.” Minnesota Studies in the Philosophy of Science 6:3–36.

Sober, Elliott. 1990. “Contrastive Empiricism.” In Scientific Theories, ed. Savage, C. Wade, 392–410. Minneapolis: University of Minnesota Press.

Sober, Elliott. 2008. Evidence and Evolution: The Logic Behind the Science. Cambridge: Cambridge University Press.

Zalabardo, José. 2009. “An Argument for the Likelihood-Ratio Measure of Confirmation.” Analysis 69:630–35.
