The classic modern statement of inductive skepticism comes from David Hume, although he reminds us of its ancient sources. Inductive reasoning is not justified by relations of ideas: “That the sun will not rise tomorrow is no less intelligible a proposition than that it will rise. We should in vain, therefore, attempt to demonstrate its falsehood” (Hume 1748/1777, Sec. IV, Pt. I, 21). To try to justify it inductively is to beg the question: “It is impossible, therefore, that any arguments from experience can prove this resemblance of the past to the future, since all these arguments are founded on the supposition of that resemblance” (Sec. IV, Pt. II, 32).
One might simply read Hume and give up. This is the position taken by Karl Popper in the early twentieth century. “Yet if we want to find a way of justifying inductive inferences, we must first try to establish a principle of induction. … Now this principle of induction cannot be a purely logical truth … if we try to regard its truth as known from experience, then the very same problems which occasioned its introduction will arise all over again.” Hume, according to Popper, has shown the impossibility of inductive logic: “My own view is that the various difficulties of inductive logic here sketched are insurmountable” (Popper 1934/1968, 29), and Popper concludes that the logic of scientific inquiry must be purely deductive.
One might well ask why Popper did not think to apply the tropes of Agrippa the skeptic to deductive reasoning, just as Hume had applied them to inductive reasoning.Footnote 1 There is an infinite regress—mathematics is justified by set theory, whose consistency is proved by stronger set theory, and so on—or, regarding mathematics as a whole, a circularity. Or why not ask why one should accept any argument at all? Trying to answer a thoroughgoing skeptic is a fool’s game.
But it is possible, and sometimes quite reasonable, to be skeptical about some things but not others. There are grades of inductive skepticism, which differ in what the skeptic calls into question and what he is willing to accept. For each grade, a discussion of whether such a skeptic’s doubts are justified in his own terms might actually be worthwhile. Here, I assume that my skeptics are probabilistically coherent.Footnote 2
Hume floated a global skepticism, but there is a much more focused skeptical challenge that he might well have raised, were he a better mathematician and more in touch with the current ferment in probability theory. The challenge was raised retrospectively by Richard Price. Pascal and Fermat had their correspondence (1654), which rapidly became known to the intellectual elite of Europe.Footnote 3 Christiaan Huygens had written a monograph on probability (1656), which was translated into Latin (1656/1657) and then twice into English (1656/1692, 1656/1714). Jacob Bernoulli’s Ars Conjectandi, including the text of Huygens, had been published posthumously (1713/2005). The first edition of de Moivre’s The Doctrine of Chances had been published (1718). There was intense interest in practical applications of the new theory of probability. In particular, there was interest not only in arguing from known chances in gaming to probabilities of outcomes but also in the inference from data to chances.
Bernoulli claimed that he had cracked this problem: “what you cannot deduce a priori, you can at least deduce a posteriori—that is, you will be able to make a deduction from the many observed outcomes of similar events. For it may be presumed that every single thing is able to happen and not to happen in as many cases as it was previously observed to have happened or not to have happened in like circumstances” (1713/2005, chap. 4; de Moivre made similar claims in the second and third editions of The Doctrine of Chances). Bernoulli had proved that with enough trials it would become “morally certain” that the frequency would be approximately equal to the true chance. If x is approximately equal to y, then y is approximately equal to x. So, after a large number of trials, we can take the true chances to be approximately equal to the observed frequencies.
This informal argument gains an air of plausibility by concealing difficulties behind a cloak of moral certainty and approximate equality. It is no proof, as is apparent if the argument is stated carefully. As Richard Price put it:
Mr. De Moivre … has, after Bernoulli, and to a greater degree of exactness, given rules to find the probability there is, that if a very great number of trials be made concerning any event, the proportion of the number of times it will happen to the number of times it will fail, in those trials, should differ less by small assigned limits from the proportion of the probability of its happening to the probability of its failing in one single trial. But I know of no person who has shown how to deduce the solution to the converse problem to this; namely, “the number of times an unknown event has happened and failed being given, to find the chance that the probability of its happening should lie somewhere between two named degrees of probability.” (1763, 372–73)
In the section of the Inquiry devoted to probability, David Hume wrote:
But where different effects have been found to follow from causes, which are to appearance exactly similar, all these various effects must occur to the mind in transferring the past to the future, and enter into our consideration, when we determine the probability of the event. Though we give the preference to that which has been found most usual, and believe that this effect will exist, we must not overlook the other effects, but must assign to each of them a particular weight and authority, in proportion as we have found it to be more or less frequent. … Let any one try to account for this operation of the mind upon any of the received systems of philosophy, and he will be sensible of the difficulty. For my part, I shall think it sufficient, if the present hints excite the curiosity of philosophers, and make them sensible how defective all common theories are in treating of such curious and such sublime subjects. (1748/1777, Sec. VI, 47)
Neither Bernoulli nor de Moivre had given an answer to Hume.
An answer was supplied by Thomas Bayes, in the essay that Price was introducing. It was written around 1749 (see Zabell 1989) but only published posthumously (Bayes 1763) in the Philosophical Transactions of the Royal Society. Bayes’s target, at least as seen by Price, was general Humean skepticism about inductive inference.Footnote 4 This is most evident in the title page affixed to reprints of Bayes’s essay, which reads: “A Method of Calculating the Exact Probability of All Conclusions based on Induction” (Stigler 2013, 283). Bayes’s goal was to calculate the probability that the true chance of a dichotomous event fell within a certain interval, given frequencies in a finite number of trials. For the question to make sense at all, chance must be a random variable—there must be a probability distribution over the possible chancesFootnote 5—something missing in Bernoulli and de Moivre. This raises the question of what to take as the probability distribution over the chances before any trials—the question of the proper quantification of ignorance.
Bayes assumed the uniform prior, in which intervals of equal length are given equal probability of containing the true chance. He shows by a clever geometrical argument that, on this assumption, the probability of m successes in n trials is 1/(n + 1), for any m.Footnote 6 In a scholium Bayes remarks that this result itself might be taken as a proper quantification of ignorance—each number of successes has equal probability. This is ignorance about observables—and in particular about frequencies.Footnote 7
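Bayes's 1/(n + 1) result can be checked by exact arithmetic. The following is a minimal sketch, not Bayes's geometrical argument: it averages the binomial probability over a uniform prior using the standard closed form for the integral of t^a (1 − t)^b on [0, 1]. The function names are illustrative assumptions.

```python
from fractions import Fraction
from math import comb, factorial


def beta_integral(a: int, b: int) -> Fraction:
    """Exact integral of t**a * (1 - t)**b over [0, 1], for integers a, b >= 0."""
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))


def prob_successes_uniform_prior(m: int, n: int) -> Fraction:
    """Probability of exactly m successes in n trials when the binomial
    probability is averaged over a uniform prior on the chance."""
    return comb(n, m) * beta_integral(m, n - m)


# Illustrative choice of n; every number of successes gets the same probability.
n = 10
probs = [prob_successes_uniform_prior(m, n) for m in range(n + 1)]
print(probs)
```

Each entry of `probs` is the same fraction, which is the sense in which the uniform prior on chances yields ignorance about observable frequencies.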
Bayes’s analysis was extended (perhaps independently) by Laplace in a remarkable essay of 1774.Footnote 8 Assuming the uniform prior, Laplace proves his famous rule of succession. Given p successes in p + q trials, the probability of a success in the next trial is
$$\frac{p+1}{p+q+2}.$$
More generally, he considered the predictive distribution for m successes in m + n additional trials given the evidence of p successes in p + q trials. Laplace shows that if the datum, p + q, is large and m + n is small, the result is close to taking the observed frequency as giving the chances. But he also feels bound to point out that this is not the case if the number of trials predicted is also large: “and it seems to me essential to note this.”Footnote 9 What is more, Laplace showed what is now called Bayesian consistency:Footnote 10 “One can suppose that the numbers p and q are so large that it becomes as close to certainty as one wishes that the ratio of the number of white tickets to the total number of tickets contained in the urn is included between the two limits p/(p+q) − w and p/(p+q) + w. w can be supposed less than any given quantity” (Laplace 1774/1986, 366). The Bayes-Laplace inference converges to the true chances.
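Laplace's predictive distribution, and his caveat about long-range prediction, can be illustrated with exact arithmetic. This is a sketch under the uniform prior only. The function names and the sample figures (300 successes in 400 trials) are hypothetical choices for illustration, and `plug_in` is simply the binomial probability that treats the observed frequency as the chance.

```python
from fractions import Fraction
from math import comb, factorial


def beta_integral(a: int, b: int) -> Fraction:
    """Exact integral of t**a * (1 - t)**b over [0, 1], for integers a, b >= 0."""
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))


def predictive(m: int, n: int, p: int, q: int) -> Fraction:
    """Probability of m successes and n failures in the next m + n trials,
    given p successes and q failures so far, under a uniform prior."""
    return comb(m + n, m) * beta_integral(p + m, q + n) / beta_integral(p, q)


def rule_of_succession(p: int, q: int) -> Fraction:
    """Special case: probability of success on the very next trial."""
    return predictive(1, 0, p, q)


def plug_in(m: int, n: int, p: int, q: int) -> Fraction:
    """Binomial probability that treats the observed frequency as the chance."""
    f = Fraction(p, p + q)
    return comb(m + n, m) * f**m * (1 - f) ** n


# Hypothetical evidence: 300 successes and 100 failures.
p, q = 300, 100
# Short-range prediction (4 further trials) versus long-range (400 further trials).
short_ratio = float(predictive(3, 1, p, q) / plug_in(3, 1, p, q))
long_ratio = float(predictive(300, 100, p, q) / plug_in(300, 100, p, q))
print(rule_of_succession(p, q), short_ratio, long_ratio)
```

The short-range ratio is close to 1, while the long-range ratio falls well below it, since the remaining uncertainty about the chance spreads the predictive distribution when many trials are predicted.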
Given their assumptions, Bayes and Laplace show that Bernoulli’s conclusion was right. We can infer the approximate chances a posteriori. And, in this setting, they do give an answer to Hume. They show when, and in what sense, it is rational to believe that the future is like the past.
A more radical skeptic will not be at a loss to find assumptions to question in the Bayes-Laplace analysis. Bayes himself found it necessary to buttress the assumption of the uniform prior with the argument of the scholium. Throughout the history of Bayesian inference, some have thought it necessary to defend a unique quantification of ignorance.
I think that this is a mistake. Ignorance is the opposite of knowledge. So an ignorance prior should be a prior that does not presume knowledge. I might know the composition of the urn or the bias of the coin exactly. I might know less and still know something. I might know that the urn contains more black tickets than white or that the bias toward heads is greater than 1/2. But suppose that I know no such thing.
Then my ignorance prior should put some positive probability on the true chance being in any open interval between 0 and 1. Specification of an ignorance prior is not unique. There are lots of them. If you do not like calling these ignorance priors on the ground that they may be sharply peaked, call them nondogmatic priors or skeptical priors, because these priors are quite in the spirit of ancient skepticism.
What Laplace showed for the uniform prior holds for all skeptical priors. Given enough experience, such priors lead a Bayesian to predict tomorrow using something close to the observed frequency. With chance one, such priors lead the Bayesian to converge to the true chances.Footnote 11 Skeptical priors defeat skepticism.
What about the dogmatist? Suppose, for example, that this person is convinced that the bias toward heads is greater than 1/2 and has a uniform prior on it being between 1/2 and 1. If the true bias is 1/4, he will never learn it. Nevertheless, he believes that he will learn the true chances, because he is sure that they are between 1/2 and 1. You may not believe that he will learn the true chances, but he does. This holds quite generally (Doob 1948). With his degree-of-belief one, he will converge to the true chances. Neither the dogmatic prior nor the open-minded prior is consistent with inductive skepticism.
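The contrast between skeptical and dogmatic priors can be made concrete with a rough numerical sketch. Everything here is an illustrative assumption: the grid approximation, the hypothetical data (2500 heads in 10000 tosses, idealized to match a true bias of 1/4), and the particular sharply peaked prior, which is positive on every open interval and so counts as skeptical in the sense above.

```python
import math


def posterior_mean(prior_density, heads, tails, grid_size=20001):
    """Posterior mean of the chance of heads, by grid approximation.

    prior_density maps a chance in (0, 1) to a nonnegative, possibly
    unnormalized weight. Log weights are used to avoid underflow.
    """
    thetas = [(i + 0.5) / grid_size for i in range(grid_size)]
    log_weights = []
    for t in thetas:
        prior = prior_density(t)
        if prior > 0:
            log_weights.append(
                math.log(prior) + heads * math.log(t) + tails * math.log(1 - t)
            )
        else:
            log_weights.append(-math.inf)  # excluded by the prior
    top = max(log_weights)
    weights = [math.exp(lw - top) for lw in log_weights]
    total = sum(weights)
    return sum(t * w for t, w in zip(thetas, weights)) / total


def uniform_prior(t):
    return 1.0


def peaked_prior(t):
    # Sharply peaked near 0.8, but positive on every open interval.
    return t**20 * (1 - t) ** 5


def dogmatic_prior(t):
    # Uniform on (1/2, 1); zero probability for any chance of 1/2 or less.
    return 1.0 if t > 0.5 else 0.0


# Hypothetical data, idealized to match a true bias of 1/4.
HEADS, TAILS = 2500, 7500
uniform_mean = posterior_mean(uniform_prior, HEADS, TAILS)
peaked_mean = posterior_mean(peaked_prior, HEADS, TAILS)
dogmatic_mean = posterior_mean(dogmatic_prior, HEADS, TAILS)
print(uniform_mean, peaked_mean, dogmatic_mean)
```

Both skeptical priors end up near 1/4, while the dogmatic prior piles its posterior just above 1/2, the closest value it allows.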
* * *
The foregoing all takes place within a specific chance model. Perhaps, with Hume, we may believe that “there is no such thing as chance in the world” (Hume 1748/1777, Sec. VI, 46).
Suppose there is a potentially infinite sequence of yes-no events. And suppose that you are a frequentist in the following weak sense: for you, the only thing that matters for the probability of a finite outcome sequence of a given length is the relative frequency of successes in that sequence. That is to say that for you, two sequences of the same length having the same relative frequencies have the same probability. Then Bruno de Finetti proves that you behave like Bayes, with his chance model and some (not necessarily flat) prior. Furthermore, the prior is uniquely determined by your degrees of belief over the outcome sequences.Footnote 12
De Finetti, like Hume, believes that there is no such thing as chance in the world and shows that we can have the virtues of Bayes’s analysis without the baggage. If you are skeptical about the existence of chances, the chance model, and the prior over the chances, de Finetti shows how to get them all from your degrees of belief, provided that they satisfy the foregoing condition of exchangeability. Furthermore, you must believe with probability 1 that a limiting relative frequency exists and that with repeated experience you will converge to it.Footnote 13 If your degrees of belief are exchangeable, you cannot be an inductive skeptic.
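A small exact check may help fix ideas. The Pólya urn used here is not mentioned in the text; it is a standard example, assumed for illustration, of degrees of belief that are specified without any reference to chances. The sketch confirms that these degrees of belief are exchangeable and that they coincide with Bayes's chance model under the uniform prior, as de Finetti's representation leads one to expect.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def polya_probability(sequence) -> Fraction:
    """Probability of a 0/1 sequence drawn from a Polya urn that starts with
    one white (1) and one black (0) ticket; each ticket drawn is returned
    together with one extra ticket of the same color."""
    white, black = 1, 1
    prob = Fraction(1)
    for outcome in sequence:
        if outcome == 1:
            prob *= Fraction(white, white + black)
            white += 1
        else:
            prob *= Fraction(black, white + black)
            black += 1
    return prob


def uniform_mixture_probability(sequence) -> Fraction:
    """Probability of the same sequence under independent trials with chance t,
    averaged over a uniform prior on t."""
    m, n = sum(sequence), len(sequence)
    return Fraction(factorial(m) * factorial(n - m), factorial(n + 1))


sequences = list(product([0, 1], repeat=5))

# Exchangeability: the probability depends only on the number of successes.
by_count = {}
for s in sequences:
    by_count.setdefault(sum(s), set()).add(polya_probability(s))
is_exchangeable = all(len(values) == 1 for values in by_count.values())

# Representation: the urn agrees with the uniform mixture of chance models.
matches_mixture = all(
    polya_probability(s) == uniform_mixture_probability(s) for s in sequences
)
print(is_exchangeable, matches_mixture)
```

Here the prior over chances is recovered from the degrees of belief over outcome sequences, rather than being assumed at the start.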
What if your degrees of belief are not exchangeable? Short of exchangeability there may be other symmetries that have inductive consequences. De Finetti himself initiated this line of thought in 1938 (de Finetti 1938/1980), and there are many subsequent developments (see Diaconis and Freedman 1980b). We consider the problem at a very general, abstract level.
We suppose that you have a measurable space that encapsulates the problem that you are thinking about.Footnote 14 You bring to this problem your degrees of belief: a probability measure on the measurable sets that is invariant under some measurable transformation T (or group of transformations) of the space into itself. Transformation T represents your conception of a repetition of an experiment.Footnote 15 Invariance means that the transformation (or group of transformations) leaves the probabilistic structure unchanged.
For example, suppose the points in the probability space are doubly infinite sequences of experimental outcomes, indexed by discrete time.Footnote 16 If your probability is invariant under the shift transformation, that means that the probabilistic structure is not affected by the passage of time: that is to say that the stochastic process is stationary.
Fixing the transformation, the set of invariant probability measures is convex. Your degrees of belief are one member of this set. The extreme points of this set are probabilities that are, in a certain sense, resilient under conditioning on the invariant sets.Footnote 17 In these extreme probabilities, invariant sets have probability 1 or 0. (The measure cannot be decomposed into two or more invariant measures by conditioning on invariant sets.) These are the ergodic probability measures.
Starting at some point, x, in your probability space, you contemplate a series of experiments, x, Tx, TTx, …, Tⁿx. You keep track of the relative frequency of the points being in some measurable set, A.Footnote 18 Given the foregoing, you believe with probability 1 that the limiting relative frequency will exist.Footnote 19 In this sense, you cannot be an inductive skeptic. You cannot be a skeptic in the sense of Reichenbach. This is a consequence of invariance.Footnote 20
The limiting relative frequency of A is a random variable. Your expectation of this limiting relative frequency is your probability of A. In the special case in which your degrees of belief are ergodic, you are sure that your probability of A is equal to the limiting relative frequency. This is Birkhoff’s ergodic theorem.
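In symbols, the claims of the last two paragraphs can be sketched as follows. The notation is an assumption introduced here for illustration: P is your invariant probability, 1_A is the indicator of the set A, and 𝓘 is the σ-algebra of invariant sets.

```latex
% Existence of the limiting relative frequency, with probability 1:
\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \mathbf{1}_A(T^k x)
  = \mathbb{E}\left[\mathbf{1}_A \mid \mathcal{I}\right](x)
  \quad \text{for } P\text{-almost every } x .

% Your expectation of the limiting relative frequency is your probability of A:
\mathbb{E}\Big[\, \mathbb{E}\left[\mathbf{1}_A \mid \mathcal{I}\right] \Big] = P(A) .

% Ergodic case: invariant sets have probability 0 or 1, so the limit is constant:
\mathbb{E}\left[\mathbf{1}_A \mid \mathcal{I}\right] = P(A)
  \quad P\text{-almost surely} .
```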
The foregoing has all been at such an abstract level that it is impossible to say much about the extremal ergodic measures. Further specification of the transformations under which your probabilities are invariant can give more information. De Finetti’s theorem is a special case with the invariant measures being the exchangeable ones and the ergodic ones being independent and identically distributed. A version of de Finetti’s theorem for Markov chains, proved by David Freedman, is another (Freedman 1962). As emphasized at the outset, your probabilities and your conception of repetition of the same experiment are up to you. You and I may differ. We may be skeptical about each other but not about ourselves.
So far, the envisioned learning experiences have been modeled as conditioning on the evidence. A more radical skeptic may well call this into question. This is the stance taken in Richard Jeffrey’s Radical Probabilism (Jeffrey 1965, 1968). Must a radical probabilist perforce be a radical inductive skeptic?
One learns through some sort of black-box interaction that updates one’s probabilities. Then we need some way of distinguishing interactions that are viewed as learning experiences from those that are viewed as mindworms, brainwashing, drug-induced hallucinations, the Sirens singing to Ulysses, and so on. A plausible candidate is diachronic coherence (Goldstein 1983; van Fraassen 1984).
If one contemplates a sequence of such experiences stretching off into the future and regards them as learning experiences, coherence requires that they form a martingale in one’s degrees of belief, as I have previously pointed out (Skyrms 1990, 2006). That means that the martingale convergence theorem comes into play. As an added twist, one who is skeptical of countable additivity need not worry. This can all be done with finitely additive martingales (Zabell 2002). Even in this austere setting, one cannot be a complete inductive skeptic.
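The martingale requirement can be written out as a sketch. The notation is an assumption introduced for illustration: P_n(A) is one's degree of belief in A after the n-th learning experience, and E_n is expectation taken with respect to one's degrees of belief at stage n. The convergence statement is given in its familiar countably additive form.

```latex
% Diachronic coherence: current belief equals the expectation of future belief.
\mathbb{E}_n\left[ P_{n+1}(A) \right] = P_n(A), \qquad n = 0, 1, 2, \ldots

% Iterating, the same holds for any later stage m > n:
\mathbb{E}_n\left[ P_m(A) \right] = P_n(A) .

% Since 0 \le P_n(A) \le 1, the martingale convergence theorem applies:
P_n(A) \longrightarrow P_\infty(A) \quad \text{with probability } 1 .
```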
Hume remarks that it is psychologically impossible to be a consistent skeptic: “since reason is incapable of dispelling these clouds, nature herself suffices to that purpose” (Hume 1896, Bk. I, Pt. IV, Sec. VII). One is not logically compelled to believe in a prospective sequence of learning experiments. One may not be coherent or believe that one will remain coherent in the future. One need not believe that there will be a future. Absolute skepticism is unanswerable.
But short of absolute skepticism, there are various grades of inductive skepticism, differing in what the skeptic brings to the table and what he calls into doubt. Some kinds of skeptics may call into question things to which they are implicitly committed. In such a case, reason is capable of dispelling doubts. It is remarkable the extent to which the logic of coherent belief itself constrains inductive skepticism.