
A Justification of the Probabilistic Explanation of the Entropy Principle

Published online by Cambridge University Press:  01 January 2022


Abstract

In many ways, entropy and probability are two concepts that complement each other. But Kevin Davey has argued, in a no-go thesis, that there is no ‘straightforward connection’ between them. However, this skeptical conclusion rests on counterexamples that fail to do justice to the entropy principle and to the equivocality of the notion of probability. Proceeding from the disambiguation of probability, and acknowledging the explanatory goal of the entropy principle, it is argued that the Boltzmannian statistical mechanics account can be vindicated by a justification of the explanandum, the reference class, and the standard uniform probability measure.

Research Article
Copyright © The Philosophy of Science Association

1. Skepticism about Probability Measures in Statistical Mechanics

The thesis that entropy increases because systems pass from less probable states to more probable states is often presented as a successful explanation of thermodynamic phenomena by reductionist statistical physics. But this thesis is more insidious than it seems: it does not mean that entropy cannot decrease or that entropic systems are chancy or that thermodynamics is indeterministic. Additionally, it is compatible with the hypothesis that many or even most thermodynamic systems are in a low-entropy state right now. Consequently, the elucidation of this important thesis in physics requires an adequate interpretation of these two polysemous and polemical concepts: entropy and probability.

This insidiousness has been pushed to its extreme by a no-go thesis in Davey (2008), which argues against the existence of a straightforward, a priori connection between entropy and probability. Davey's argument rests on the claim that any “probability principle” whereby knowledge of a system's macrostate M is used to determine the probability measure $\mu_M$ over the microstates cannot be justified. Davey devises situations showing that it is possible to know M without being in a position to form a justified belief about which probability measure is applicable to M. The low-entropy states actually found in nature would therefore not need to be considered improbable at all, since low-entropy states are not necessarily improbable. In that case, neither the second law of thermodynamics nor the entropy principle (defined below) would be explicable as a result of the fact that systems tend to move from improbable to probable states. Davey's position thus targets the reference class problem in Boltzmannian statistical mechanics (BSM), and his thesis has the merit of forcing a (better) justification of the basic assumptions at play regarding that problem. Davey's argument is apropos since the choice of a probability distribution is arguably “the fundamental problem of statistical mechanics” (Penrose 1979, 1940).

As we will see below, Davey's skepticism can be subsumed under a much broader skepticism, similar in kind to the problem of induction, which amounts to denying the ascription of any probability value or measure whatsoever. It thus must be rejected. But my main aim in this article is to give (back) to BSM some credit in explaining the tendency of thermodynamic behavior toward equilibrium. I thus want to justify the use that BSM makes of the Lebesgue measure (or an equivalent measure) on the set of possible microstates compatible with a given system's macrostate (as opposed to a reference to other classes the system in question belongs to and other measures describing those classes). Since the main goal of the probabilistic explanatory strategy advanced in this article is to explain thermodynamic entropy increase, the reasons for choosing a particular reference class and probability measure are to be sought in this very explanatory goal. The argument for a justification of the probabilistic explanatory power of BSM rests on showing that the following three elements are indeed justified: (i) the explanandum (i.e., isolated thermodynamic systems), (ii) the reference class (i.e., the set of possible microstates), and (iii) the standard uniform probability measure. This article addresses the justification of each of these explicitly.

The next section (sec. 2) addresses what kind of phenomenon the thermodynamic entropy principle aims at explaining (i.e., its explanandum) and, correspondingly, the explanandum of the statistical version of the entropy principle. Section 3 first addresses Davey's skeptical position and, second, distinguishes several different probability ascription procedures (diachronic and intra- and interlevel synchronic). Distinguishing these procedures reveals the shifts from one procedure to another in Davey's argument (which dissociates the procedure for probability ascription from the one for the entropy value). The distinction also allows us to identify which probability ascription procedure is at play in BSM. In section 4, I present the entropy principle's explanatory strategy by addressing the aforementioned criteria and thereby offering a justification for the different elements of the probabilistic account.

2. The Entropy Principle and Reference Classes

The concept of entropy shows an extraordinary polysemy (see, e.g., Capek and Sheehan 2005; Frigg and Werndl 2011) and has created many controversies (e.g., Carnap 1977; Callender 1999; Davey 2008). Because of the problems created when various definitions of a concept are conflated, it is prudent to first distinguish the second law of thermodynamics, which stipulates that heat cannot be entirely converted into work, from the entropy principle, which stipulates that entropy does not decrease in isolated systems (expanded on below). The latter is often seen as a reformulation or even an explanation of the former. In thermal physics, one must also distinguish the thermodynamic from the statistical version of the entropy principle, and in statistical mechanics (SM), there are also quite a few definitions: Boltzmann's and Gibbs's, based on complexions or phase space volume and on surface or volume integrals (see Uffink [2007] and the references therein). Here (following Davey), the discussion will be restricted to the domain of BSM, and thus it relies on the Boltzmannian definition of entropy.

2.1. BSM Explanatory Objectives

Since the Boltzmannian definition of entropy aims at accounting for the thermodynamic version of the entropy principle, it is worth stating again that the latter stipulates that the entropy of a system is always nondecreasing (i.e., it increases or remains constant) for any adiabatic process (whether the system is isolated or not). But there are certain circumstances in which systems can have decreasing entropy: systems labeled ‘closed’ (those that can exchange energy but not matter) and ‘open’ (those exchanging both energy and matter). It is also possible to prepare a low-entropy state, and additionally, a decrease in the entropy of a system is sufficient to infer an interaction of that system with its environment. The entropy principle thus reflects an asymmetry between two classes of systems: isolated systems (whether adiabatically isolated or not), on the one hand, and closed or open systems, on the other (Clausius 1854; Denbigh 1989). (Lieb and Yngvason [2003] base this distinction on adiabatic accessibility.) This asymmetry rests on the existence of certain “spontaneous changes” that involve an entropy increase.[1] In fact, an open system can always be part of a larger isolated system so that the entropy of the isolated ‘system + environment’ always increases or remains constant; that is, a local entropy decrease of a system must be compensated by a global entropy increase in its environment. Consequently, the asymmetry is between (A) always nondecreasing entropy for isolated or adiabatically isolated systems and (B) increasing or decreasing entropy for closed or open systems. The probabilistic explanatory strategy of the BSM account aims at explaining only A, which is thus its explanandum.

This strategy takes the Boltzmannian entropy of a macrostate $M_t$ at time $t$ to be $S_B(t) := S_B(M_t) = k_B \log[\mu(\Gamma_{M_t})]$, where $k_B$ is the Boltzmann constant, $\mu$ is the (normalized) Lebesgue measure, and $\Gamma_{M_t}$ is the region of phase space associated with $M_t$ (Landau and Lifshitz 1980; Goldstein and Lebowitz 2004; Uffink 2007). The leading idea of BSM's entropy principle is that $S_B(M_t)$ could mirror the behavior of the thermodynamic entropy, that is, increasing with time and reaching a maximum at equilibrium.[2] This principle is as follows (Brown, Myrvold, and Uffink 2009; Frigg 2010a): consider an instant of time $t_0$ and assume that the entropy $S_B(t_0)$ of a system is low compared to $S_B$ at equilibrium; then, for any time $t > t_0$, it is highly probable that $S_B(t) > S_B(t_0)$. The challenge then is to establish a clear meaning for the phrase ‘highly probable’. Davey has succeeded in demonstrating its inherent ambiguity by challenging the justification of a system's reference classes.
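
To make the definition concrete, here is a minimal numerical sketch using a toy model that is my own illustration, not anything in the article: N distinguishable particles in a box, where the macrostate ‘k particles in the left half’ has a number of compatible microstates proportional to the binomial coefficient C(N, k), standing in for $\mu(\Gamma_M)$:

```python
# Toy illustration of Boltzmann entropy S_B(M) = k_B * log(mu(Gamma_M)).
# Assumption (not from the article): N distinguishable particles in a box;
# the macrostate M_k = "k particles in the left half" has C(N, k)
# compatible microstates.
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def boltzmann_entropy(n_particles: int, k_left: int) -> float:
    """S_B of the macrostate 'k_left of n_particles in the left half'."""
    microstate_count = math.comb(n_particles, k_left)
    return K_B * math.log(microstate_count)

N = 100
for k in (0, 10, 25, 50):
    print(f"k = {k:3d}  S_B = {boltzmann_entropy(N, k):.3e} J/K")
# S_B is maximal at the equidistributed macrostate k = N/2, mirroring
# thermodynamic entropy reaching its maximum at equilibrium.
```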

2.2. Davey's Counterexamples

What Davey's (Reference Davey2008) paper claims is that one can construct specific cases in which an instance of the entropy principle should not be construed as a move from an improbable state to a probable state, in accordance with what the BSM explanation would require, but rather as a move from a probable state to an improbable state. In other words, Davey's point is that one can imagine an allegedly appropriate reference class/measure space with respect to which an instance of the entropy principle (i.e., a process with increasing Boltzmann entropy) is unlikely rather than likely, contrary to what the BSM account would tell us. He thus presents three different counterexamples that target the three aforementioned criteria required to vindicate the BSM account.

The first counterexample presents the case of a scientist who tells us that 10 minutes ago a gas in a glass box was in equilibrium, and at that time he made a choice between two preparations. The upshot is that there is no way we could form a justified belief about the probability of either of those two preparations. However, since the BSM account aims at explaining A (as discussed above), the question to be answered is ‘why does the entropy of isolated systems increase?’ rather than ‘why are there many low entropy systems?’ or ‘is it probable to find low entropy systems?’ (see sec. 3.2).

The second counterexample presents a counterfactual argument about glasses under ‘government fiat’: in one case, all good citizens must do everything they can to keep all glasses of water half full of ice, and in another case all good citizens must do everything they can to keep all glasses free of ice. The argument thus targets the claim that low-entropy states actually found in nature need not be improbable at all, because we can imagine situations in which systems in states of low entropy ought not to be ascribed states of low probability.[3] But this argument misrepresents the explanandum of the BSM account by conflating different probability ascription procedures (see sec. 3.2) and by dissociating the reference classes for probability ascription, on the one hand, and entropy value, on the other (sec. 4.3).

The third counterexample devises a particular microstate $x_M$ that never passes through the equilibrium macrostate, Q, and a special probability measure that is defined only on x. It thus targets the choice and justification of a “micro” reference class (discussed in sec. 4.3) and of a probability measure (discussed in sec. 4.4). I now turn to the discussion of the disambiguation of probability, and of a related albeit misguided skepticism, before showing that those three counterexamples fail to do justice to the BSM account of the entropy principle (in sec. 4).

3. Probability

3.1. The Probability Ascription Problem

As von Mises emphasized: probability in, probability out. This means roughly that a probabilistic assertion must be posited in some way. In other words, one should not conflate facts and probabilities. Relative frequencies and symmetry attributes are not probabilities: they are facts (Sklar 1993; Handfield 2012). And facts do not come in a “preformed sigma-algebra of events”; they are ‘exogenous’ to the probability calculus.[4] The probability calculus is ‘objective’ because it is formally constructed, but the ascription of a probability value or a probability measure is partially arbitrary, hence ‘subjective’, because it lacks this formal guidance.[5] The inevitable presence of arbitrary components, such as those related to the reference class problem, in inferences leading to the ascription of probabilities to facts can be labeled the probability ascription problem.

This problem is similar in nature to the problem of induction because both are cases of ampliative inference (see Carnap 1966; Hájek and Hall 2002). A standard argument says that any terms in the conclusion that are not already in the premises must have been introduced via definition, and the definitions must be in terms of either what is available in the premises or what has been previously defined in the course of the reasoning. Yet both inductive inferences and inferences to probability ascription go ‘beyond the evidence’, and in that sense both can nourish a certain skepticism. Indeed, a statement inferring the probabilities of heads and tails from the finite frequencies of a coin toss ‘goes beyond the evidence’ of those finite frequencies because these probabilities imply that subsequent ‘similar enough’ tosses will abide by the same frequencies.

This problem might give some support to Davey's skepticism and his claim that the connection between entropy and probability is not ‘straightforward’. In this particular sense, Davey's thesis would be a special case of the probability ascription problem, for BSM has no more special status in this regard than any other (empirical) scientific theory. On this account, it would be correct to say that the BSM probabilistic explanatory strategy is problematic. However, this is too strong a requirement because the probability ascription problem would hinder any kind of probabilistic ascription or probabilistic explanation in cases not supported on inductive grounds, such as cases based on asymmetries (e.g., a die before throwing it). This is therefore the same conclusion regarding Davey's skepticism as that of Shech (2013), but for other reasons. Davey's argument, however, points to a version of the reference class problem that differs from the aforementioned probability ascription problem: his insight is that there can be situations in which an a priori principle chooses the wrong probability measure. My argument for the justification of an adequate probabilistic interpretation relies on a distinction between probability ascription procedures, which serves to disambiguate what is deemed ‘probable’ and thereby aligns with the explanatory goals of the BSM account. This allows one to construct and then justify both the explanandum and the explanans of the BSM entropy principle.

3.2. Two Kinds of Probability Ascription Procedure

I propose a distinction between two generic procedures to ascribe probability values, which can be labeled “diachronic” and “synchronic.”[6] They are distinguished here to show how they can influence differently the meaning of probability and, thus, its equivocality. Consequently, they affect the meaning of the theory in which probabilities play a role. For instance, it seems justified to assert that probability does not have the same meaning in quantum mechanics as it has in evolutionary biology, or at least that the convergence of their meanings requires a justification. Both depend on the evidence of chance, but, as the previous discussion emphasized, facts alone are insufficient for probability ascription.[7] Accordingly, this account can be considered ‘neutral’ (or ‘hybrid’) with regard to the strands of interpretation, that is, the epistemic and the physical. Such a choice would be superimposed and should not hinder the role of probability in explanation.[8]

The simple experiment of throwing dice can be helpful here. In the diachronic procedure, the outcomes (i.e., the mutually exclusive results) are instantiated after throwing a die many times in a similar fashion. If the number of throws is large enough, it is supposed that the set of all possible outcomes will be instantiated, which corresponds to the ‘basic space’, Ω.[9] Additionally, it is often supposed that the relative frequency of the outcomes just equals their probability. However, as discussed earlier, this assumption is generally only adopted faute de mieux. The procedure can be formalized as follows:

Diachronic probability procedure. $E_D : a_i \in \Omega = \{(a_1 = T_t E_1), (a_2 = T_t E_2), \ldots, (a_n = T_t E_n)\}$,

where $a_i$ are the outcomes among n possibilities in Ω, E is an experiment (or a test) and $E_i$ are particular instances of the generic $E_D$ (i.e., the system in a particular context at a certain time, so a particular occurrence and not the experiment per se), and $T_t$ is an evolution operator.[10] The diachronic probability is then $P_D(a_i) = \mu(a_i)/\mu(\Omega)$, where μ is a normed measure. Here, a given state $E_i$ in the experiment evolves toward a range of possible states or outcomes $a_i$. For instance, a given state $E_i$ of throwing a die (generally loosely defined) evolves (via $T_t$) toward a certain outcome $a_i$; if the experiment or the test is performed an infinite number of times, then it is reasonable to think that all n possible outcomes occurred, which form the set $\Omega = \{1, 2, \ldots, 6\}$.
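
A short simulation can make the diachronic procedure concrete. The following sketch is my illustration, not the article's; it estimates $P_D(a_i)$ for a fair die by iterating the experiment, with the evolution operator $T_t$ left implicit in the simulated throw:

```python
# Sketch of the diachronic procedure: one die, many successive trials.
import random

def throw_die() -> int:
    """One instance E_i of the generic experiment E_D; T_t is implicit."""
    return random.randint(1, 6)

trials = 100_000
counts = {face: 0 for face in range(1, 7)}
for _ in range(trials):
    counts[throw_die()] += 1

# The relative frequencies approximate P_D(a_i) = mu(a_i)/mu(Omega) = 1/6,
# an identification the text notes is adopted only faute de mieux.
for face, c in counts.items():
    print(face, c / trials)
```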

In the synchronic procedure, the events correspond to states of ‘similar enough’ systems (e.g., those in the same context or the same time frame); for instance, many similar dice can be thrown simultaneously (instead of the same die thrown many times in succession). Formally:

Synchronic probability procedure. $E_S : a_i \in \Omega = \{(a_1 \in B), (a_2 \in B), \ldots, (a_n \in B)\}$,

where B is a class of similar systems with different attributes corresponding to the outcomes. As before, if the class B is very large, then it is reasonable to think that all n possible outcomes can be found in B. This procedure can be labeled as intralevel when the states $a_i$ belong to different systems in the same class B of systems (e.g., many similar dice) and interlevel when a macrostate (corresponding to B) is associated with a number of possible microstates corresponding to the $a_i$. Of course, the paradigmatic example of the latter is SM, where the ‘probability of a macrostate’ is based on the number of associated microstates. These procedures are very helpful when iterative procedures such as diachronic procedures are not possible; for instance, we cannot test the very same car many times to determine the (diachronic) probability of accidents, but we can test different ‘similar enough’ cars (synchronic probability). If the microstates are unobservable, then the probability measure will typically (but not necessarily) be determined by symmetry arguments, such as the equiprobability of every microstate. Thus, the same arbitrary elements as in the diachronic procedure are present here, but they generally lead to different philosophical problems.
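
Both synchronic variants can be illustrated on the dice example with a brief sketch; this is again my own illustration, and the two-dice macrostate is an assumed toy case rather than anything in the article:

```python
import random

# Intralevel synchronic: many similar dice at the same time; the class B
# is the set of dice, and the probability of outcome a_i is the fraction
# of members of B instantiating a_i.
dice = [random.randint(1, 6) for _ in range(10_000)]
p_intralevel = {face: dice.count(face) / len(dice) for face in range(1, 7)}
print(p_intralevel)  # each value close to 1/6

# Interlevel synchronic: a macrostate B is associated with its possible
# microstates. Toy case: the macrostate 'the sum of two dice is 7' over
# the 36 microstates (i, j), taken as equiprobable.
microstates = [(i, j) for i in range(1, 7) for j in range(1, 7)]
compatible = [m for m in microstates if sum(m) == 7]
p_interlevel = len(compatible) / len(microstates)
print(p_interlevel)  # 6/36, about 0.167
```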

Distinguishing these procedures is a good way to analyze the problems related to the ambiguous notion of probability. In effect, the objective-subjective distinction and the related reference class problem do not apply in the same way in each procedure. Moreover, since BSM uses interlevel synchronic probability, and quantum mechanics diachronic probability, one can argue that certain procedures challenge the deterministic stance while others do not. Finally, it forces us to clarify the conditions in which these procedures are applied. For instance, it would appear unreasonable to assert that the (diachronic) probability of rolling a one is 1/3 because a given set of dice at a certain moment in time is constituted by one-third (synchronic probability) of dice with a one uppermost. The obvious reason for the divergence in this example is that the conditions or the experimental setups are different. The government fiat example is a case in point: the intralevel synchronic probability, consisting of the relative frequency of all the glasses of water containing ice at a moment in time, differs not only from the diachronic probability of a single glass of water into which one citizen randomly puts ice but also from the interlevel synchronic probability of a single glass of water based on the phase space volume of its associated microstates.

Consequently, the probability determined by one procedure does not necessarily have the same meaning or the same consequence for an explanatory strategy as that determined by another procedure. Assuming their convergence thus requires justification. In the absence of such a justification, as is the case in Davey's argument, a conclusion about ‘probability’ in SM rests on an ambiguity that must be resolved in order to substantiate, for example, the claim that there is no connection between entropy and probability. The next section will discuss how the BSM explanatory strategy constrains the choice of a reference class.

4. The Entropy Principle's Probabilistic Explanatory Strategy

Contrary to widespread belief, the statistical version of the entropy principle does not really stipulate that thermodynamic systems pass from improbable states to more probable states. States do not evolve into other states just because there are more of the latter or because they make up a set of a larger measure (Uffink 2007, 979–80). Oddly enough, this would be incompatible with the deterministic laws on which thermodynamics is based. As stressed by Leeds (2003, 128), speaking of “the probability of our ice melting” is “a little misleading.” What is extremely probable is that a piece of ice is in one of the microstates that guarantee deterministically that it will melt. Underlying this reasoning is the fundamental presupposition of SM that “systems can legitimately be described by an underlying causal picture of the world” (Sklar 1993, 148). In effect, BSM posits that a system's behavior is determined by or “supervenes” on this causal structure (Frigg 2010a, 93).

This is an important assumption in the BSM's explanatory strategy. The following three subsections are devoted to presenting and justifying the elements of this strategy.

4.1. Probabilistic Explanation

The BSM account is supposed to explain (i.e., to offer an explanans for) a particular explanandum, namely, isolated macroscopic systems that have nondecreasing entropy (A). Obviously, the success of this account will be judged in accordance with what one sees fit as an explanation, and there can be numerous, more or less stringent models of explanation. But if the satisfaction of some formal conditions and presuppositions is necessary for any probabilistic explanation, then it is only fitting that the explanans of an account such as BSM's should abide by these conditions. Alternatively, if there are no such conditions, then scientific explanatory strategies must be taken at face value. In that case, counterexamples such as Davey's would be open to criticism for not abiding by the specific explanatory strategy that they target. The demonstration of a discrepancy between what BSM actually does and an alleged counterexample would then suffice to vindicate the former. So, let us suppose that such formal conditions exist.

The overall picture of a probabilistic explanation looks like this (Sklar 1973): let S be a statistical generalization (and possibly a probabilistic law). If the conditions $C_i$ it describes are met, then, with probability p, the event described by the explanandum will occur. These conditions refer to the background of S and include information about laws or causal relationships or histories. Although the details are still debated, it is generally agreed that p must be generated by S and that it must be relatively high (without the imposition of any precise value). More importantly, it is generally accepted that the ‘expectation’ of the explanandum from the explanans {S, $C_i$, p} has explanatory power. Conversely, the choice of a reference class and the conditions forming {S, $C_i$, p} are supposed to support this expectation in order for this probabilistic strategy to be explanatory. Yet this does not eliminate the possibility of a stronger or better explanation.

The choice and justification of the appropriate reference class is notoriously difficult, but the previous distinction regarding probability procedures helps to clarify the implications of those procedures and to justify a choice. For instance, a die purportedly has a 1/6 probability of yielding a one as per the diachronic probability, but that does not mean that one-sixth of the class of similar dice in the world at this very moment is in the state of yielding a one according to the (intralevel) synchronic probability. In practice, there are some presuppositions that determine the conditions $C_i$ (on which probabilistic explanations, regarding, e.g., the rolling of dice, are based), and these presuppositions constrain the choice of the relevant reference class. Furthermore, once the explanandum is defined (i.e., case A), as well as the conditions $C_i$ and the presuppositions of the explanatory strategy (i.e., the ‘causal structure’), a justification of the probabilistic strategy can be obtained.

4.2. The Entropy Principle's Explanans

If the explanandum of the BSM account is (as mentioned) an isolated thermodynamic system (A), then there is no need to demand that BSM make any sort of claims about the conditions that bring about other systems (B).[11] Now, the explanans of the BSM account is based on the presupposition that there is, underlying the explanandum, a causal picture of the world. The $C_i$ of the explanans should therefore refer to this underlying causal picture, that is, to the phase space of individual systems left isolated. Similarly, S should refer to this phase space in order to express the idea behind the probabilistic explanans (as discussed above), whereby it is extremely probable (i.e., very high p) that a macroscopic system (e.g., a piece of ice) is in one of the microstates that deterministically guarantee that it will later instantiate a high-entropy state (e.g., by melting).

Goldstein (2001, 43) expresses this idea, which he labels ‘typicality’: given a nonequilibrium phase point x of energy E, the Hamiltonian dynamics governing the motion $x_t$ arising from x would have to be ridiculously special to avoid carrying $x_t$ into $\Gamma_{eq}$ reasonably quickly and keeping it there an extremely long time, unless, of course, x itself were ridiculously special.[12] Similarly, Railton (1981) contends that almost all possible sets of initial conditions fall into the class leading to equilibrium, and almost none into the class leading to nonequilibrium. In short, the ratio $\mu(\Gamma_{eq})/\mu(\Gamma_{M^*})$, where $M^*$ is a standard nonequilibrium macrostate, is very large. Explanandum and explanans can thus constrain the choice of a reference class. The reference class that the BSM account adopts is the one containing all of the microstates (not some particular subregions) compatible with the individual system's energy. Although, at least for now, the uniform probability measure μ and correlatively the high-p clause are still in need of a (better) justification (see sec. 4.4), the BSM account as described so far can invalidate Davey's counterexamples (discussed next).
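
The size of this ratio can be conveyed with a back-of-the-envelope computation in the toy two-halves model introduced earlier (an assumed illustration, not Goldstein's or Railton's own example), working with logarithms of the binomial microstate counts to avoid overflow:

```python
import math

N = 1_000_000           # number of particles
k_eq = N // 2           # equilibrium macrostate: half the particles left
k_noneq = int(0.4 * N)  # a standard nonequilibrium macrostate M*

def log_count(n: int, k: int) -> float:
    """log C(n, k), a stand-in for log mu(Gamma_M) up to a constant."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

log10_ratio = (log_count(N, k_eq) - log_count(N, k_noneq)) / math.log(10)
print(f"mu(Gamma_eq)/mu(Gamma_M*) ~ 10^{log10_ratio:.0f}")
# Even for this modest N, the ratio dwarfs any physically meaningful number.
```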

4.3. Micro- and Macroreference Classes

The explanans presented above proposes to explain the entropic behavior of a macroscopic system left isolated, which defines a macroreference class by relying on its phase space, thus also defining a microreference class. And it proposes to do so with a probabilistic explanation in which it is extremely probable that such a system is in one of the microstates that deterministically guarantee that it will later instantiate a high-entropy state. Within this explanatory framework there is an obvious choice of probability ascription procedure: it is a case of interlevel synchronic probability where the outcomes ($a_i$) from the microreference class are determined by the macroreference class (B).

It is thus not a case of intralevel synchronic probability where, for example, many similar glasses at a given time have a certain entropy state. Such a probability refers to the macroscopic conditions, that is, the possible interventions on the macroscopic systems. In effect, the probability of being in a particular macrostate must refer to the conditions that affect the possible macrostates. It thus has nothing to do with BSM's explanandum (A). This is why we do not need to form, much less justify, any belief about which probability measure describes the system before being isolated. This is also why we do not need to ascribe any probability to a given macrostate at a particular time. In fact, this omission is part of both the explanandum and the conditions ($C_i$) of the explanans because any interaction with a system leads it to (very likely) be in a microstate that will evolve toward a high-entropy state if left isolated (this is discussed later). Consequently, a successful explanans does not require specificity about phase space: there is no need to ascribe a probability to any specific subregion of phase space or to form a “justified belief about the probability with which our system lies in any [specific] subregion of phase space” (Davey 2008, 34).[13] Therefore, the “Basic Claim” that “if a region of phase space is large, then it is probable for a physical system to find itself there” (30) does not need to be substantiated, as Davey's first purported counterexample tries to do (see sec. 2.2).

The important shift in Davey's argument in the second counterexample occurs when he ascribes a probability value based on the class of macroscopically similar systems (i.e., the glasses of water) and not on the phase space. The reference class for the determination of the entropy value and that for the determination of the probability value are different. In other words, Davey chooses intralevel instead of interlevel synchronic probability, and these do not coincide. What happens to other similar systems (i.e., intralevel synchronic probability) is thus irrelevant to the microphysical evolution of an isolated system, that is, the explanandum A. Therefore, the reference class of glasses under government fiat cannot possibly determine the probability that an individual system is in a certain microstate and cannot support the expectation of the explanandum, which invalidates the second counterexample.

Furthermore, it is also not a case of diachronic probability where, for example, a given glass once left isolated will evolve toward either a high- or a low-entropy state or any other state for that matter. Should thermodynamic systems be described by such a probability, they would be considered as indeterministic as quantum systems, contrary to one fundamental presupposition of SM. Another application of diachronic probability is to be found in the objection (e.g., Frigg 2010b) that ‘typicality’ is impotent on the basis that nonequilibrium states do not evolve into equilibrium states simply because there are overwhelmingly more of the latter. It is rather that there are overwhelmingly more nonequilibrium states (or phase space points) that evolve into equilibrium states (or phase space points) than nonequilibrium states (or phase space points) that do not.
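
This last point can be illustrated with a deterministic, time-reversible toy dynamics; the Kac ring model below is brought in only as an illustration and is not discussed in the article. Almost every microstate of the nonequilibrium ‘all spins up’ macrostate evolves toward the equal-mix equilibrium macrostate; the rare exceptions are ridiculously special.

```python
import random

N = 10_000
# Fixed, randomly placed 'markers' on the ring's edges; once the markers
# are chosen, the dynamics below is deterministic and reversible.
markers = [random.random() < 0.1 for _ in range(N)]
spins = [1] * N  # nonequilibrium initial macrostate: all spins +1

def step(state: list) -> list:
    # Each spin moves one site clockwise, flipping when crossing a marker.
    return [-state[i - 1] if markers[i] else state[i - 1] for i in range(N)]

for _ in range(200):
    spins = step(spins)
print(sum(spins) / N)  # near 0: the equal-mix equilibrium macrostate
```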

Suppose now that a die is in a macrostate defined as showing one uppermost. There is no way that we can infer its intralevel synchronic probability based on the reference class of all similar dice in the world at any particular moment in time. Thus, intralevel synchronic probability is not the right reading underlying the use of ‘probable’ in Davey's Basic Claim. If the conditions of this event include very specific interventions on all the dice in the world to obtain (or hinder) the occurrence of showing one uppermost, then all these interventions would have to be known. Similarly, if, for a given macrostate, these conditions include a very specific dynamic intervention (e.g., knocking the die in a specific place, with such and such friction), then the diachronic probability would be quite different from the usual 1/6 attributed to every face and would not be explanatory. Yet the BSM account is agnostic with regard to the specific macrostate a system can be prepared in, the specific dynamics involved, and how long the system is allowed to stay isolated. These specific features of thermodynamic systems are worth investigating, but they are not necessary for the BSM's explanatory strategy.

4.4. Justifying a Probability Measure

What then justifies the choice of the standard uniform or Lebesgue measure as the probability measure on phase space in the BSM account? A first, preliminary and general answer is that there is no good answer, for justification in a strong sense would amount to solving the probability ascription problem and correspondingly the problem of induction (sec. 3). As a more specific answer, I propose two arguments, which may be labeled ‘generic’ and ‘particular’, to justify the proper, uniform probability measure for the BSM account.

The generic argument relies on the flexibility of the BSM account in building a probability measure that can support the expectation criterion in the explanatory strategy with the combination {S, $C_i$, p}. In effect, it does not have to be the uniform measure (i.e., the microcanonical or Lebesgue measure) specifically, for all that is needed is that p supports the expectation within the explanatory strategy according to the interlevel synchronic probability. It remains true that the only way to determine the ‘real’ probability measure would require very specific, microscopic details about the systems in question, but this flexibility is a compelling reason to leave aside the criticisms targeting the principle of indifference to the effect that it fails to deliver a unique probability measure. In fact, uncertainty about the microscopic details is not an obstacle to, but rather an essential element of, the probabilistic explanatory strategy. The successes of BSM are due in large part to its abstraction from the details of the systems it purports to describe (Khinchin 1949). This flexibility can be seen as a case of universal or asymptotic explanation in which the microscopic details are set aside (see Batterman 2002). Therefore, this flexibility plays a role in the explanans, which supports the uniform probability measure.

The particular argument relies on the typicality of the BSM account, with typicality taken in the common sense, as something happening in the vast majority of cases and also as the opposite of special cases. Of course, typicality comes with a vast literature, filled with both criticisms (e.g., Frigg 2010b) and justifications (e.g., Werndl 2013). My point here is this: since in experiments one has the freedom to prepare a system in all kinds of different initial microstates on the energy hypersurface, no region of phase space should be assigned a higher probability than any other. Of course, this claim could be justified on purely inductive grounds, because any preparation puts a system into a microstate that will evolve to equilibrium, but the argument is stronger if the justification is situated within our explanatory framework. In fact, the usage of the principle of indifference and equiprobability is restricted here (see Castell 1998) because it is applied to a set of systems with strong similarities (i.e., microstates in the canonical phase space). Moreover, the consequence of a very special, nonuniform probability measure could be a system that will (more) likely be in a microstate evolving toward a special condition, for example, gas particles concentrated in the corner of a container, thereby violating the second law of thermodynamics. Therefore, a nonuniform probability measure does not support the expectation of the instances we observe or, consequently, the explanans of the BSM account.
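
A final sketch contrasts the two kinds of measure on the toy two-halves model (again my own assumed setup, not the article's): under the uniform measure the expectation of near-equilibrium behavior is supported, whereas a measure concentrated on a special ‘corner’ microstate supports the opposite expectation.

```python
import math

N = 1_000

def log_p_uniform(k: int) -> float:
    """Uniform measure over the 2^N microstates: P(k left) = C(N, k)/2^N."""
    return (math.lgamma(N + 1) - math.lgamma(k + 1)
            - math.lgamma(N - k + 1) - N * math.log(2))

# Probability that the macrostate lies within 5% of equilibrium:
p_near_eq = sum(math.exp(log_p_uniform(k))
                for k in range(int(0.45 * N), int(0.55 * N) + 1))
print(f"uniform measure: P(near equilibrium) = {p_near_eq:.4f}")  # ~0.998

# A measure concentrated on the 'all particles left' microstate instead
# assigns probability 1 to an extreme nonequilibrium macrostate and 0 to
# equilibrium, so it cannot support the expectation of observed behavior.
```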

At this point, if the explanandum, the explanans, and the relevant micro- and macroreference classes are indeed justified but a justification based on inductive grounds or the principle of indifference is ruled out, then it seems impossible to ascribe in a justified manner a probability measure in any probabilistic explanation whatsoever. The common example of throwing dice would then be impossible to explain. The principle of indifference is still objectionable, but its flexible and restricted usage within a justified explanatory framework (as presented here) remains conservative. At least, this framework provides the reasons why (i.e., it explains in the sense provided above) it is likely for A to occur.

Having said all that, the entropy of the system is also overwhelmingly likely to be increasing toward the past. Again, a vast literature has tackled this problem (see, e.g., Albert 2000). Here, I will say only this: the point is not to deny the very existence of the system in the past but rather to note that the relevant period of time of the explanandum A starts precisely with an intervention providing, or preparing, a non-maximal-entropy macrostate, because before that the system was not isolated; hence, that earlier time is irrelevant as far as the BSM account is concerned. After all, the BSM's entropy principle is not necessarily bound to explain all kinds of asymmetry or “everything that is supposedly deteriorating or getting worse” (Denbigh 1989, 323). Many problems with SM remain to be solved, even with a probability measure, but if the above arguments are valid, a justification of the probabilistic explanation (although not a proof of the entropy principle) is available.

4.5. Proviso: Trivialization of the Entropy Principle

A probabilistic explanation of thermodynamic entropy has been proposed, but a proof of the entropy principle as a deduction from dynamic premises alone is still remote. Moreover, it can be quite unsatisfactory to support such a proof with very special initial conditions, such as the supposed low-entropy initial state of the universe. Here is a rendition of what some might be tempted to do in explaining the time asymmetry of entropy from a kind of information asymmetry between the past and the future: from a known initial state we generally infer that the evolution of certain systems (e.g., dice) can lead to any outcomes in a predefined set; that is, we deem ourselves ignorant of which specific state will occur from among these possible outcomes. Given this asymmetry, if we define a phase space function such as statistical entropy, it can only increase because its independent variable goes from a low to a high value; that is, it evolves from a small phase space region (the known specific state) to a larger one (of all possible outcomes). This is what can be called the ‘trivialization of the statistical entropy increase’. But it can hardly be deemed a proof of the entropy principle. Perhaps future work in the field will devise a full-fledged satisfactory mathematical explanation based on something like ergodicity or typicality, or else answers to the ‘very hard questions’ Goldstein (2001) alluded to. But even without the latter, or without what can be deemed a proof or even a strong metaphysical explanation of the entropy principle, a sound probabilistic, universal explanation of it, as seen above, is available.

5. Conclusion

Skepticism will remain plausible in matters of probability ascription and scientific explanation because more stringent requirements remain possible in many scientific endeavors. Nonetheless, there is a strong, explanatory connection between entropy and probability. This connection is based on fundamental assumptions (such as the one about the underlying causal structure that determines macrostates) and on general explanatory conditions, along with a disambiguation of the notion of probability. These explanatory features circumscribe the choice of macro- and microreference classes as well as of a flexible, albeit restricted, usage of equiprobability.

Footnotes

1. A system can be composed of a smaller system and some part of the latter's environment and can be considered as isolated. Two bodies of different temperature that come into direct thermal contact involve a spontaneous flow of heat and an entropy increase. And the system composed of a cooler body and a hotter one can be considered isolated but with increasing entropy. We can thus say that a spontaneous change (i.e., one that is not driven by doing work) in energy entails an entropy increase. The asymmetry alluded to above captures a distinction between ‘natural/spontaneous’ change and ‘forced/artificial’ change, and thus it accords with the intuition that cups of coffee cool by themselves while refrigerators need to be plugged in in order to work. It is also behind the clauses “by itself” and “without an accompanying change elsewhere” in Rudolf Clausius's (1854, 138) versions of the second law of thermodynamics.

2. This principle is variously referred to as the ‘Boltzmannian version’ of the second law of thermodynamics (Callender 1999), ‘Boltzmann's law’ (Frigg 2010a), or the ‘Principle of probable equilibration’ (Brown, Myrvold, and Uffink 2009).

3. There is a similar argument in Albert (2000).

4. As Ismael (2009, 105) puts it, “Looking at frequencies will provide evidence for probabilities, but as a logical matter they do not determine the probabilistic facts.”

5. Here, I insist on the objective-subjective distinction, which focuses on the epistemic tools that we have: ‘objective’ then reflects what has empirical support or what is mathematically deduced, and ‘subjective’ refers to the arbitrary components of our inferences. Of course, many discussions of probability insist instead on the ontological-epistemic distinction, where it is supposed that ontological claims can be evaluated independently of our epistemic tools.

6. Leeds (2003) distinguishes “probability of becoming” and “probability of being.” Davey (2011) proposes another distinction, between “instantaneous probability,” when the probability is “specific to a particular time,” and “noninstantaneous probability” otherwise. Both can be related to the diachronic probability, but they do not intervene in the arguments here.

7. Handfield (2012) calls ‘actualism’ the doctrine stipulating that chances are reducible to evidence, to facts about real events, to ‘this-worldly’ phenomena.

8. As acknowledged by Strevens (2014, 42), in SM “probability is introduced into an explanation not because it is either metaphysically or epistemologically unavoidable, but because it enhances the explanation, providing more insight than the deterministic alternative.”

9. The determination of the possible outcomes of an experiment does not necessarily follow a diachronic procedure, even though the determination of the probability of these outcomes does. For example, the simple observation of a die, alongside symmetry arguments, can suffice. Also, some experiments have infinite possible outcomes, e.g., ‘recording the distance from the bull's eye to the point where a dart, aiming at the bull's eye, actually hits the plane’, where a diachronic procedure to determine all possible outcomes would take infinite attempts.

10. This operator can be implicit, just as with the die example, in which it does not determine the sequence of states between the throw and the result. In quantum mechanics, by contrast, the Schrödinger operator is explicit, but there is the famous breakdown between the deterministic temporal evolution and the probability measure. This formal definition of probability ascription could be reformulated in the quantum mechanics formalism: $T_t$ would of course be the Schrödinger operator, n would be the dimension of the quantum phase space, $E_i$ would be the state vector, and the $a_i$ the projections of this state vector.

11. Shech (2013, 603) calls these systems “problematic systems.”

12. Many Boltzmannians such as Goldstein would deny a certain role played by probability, namely, that entropy should be a function of probability density (Gibbs) instead of a function of phase space. But such Boltzmannians will nonetheless accept a probabilistic explanation based on the ‘typicality’ of the initial conditions (e.g., Goldstein 2001, 53). This is discussed below.

13. Of course, the endorsement of a specific uniform probability measure (e.g., Lebesgue's) would lead to the ascription of a specific probability value to arbitrary subregions of that space. But such a specific value is not necessary for our probabilistic explanatory scheme to work, since equivalent, uniform probability measures could work. The question whether a uniform probability measure could ascribe a high probability to nonequilibrium microstates (and thus contradict this scheme) will have to await future work.

References

Albert, David Z. 2000. Time and Chance. Cambridge, MA: Harvard University Press.
Batterman, Robert W. 2002. The Devil in the Details: Asymptotic Reasoning in Explanation, Reduction, and Emergence. Oxford: Oxford University Press.
Brown, Harvey R., Myrvold, Wayne, and Uffink, Jos. 2009. “Boltzmann's H-Theorem, Its Discontents, and the Birth of Statistical Mechanics.” Studies in History and Philosophy of Modern Physics 40:174–91.
Callender, Craig. 1999. “Reducing Thermodynamics to Statistical Mechanics: The Case of Entropy.” Journal of Philosophy 96:348–73.
Capek, Vladislav, and Sheehan, Daniel P. 2005. Challenges to the Second Law of Thermodynamics: Theory and Experiment. Dordrecht: Springer.
Carnap, Rudolf. 1966. Philosophical Foundations of Physics. New York: Basic.
Carnap, Rudolf. 1977. Two Essays on Entropy. Berkeley: University of California Press.
Castell, Paul. 1998. “A Consistent Restriction of the Principle of Indifference.” British Journal for the Philosophy of Science 49:387–95.
Clausius, R. J. E. 1854. “On Another Form of the Second Principle of the Mechanical Theory of Heat.” Annalen der Physik und Chemie 93:481–506. Trans. in Philosophical Magazine, ser. 4, 12, no. 81 (1856).
Davey, Kevin. 2008. “The Justification of Probability Measures in Statistical Mechanics.” Philosophy of Science 75:28–44.
Davey, Kevin. 2011. “Thermodynamic Entropy and Its Relation to Probability in Classical Mechanics.” Philosophy of Science 78:955–75.
Denbigh, Kenneth G. 1989. “Note on Entropy, Disorder and Disorganization.” British Journal for the Philosophy of Science 40:323–32.
Frigg, Roman. 2010a. “Probability in Boltzmannian Statistical Mechanics.” In Time, Chance and Reduction, ed. Ernst, G. and Hütteman, A., 92–118. Cambridge: Cambridge University Press.
Frigg, Roman. 2010b. “Why Typicality Does Not Explain the Approach to Equilibrium.” In Probabilities, Causes and Propensities in Physics, ed. Suárez, Mauricio, 77–93. Dordrecht: Springer.
Frigg, Roman, and Werndl, Charlotte. 2011. “Entropy: A Guide for the Perplexed.” In Probability in Physics, ed. Beisbart, C. and Hartmann, S., 115–42. Oxford: Oxford University Press.
Goldstein, Sheldon. 2001. “Boltzmann's Approach to Statistical Mechanics.” In Chance in Physics: Foundations and Perspectives, ed. Bricmont, J., et al., 39–54. Lecture Notes in Physics 574. Berlin: Springer.
Goldstein, Sheldon, and Lebowitz, Joel L. 2004. “On the (Boltzmann) Entropy of Nonequilibrium Systems.” Physica D 193:53–66.
Hájek, Alan, and Hall, Ned. 2002. “Induction and Probability.” In The Blackwell Guide to the Philosophy of Science, ed. Machamer, P. and Silberstein, M., 149–72. Oxford: Blackwell.
Handfield, Toby. 2012. A Philosophical Guide to Chance. New York: Cambridge University Press.
Ismael, Jenann T. 2009. “Probability in Deterministic Physics.” Journal of Philosophy 106:89–108.
Khinchin, Aleksandr I. 1949. Mathematical Foundations of Statistical Mechanics. New York: Dover.
Landau, Lev D., and Lifshitz, Evgueni M. 1980. Statistical Physics. 3rd ed. New York: Pergamon.
Leeds, S. 2003. “Foundations of Statistical Mechanics: Two Approaches.” Philosophy of Science 70:126–44.
Lieb, Elliott H., and Yngvason, Jakob. 2003. “The Mathematical Structure of the Second Law of Thermodynamics.” arXiv preprint. https://arxiv.org/abs/math-ph/0204007.
Penrose, Oliver. 1979. “Foundations of Statistical Mechanics.” Reports on Progress in Physics 42:1939–97.
Railton, Peter. 1981. “Probability, Explanation, and Information.” Synthese 48:233–56.
Shech, Elay. 2013. “On Gases in Boxes: A Reply to Davey on the Justification of the Probability Measure in Boltzmannian Statistical Mechanics.” Philosophy of Science 80:593–605.
Sklar, Lawrence. 1973. “Statistical Explanation and Ergodic Theory.” Philosophy of Science 40:194–212.
Sklar, Lawrence. 1993. Physics and Chance. Cambridge: Cambridge University Press.
Strevens, Michael. 2014. “Probabilistic Explanation.” In Physical Theory: Method and Interpretation, ed. Sklar, L., 40–62. Oxford: Oxford University Press.
Uffink, Jos. 2007. “Compendium of the Foundations of Classical Statistical Physics.” In Philosophy of Physics, pt. B, ed. Butterfield, J. and Earman, J., 923–1074. Amsterdam: Elsevier.
Werndl, Charlotte. 2013. “Justifying Typicality Measures of Boltzmannian Statistical Mechanics and Dynamical Systems.” Studies in History and Philosophy of Modern Physics 44:470–79.