1. Introduction: Chance, Selection, and Drift
On the face of it, evolutionary explanations and predictions are probabilistic. Natural selection and genetic drift can predict and explain the dynamics of populations: how genetic frequencies, genotypic frequencies, or phenotypic frequencies change over time. Given an initial distribution of genes, genotypes, or phenotypes, with realistic parameter values we are at best able to project a probability distribution of the relevant states from one time to another. This is associated with what Sewall Wright (1931, 1932) originally called “genetic drift,” the effect of chance on gene frequencies across generations (cf. Roughgarden 1979; Beatty 1984, 1990; Sober 1984; Falconer 1989; Hodge 1990; Skipper 2006, in this issue). Drift is a “matter of chance,” as John Beatty said, “in a sense in which evolution by natural selection is not” (1990, 273). Genetic drift is sometimes described as the “error” in the transmission of types from generation to generation, arising from finite population size. There is sampling error of this sort when the frequency of genes, genotypes, or phenotypes is different in the offspring generation from the frequency in the parental generation, and that difference is not due to selection, mutation, or migration. In the absence of selection, drift can explain the pattern of fixation among populations. It can explain changes in the genetic, genotypic, or phenotypic frequencies within individual populations. In the presence of selection, drift can explain the deviations from the pattern that would be predicted by natural selection alone and can contribute to explanations of outcomes in particular cases (see Wright 1932, 1945; Kerr and Wright 1954; Kimura 1964; Lande 1976).
This much should be uncontroversial. Chance has a genuine role in evolutionary explanations, though that does not mean that evolutionary change is random or fundamentally indeterministic. Embracing a role for chance also does not mean embracing a neutralist position, according to which much or most evolutionary change is selectively neutral (see Dietrich 2006, in this issue). Neutralism incorporates an important and obvious role for chance, but chance plays an evolutionary role even in the absence of neutralism. One need not be a neutralist to embrace drift or chance, though neutralists are committed to a fundamental role for chance. Selection and drift can both influence the evolution of a population and the evolution of a single phenotypic trait (cf. Beatty 1984; Millstein 2002). What is controversial (within philosophical circles, though generally not in biological ones) is whether evolutionary explanations are fundamentally probabilistic, that is, whether the probabilities are more than epistemic, and whether the underlying process is fundamentally deterministic (cf. Rosenberg 1988, 1994; Horan 1994; Brandon and Carson 1996).
2. Drift in Theoretical Ensembles
Genetic drift is routinely modeled mathematically in terms of ensembles of finite populations. The mathematical ideal is an infinite ensemble. Effectively, we ask what would happen if we had the same evolutionary problem repeated. If we know the exact initial condition for some population and we know that some descendant group is drawn from it, we ask what is likely were the descendants drawn from the original group by chance. Given a chance setup, we can predict the likelihood of a given outcome from chance alone. This is modeled in terms of a draw from a metapopulation. The likelihood of a given outcome is the probability of that specific draw from an effectively infinite metapopulation. The abstract problem can be instantiated biologically. Given a single gene with two alleles at frequencies $p_{i}$ and $p_{j}$, in the absence of selection, ensembles of populations initially polymorphic at that locus will tend to disperse across a wide range, from populations fixed for one allele to populations fixed for the alternative allele (see Kimura 1964; Roughgarden 1979; Falconer 1989; Richardson 2001). These changes are random within each subpopulation if there is no selection, immigration, emigration, or mutation operating; as a result, different subpopulations become differentiated. Additionally, there will be no expected change in the allelic frequencies in the overall population. This differentiation among subpopulations is a chance affair. Analogously, with a fair coin, we expect a 50-50 division between heads and tails over a suitably long run. With a run of 1,000 flips, we might approximate the expected result. However, if we break the sequence into a series of 10-flip trials, we expect to see a great number of sequences that deviate from the 50-50 expectation. These will form an approximately normal distribution around the expected value. Most, in fact, should deviate from the expectation.
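The coin-flip analogy can be checked by direct calculation. What follows is a minimal sketch, not part of the original argument, that computes the exact binomial probabilities for a trial of 10 flips of a fair coin. The function name `prob_heads` is introduced here only for illustration.

```python
from math import comb

def prob_heads(n_flips, k_heads, p=0.5):
    """Binomial probability of exactly k_heads heads in n_flips flips."""
    return comb(n_flips, k_heads) * p**k_heads * (1 - p)**(n_flips - k_heads)

# Full distribution of the number of heads in one 10-flip trial.
dist = [prob_heads(10, k) for k in range(11)]

p_exact = dist[5]        # probability of an exact 5-5 split
p_deviate = 1 - p_exact  # probability of any deviation from 5-5

print(f"P(exactly 5 heads in 10 flips) = {p_exact:.4f}")
print(f"P(some deviation from 5-5)     = {p_deviate:.4f}")
```

Because 252 of the 1,024 equally likely sequences contain exactly five heads, roughly three quarters of such short trials depart from the expected split, even though the expectation itself is unchanged.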
In the evolutionary case, since the extremes in which subpopulations become fixed for one allele or another are “absorbing” states (in the absence of mutation and immigration), as the ensemble disperses over the space, each population will eventually become monomorphic. Once monomorphic, they remain fixed. In the theoretical limit, drift predicts that the ensemble of populations will bifurcate into a bimodal distribution with every population fixed for one allele at one of the two extremes. Short of that limit, the ensemble of populations will disperse across the range, with a rate depending on the size of the populations.
Chance has a role in explaining both the general pattern and the individual case. So given a set of populations, drift can explain the pattern of fixation; it can also properly explain the fixation of specific alleles within individual populations. Given an initial degree of polymorphism, we have a precise prediction of the pattern we should expect: the overall frequency of the genes in the metapopulation should remain unchanged, while the (sub)populations become increasingly differentiated from one another; in the limit, all the populations become monomorphic; short of the limit we expect a distribution of increasingly monomorphic populations that reflects the initial genetic distribution. The frequencies of populations fixed for the alternative alleles, in the limit, should be the same as the initial frequencies of the alleles in the metapopulation. So if we begin with $p_{i}$ and $p_{j}$ as the frequencies of the alternative alleles, in the absence of selection and mutation those frequencies should remain unchanged in the metapopulation, even though the subpopulations become genetically uniform. Moreover, the frequency of populations fixed for the $i$ and $j$ alleles should, respectively, be $p_{i}$ and $p_{j}$. Finally, the rate at which populations disperse depends crucially on the sizes of those populations: smaller population sizes will yield more rapid diffusion across the space of distributions.
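These ensemble predictions can be illustrated with a small simulation. The sketch below assumes a Wright-Fisher style of binomial sampling, a standard idealization that the text does not itself name, and uses hypothetical population sizes and a hypothetical starting frequency of 0.3. The helper names `generations_to_fixation` and `run_ensemble` are introduced for illustration only.

```python
import random

def generations_to_fixation(n_genes, p0, rng):
    """Follow one neutral population until one allele is fixed.

    n_genes: number of gene copies in the population (hypothetical value).
    p0:      initial frequency of allele i.
    Returns (True if allele i was fixed, number of generations elapsed).
    """
    count = round(p0 * n_genes)
    generation = 0
    while 0 < count < n_genes:
        p = count / n_genes
        # Each gene copy in the next generation is an independent draw
        # from the current one: binomial sampling error and nothing else.
        count = sum(1 for _ in range(n_genes) if rng.random() < p)
        generation += 1
    return count == n_genes, generation

def run_ensemble(n_genes, p0, n_populations=1000, seed=1):
    """Return (fraction of populations fixed for allele i, mean fixation time)."""
    rng = random.Random(seed)
    results = [generations_to_fixation(n_genes, p0, rng)
               for _ in range(n_populations)]
    fixed_i = sum(1 for fixed, _ in results if fixed) / n_populations
    mean_time = sum(t for _, t in results) / n_populations
    return fixed_i, mean_time

small = run_ensemble(n_genes=20, p0=0.3)
large = run_ensemble(n_genes=60, p0=0.3)
print("small populations: fixed for i = %.3f, mean generations = %.1f" % small)
print("large populations: fixed for i = %.3f, mean generations = %.1f" % large)
```

Under these assumptions the fraction of populations fixed for allele $i$ should sit close to the initial frequency in both ensembles, while the smaller populations should reach fixation in fewer generations. No individual trajectory is predicted; only the ensemble pattern is.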
The neutral case is an ideal, assuming that there are no other evolutionary influences. Of course, there typically are other factors. With other factors affecting the evolution of phenotypic, genotypic, or genetic frequencies, the predictions must change accordingly. If we focus on selection alone, in the absence of drift, migration, and mutation, we have a deterministic process: given a frequency distribution at one time and fixed selection coefficients, models of selection alone predict a unique distribution of frequencies in the next generation. Mathematically, this is modeled using infinite population sizes in which sampling error, and therefore drift, could not occur. In an infinite population, these changes would be a deterministic function of fitness. This component represents the result of selection. Under directional selection in a population short of equilibrium, the result is an increase in the frequency of more fit individuals and in average fitness. More intuitively, this is the expected fitness. With finite populations, drift then can be thought of as exploring adaptive zones. In finite populations subject to selection, drift captures the extent to which actual changes in genetic, genotypic, or phenotypic frequencies tend to be uncorrelated with fitness differences; the variation around the norm defined by selection is random with respect to fitness. Fitness values then determine the strength and location of the central tendency within an ensemble of populations, and drift becomes the amount of dispersal around the mean value. The extent of the variation, the scatter around the norm, again is a function of population size (cf. Wright 1931, 1932; Lande 1976; Beatty 1984; Sober 1984; Hodge 1990; Richardson and Burian 1992).
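The contrast between the deterministic norm and the scatter around it can be sketched numerically. The example below assumes a simple two-type haploid model with hypothetical fitnesses of 1.1 and 1.0, chosen only for illustration and not drawn from the text. Selection alone gives a unique next-generation frequency, and finite populations add binomial sampling around it. All function names are illustrative.

```python
import random
from statistics import mean, pstdev

def expected_after_selection(p, w_i=1.1, w_j=1.0):
    """Frequency of type i after one generation of selection alone.

    This is the infinite-population (deterministic) change; the
    fitness values w_i and w_j are hypothetical.
    """
    return p * w_i / (p * w_i + (1 - p) * w_j)

def one_finite_generation(p, n_genes, rng):
    """Selection followed by random sampling of n_genes gene copies."""
    p_sel = expected_after_selection(p)
    count = sum(1 for _ in range(n_genes) if rng.random() < p_sel)
    return count / n_genes

def ensemble_after(generations, n_genes, n_populations=500, p0=0.5, seed=2):
    """Mean and standard deviation of final frequencies across an ensemble."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_populations):
        p = p0
        for _ in range(generations):
            p = one_finite_generation(p, n_genes, rng)
        finals.append(p)
    return mean(finals), pstdev(finals)

# Deterministic trajectory under selection alone.
p_det = 0.5
for _ in range(5):
    p_det = expected_after_selection(p_det)

mean_small, sd_small = ensemble_after(5, n_genes=50)
mean_large, sd_large = ensemble_after(5, n_genes=500)
print(f"selection alone:       {p_det:.3f}")
print(f"ensemble, 50 copies:   mean {mean_small:.3f}, sd {sd_small:.3f}")
print(f"ensemble, 500 copies:  mean {mean_large:.3f}, sd {sd_large:.3f}")
```

In this sketch the ensemble means should track the deterministic value, which is the central tendency set by fitness, while the standard deviation, the dispersal attributed to drift, should be larger for the smaller populations.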
In a wonderful essay entitled “Dobzhansky and Drift” (1990), Beatty explored the attitude of Theodosius Dobzhansky toward the role of drift in evolution. As he says, when some evolutionary change is due to drift, it is a “matter of chance” whether the frequency of a trait increases or decreases. He explains both that Dobzhansky changed his attitude toward the significance of chance and why he did so. Dobzhansky saw that if natural selection were omnipotent, it should deplete natural variation in populations. He was convinced well before he moved to the United States that there was substantial variation in natural populations and that that was an anomaly to be dealt with. The problem was how intraspecific variation was maintained in the face of selection. In the first edition of Genetics and the Origin of Species (1937), Dobzhansky came to recognize the importance of population structure. Following Wright, he saw that if a species is divided into a mosaic of relatively isolated populations, then those populations would be expected to become differentiated from one another genetically. He thought at the time that this could happen by chance. If local breeding populations are reasonably small, then not only could this happen by chance, it should. Wright thought that this would explain the nonadaptive differences we see among populations and species. He also thought that this would explain differences in the adaptive potential of species: though each population would become more uniform with time, the species would have increased variance; and with increased variance came the potential for adaptation. For various reasons, Dobzhansky departed from drift in favor of selectionist explanations of how variation is maintained, finally settling on heterozygote superiority as an explanation of the pattern. The reasons for the shift are the topic of Beatty's essay. I cannot improve on his discussion.
The simple point I want to draw from Beatty's treatment is the recognition on Dobzhansky's part that chance could lead to differentiation, and the role that can play in evolutionary theory. Dobzhansky recognized that selection without variation is certainly empty. It is also true that Dobzhansky's concern with maintaining the potential for adaptation is not especially well served by drift, though this point is not Beatty's. Drift is more efficient when the alternatives are neutral. I'll turn to a study of neutral variations in the next section, with blood group variation in humans. The key conclusion I want to note is that drift won't explain how we maintain adaptive potential. It can explain how some level of variation is maintained, but that variation is no more likely to be adaptive than not. Variation without selection is blind, which is nearly what Darwin thought (Beatty 2006, in this issue).
Explanations in terms of drift are consistent with any number of more specific historical scenarios. We may assume that in each case, there is some cause that determines the particular outcome. Even if there is a deterministic explanation available in each case, there still may be reasons to prefer the more abstract probabilistic explanation that ignores the specific causes. Elliot Sober observed that there is a gain in appealing to chance:
The choice between deterministic and stochastic modeling involves a certain trade-off. The former brings with it enhanced predictive power; the latter provides greater scope for generality. By allowing a role for chance, we concede that physical systems may be in the same state at one time, even though they may differ in a subsequent state. By specifying the causal factors so completely that chance is eliminated from the model, we end up describing the system in such a way that it would be remarkable if many other systems proceeded in the same way. (1984, 128–129)
If Sober is right, then there is a theoretically significant reason for appealing to deterministic explanations. We gain increased predictive power. If he is right, there is also a theoretically significant reason for relying on probabilistic explanations. We gain greater generality. The point I want to draw from this is not quite Sober's but is consonant with it: that knowing the particulars will not dispense with statistical models in understanding the patterns we observe. If we knew all the influences in the specific cases, we would be able to explain each instance. We could explain the trajectory of each population. However, this is only half of the problem. The individual cases are inadequate for comprehending the overall pattern. Indeed, they impede our understanding of the overall pattern. We could explain the individual cases, but the pattern would be lost. We could, for example, trace each member of an ensemble through its myriad changes, explaining why, say, it came to have a prevalence of one allele over another; but we would have no explanation for the pattern exhibited by the ensemble.
3. The Parma Valley Study
We can see this explanatory strategy at work in actual cases, and not just in theoretical ones. The classic studies of blood group distributions in populations in the Parma Valley of Italy were conducted by Luigi Luca Cavalli-Sforza and his collaborators, including Franco Conterio, Antonio Moroni, Italo Barrai, and Gianna Zei (Barrai, Cavalli-Sforza, and Moroni 1962; Cavalli-Sforza 1966, 1969; Cavalli-Sforza, Kimura, and Barrai 1966; Cavalli-Sforza and Zei 1967). They exhibit the pattern predicted and explained by drift, as described in Section 2, given the assumption that blood types are (more or less) neutral. Cavalli-Sforza and Moroni initiated this research in the early 1950s and have only very recently published a definitive treatment of the case (Cavalli-Sforza, Moroni, and Zei 2004). Cavalli-Sforza and Moroni saw that there was an opportunity to compare the effects of drift in a set of human populations with independently established assessments of inbreeding in villages in the Parma Valley (see Table 1).
Region | Allele A (%) | Allele B (%) | Allele O (%) |
---|---|---|---|
Europe | 27 | 8 | 65 |
English | 25 | 8 | 67 |
Italians | 25 | 8 | 67 |
Basques | 23 | 2 | 75 |
East Asia | 20 | 19 | 61 |
Africa | 18 | 13 | 69 |
American natives | 1.7 | 0.3 | 98 |
Australian natives | 22 | 2 | 76 |
Note. Under the assumption that blood types are selectively neutral, the differences from region to region would be a function of drift. Notice, in particular, the virtual absence of A or B among native Americans and the reduced frequency of B among Basques and native Australians. These are plausibly a founder effect, a form of drift. The data are drawn from Cavalli-Sforza, Moroni, and Zei (2004).
This is a classic study, not only in that it was important historically as a demonstration of the importance of drift in natural populations, but also in that it focuses on classical (protein) polymorphisms. At the time, the significance of drift was uncertain. Their work showed that it is significant in establishing differences among human populations. The study was originally a case of drift among phenotypic characters, namely blood group types, though these have a well-understood genetic basis and it is possible to look as well at genetic differences. Generally, among Europeans, we see that about 5% of individuals are type AB, 15% are type B, and 40% each are types A and O. There is a similar pattern among allelic frequencies. These numbers vary somewhat from place to place, as would be expected.
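The link between allelic and phenotypic frequencies can be made explicit. The following sketch assumes Hardy-Weinberg proportions and the standard ABO inheritance pattern (A and B codominant, O recessive), neither of which is spelled out at this point in the text. It converts the European allele percentages from Table 1 into expected blood group frequencies. The function name `abo_phenotypes` is illustrative.

```python
def abo_phenotypes(a, b, o):
    """Expected blood group frequencies from ABO allele frequencies.

    Assumes Hardy-Weinberg proportions; inputs may be given as
    percentages or proportions and are normalized to sum to 1.
    """
    total = a + b + o
    p, q, r = a / total, b / total, o / total
    return {
        "A": p * p + 2 * p * r,   # genotypes AA and AO
        "B": q * q + 2 * q * r,   # genotypes BB and BO
        "AB": 2 * p * q,          # genotype AB
        "O": r * r,               # genotype OO
    }

# Allele percentages for Europe as reported in Table 1.
europe = abo_phenotypes(27, 8, 65)
for group, freq in europe.items():
    print(f"type {group}: {freq:.3f}")
```

On these assumptions the European allele frequencies yield roughly 42% type A, 42% type O, 11% type B, and 4% type AB, which is broadly in line with the approximate phenotypic figures quoted above.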
The Parma Valley is located in northern Italy. Within the Parma Valley, there is considerable variation in patterns of habitation, from smaller, less mobile populations in the mountains to larger, more mobile populations on the plain, with an intermediate region that is hilly. In the higher regions, the villages are smaller (averaging around 300 inhabitants); downstream, the villages become larger as the topography levels out (averaging 600–800 inhabitants). The effective population sizes actually vary even more than this suggests, since mobility is greater in the downstream populations, which would increase the effective population size and therefore decrease the rate of divergence.
The area was settled very early, well before historical records, and at least until recently there had been no major immigrations for hundreds of years. This suggests that the population should be at or near an equilibrium, which would make issues easier to address: whatever natural factors affect population structure should be stable, and the influence of factors such as drift, selection, and migration should not be in flux. If settlement had been more recent, we would expect a gradient, but that seems not to be realistic. In larger populations, the effects of drift should be reduced since larger effective populations would mean more interbreeding; moreover, since migration between villages is easier in lower altitudes, that too should have a homogenizing influence. This is effectively an elegant natural experiment, displaying an ensemble of populations that presumably started from the same or similar blood group frequencies—whatever the frequencies were in the founder populations. We can think of the case as if each village were a random draw from an ancestral population, and the villages become an ensemble of such draws. No doubt this is not a wholly realistic picture. It is at least close enough that we should see statistically significant differences, and we do. If drift is more significant in smaller villages, then we should see more differentiation among them even if they were originally similar or drawn from a homogeneous population. Cavalli-Sforza and Moroni measured the frequencies of blood types within 74 villages and calculated the genetic variance among villages, within subgroups depending on elevation (see Table 2). As one would expect, the smaller and more isolated villages have diminished genetic variance within individual villages by comparison with the metapopulation. They tend to be more uniform with respect to blood type. Moreover, the variance among villages is greater in the mountains than in the plain. 
The smaller villages differ more among themselves than the larger villages do. Cavalli-Sforza, Moroni, and Zei report that the variation among villages was nearly nonexistent in the valley, which means that the variation between the averages of villages was not discernible against the background variation within villages (see Figure 1).
Region (Grouped by Altitude) | Allele A (%) | Allele B (%) | Allele O (%) |
---|---|---|---|
Plains and hills | 28.5 (0.7) | 7.5 (0.6) | 63.9 (1.3) |
Mountains | 30 (0.9) | 8.9 (0.1) | 61.1 (0.2) |
Note. The A, B, and O frequencies in the mountains are significantly different from both the European and Italian frequencies as exhibited in Table 1. (Parentheses indicate standard errors.) The data are drawn from Cavalli-Sforza (2000).
The population of the valley as a whole appears to be very nearly in Hardy-Weinberg equilibrium and nearly (though not quite) reflects the Italian averages for blood group types. That would suggest that natural selection is not a factor. We see blood groups, and allelic differences, distributed randomly. Nonetheless, the variation in blood group types among villages is higher in the mountains than in the hills and virtually disappears on the plain. That is, though blood group frequencies in the metapopulation are reasonably stable, differences among villages are discernible. This is precisely the pattern we expect with neutral variation and drift. In these cases, the effects of drift are masked by immigration and emigration among villages. The villages are not insulated genetically. There is migration, so even at the higher elevations we would expect moderation of the effects of drift. What we look for is the deviation from the overall frequencies within villages. What drift can explain is the tendency for villages higher up to be more differentiated than villages at lower elevations. Again, the rate at which this happens depends on the size of the village and the migration rates among villages. Given the overall distribution, we treat individual, isolated villages as if they were random samples drawn from the metapopulation.
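The expected balance between drift and migration can be sketched with a toy ensemble. The example below is not a reconstruction of the Parma Valley analysis. It assumes that each village exchanges a fixed fraction of its gene pool with a metapopulation held at a constant allele frequency, and the village sizes and migration rates are hypothetical values chosen only to contrast small, isolated villages with large, well-connected ones. The function name `village_frequencies` is illustrative.

```python
import random
from statistics import mean, pvariance

def village_frequencies(n_genes, m, p_meta=0.3, n_villages=100,
                        generations=60, seed=3):
    """Allele frequencies in an ensemble of villages after drift plus migration.

    n_genes: gene copies per village (hypothetical).
    m:       fraction of each village's gene pool replaced per generation
             by migrants carrying the metapopulation frequency (hypothetical).
    """
    rng = random.Random(seed)
    freqs = [p_meta] * n_villages
    for _ in range(generations):
        next_freqs = []
        for p in freqs:
            # Migration pulls the village toward the metapopulation...
            p_mixed = (1 - m) * p + m * p_meta
            # ...and sampling a finite next generation pushes it away again.
            count = sum(1 for _ in range(n_genes) if rng.random() < p_mixed)
            next_freqs.append(count / n_genes)
        freqs = next_freqs
    return freqs

mountain = village_frequencies(n_genes=100, m=0.02)  # small, isolated
plain = village_frequencies(n_genes=400, m=0.10)     # large, connected

print(f"mountain: mean {mean(mountain):.3f}, variance {pvariance(mountain):.4f}")
print(f"plain:    mean {mean(plain):.3f}, variance {pvariance(plain):.4f}")
```

In this sketch both groups of villages should keep an average frequency near that of the metapopulation, while the variance among the small, isolated villages should be markedly larger. It is a statement about the ensemble, and it says nothing about which allele any particular village ends up favoring.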
There is another explanation for the individual cases. Drift is the effect of chance on genetic, genotypic, or phenotypic frequencies (in this case, blood groups) as a consequence of small population size. When the effective population size is small, that will make mating among consanguineous individuals (e.g., cousins) more likely. In this case, we can look at the social structures within the villages, ultimately constructing pedigrees that will also explain why the population became fixed or skewed for a specific allele. Inbreeding tends to reduce genetic differences within villages, while migration tends to increase this variation. Inbreeding tends to increase genetic differences between villages, while migration tends to decrease these differences. Cavalli-Sforza and Moroni turned to the parish registers to gauge the frequencies of consanguineous marriages (see Figure 2). This was possible because the Catholic Church required petitions for marriages within families and clearly defined which sorts of relationships required a dispensation (specifically, cousins), as well as which were strictly forbidden (i.e., marriages between siblings or parents and their children). Dispensations were generally granted except in the forbidden cases, and the reasons offered were quite varied. The result was that parish records gave relatively complete information concerning relatedness. These records confirmed the relative isolation of smaller villages. With pedigrees in hand, we have another explanation of blood type frequencies in particular cases. This should be convergent with the direct assessment of blood group frequencies. There are various factors that influence the tendencies of people to marry relatives, including not only population size but also age and custom. With Walter Bodmer and Motoo Kimura, Cavalli-Sforza simulated the effects of these factors.
It turns out that the frequency of consanguineous marriages is roughly what would be expected as the result of chance (at least for other than the most immediate relatives). This allows for predictions of variation in blood groups across the entire population as a function of time. What they found is that, nearly enough, the values predicted from the population models matched the observed blood group variations in the villages. Chance alone was enough to achieve the observed result. We gain nothing in terms of the prediction of the overall pattern by appealing to the causes that doubtless determine the particular cases. We could, of course, explain each particular effect, but that is a different question.
The results confirmed the importance of drift as the factor responsible for blood group differences. Notice again that what is predicted, and what is observed, is the genetic variation among villages—an ensemble property. What Cavalli-Sforza and his collaborators found was levels of divergence consistent with drift, balanced by the effects of migration. They did not attempt to predict the specific genes (or blood types) that would be found in any one village, but were interested in the differences among villages as a function of population parameters. It is the pattern of fixation that is critical, as is the stability of the metapopulation. Given an initial degree of polymorphism in a population, we know the pattern to expect: the overall frequency of the genes should be in equilibrium, while subpopulations become increasingly divergent; moreover, the frequencies within the subpopulations for the alternative alleles should reflect the overall frequencies of the alleles. And this is what we observe.
4. Chance and Evolutionary Explanation
The appeal to drift in the Parma Valley case is an empirically vindicated use of chance in explaining the patterns of inheritance among blood groups. There might have been other explanations of the pattern, but appealing to those factors (such as age preference, local customs, or even Catholic restrictions on marriage) does not enhance our explanatory abilities. These are influences that no doubt would serve to explain the particular cases (why one couple marries and another does not, or even why some village has, say, an increased frequency of A alleles), but they do nothing to help us explain the pattern of variation among blood types, or why the variation increases as we move to higher elevations. The actual patterns can, of course, be aggregated to yield statistically observed patterns, but that is only a matter of description. This does not explain the patterns. To explain them we need to appeal to the balance between drift and the effects of migration. The key question is whether the factors explain the statistical behavior. This is different from describing the aggregate behavior. Explaining the statistical behavior involves explaining the pattern, whereas describing the aggregate behavior is accomplished merely by summing over the individual cases. When we focus on aggregate behavior, we simply average over the instances. When we focus on the statistical behavior, we focus on the expected behavior. This requires a level of abstraction not present even in focusing on aggregate behavior.
Discussions of the role of selection (fitness) and drift (chance) in the philosophical literature often suffer a disconnect from the issues that engage biologists. We are sometimes distracted by issues over the interpretation of probability, whether it is epistemic or objective. We are sometimes distracted by questions over propensities, and how to understand them. We are sometimes distracted by issues more metaphysical, such as whether the universe is deterministic at root, and how this might affect evolutionary theory. These are all distractions from the biological issues. Take this last issue, in particular. Here's a common thought: If chance plays a significant role in evolutionary theory, then there must be some fundamental indeterminism in the universe, and that would in turn require indeterminism at the most fundamental levels to affect evolutionary processes (cf. Brandon and Carson 1996; Glymour 2001; Graves, Horan, and Rosenberg 1999; Rosenberg 2001). Some who favor a role for chance suggest that indeterminacies could “bleed up,” an evolutionary reflection of Schrödinger's cat. Those who do not favor indeterminism in evolutionary theory suggest that the fundamental indeterminacies “average out.” I think that, in light of the actual role chance plays in evolutionary explanations, such as the one illustrated here, the disconnect of these issues with the biology is evident. The mistake is in framing the philosophical issue.