Altruistic behavior provides benefits to others while incurring a cost for the acting individual.1
The existence of such behavior has been a puzzle for scholars across the social and life sciences. To social scientists, choosing A over B is irrational if B provides higher individual utility, as non-altruism does. To life scientists, it is no less puzzling how altruism could have survived in the evolutionary process. Formal models of altruistic and selfish behavior, most notably the Prisoners' Dilemma game and public good games, have further increased interest in the paradox and led to an unprecedented surge of interdisciplinary research.2 Early classical models predicted that rational behavior would lead to mutual defection, over-exploitation of resources, and, ultimately, decline into a Hobbesian world in which the life of man is “solitary, poor, nasty, brutish, and short.”3
Hobbes 1947.
Hardin 1968.
Theoretical predictions of selfish behavior, however, are not supported by experimental research on public good games, which provides ambiguous results. In laboratory settings, subjects choose both cooperation and defection.5
Plott 1983; Isaac, Walker, and Thomas 1984; Orbell, Van de Kragt, and Dawes 1988; Ledyard 1995; Lubell and Scholz 2001.
Maynard Smith 1982; Fowler 2005; Boyd and Richerson 1992; cf. Axelrod 1997; Nowak and Sigmund 1998; Orbell et al. 2004.
Nowak 2006.
These interdisciplinary advances, however, have had a limited effect on actual policies for the management of common property resources. In order to avoid the tragedy of the commons, two standard policy prescriptions are adopted: privatization of the common-pool resources and centralized coercion by the government.11
Both policies presume that cooperation is impossible. Any possibility of decentralized self-enforcement of cooperation is quickly dismissed as not viable; individual costs of monitoring and enforcement can be very high while the benefits are divided among members of the commons. Policing, therefore, is also a public good, and the problem of collective action is merely shifted one step higher.12
Elster 1989.
This article introduces the broad political science audience to the interdisciplinary research on altruistic punishment, which should aid our discipline in normative and positive analysis of the collective action problems. This work integrates the study of decentralized self-enforcement of cooperative norms as developed in the social sciences with research on evolutionary adaptedness of altruistic punishment as developed in the life sciences. The theoretical discussion of altruistic punishment is based upon rational choice and evolutionary game theoretic models. The parameters of such models, their underlying assumptions, and the conditions necessary for the maintenance of cooperation in them are further discussed through the prism of field studies in political science and laboratory experiments in economics and evolutionary psychology. On the basis of such analysis, the paper concludes with a normative discussion and suggestions of practical policy implications. Methodologically, the paper makes an attempt to show certain advantages of academic collaboration between political scientists and scholars representing other disciplines, from economics and computer science to biology and evolutionary psychology.
Definition of Altruistic Punishment
Altruistic punishment (AP) is a strategy according to which a cooperative individual punishes defectors at a personal cost. Typically, but not necessarily, the cost of punishment is greater for the recipient (defector) than for the sender (cooperator who punishes). Under a common-property regime, altruistic punishers consume available resources at a level that maximizes social utility, and punish at a personal cost those individuals who overexploit common resources. Examples of such costly punishment include exiting mutually beneficial relationships, gossip, quarrel, ostracism, threats of violence, and actual use of force. Such acts of individual behavior can deter future defections, thus providing a public good of policing. The core of the problem, therefore, is the new dominant incentive to free-ride on the punishment and let others police defectors. Policing the police does not solve the problem, for the same reason: it merely shifts the issue one step further. Nevertheless, the manifest existence in the natural world of altruistic punishment (as an individual psychological response, a communal norm, and an element of institutional design) warrants examination of such behavior.
Close examination of altruistic punishment in theory and practice reveals that there are important asymmetries between classical cooperation and altruistic punishment. If individuals are willing to incur the cost of punishment, we observe the evolution of altruistic behavior in a more robust manner than in the traditional non-punishing game.13
Hence, the first question that we have to address is whether, in fact, individuals are willing to punish defectors, even when such behavior is seemingly irrational in the classic economic sense.
Empirical Evidence
According to recent experimental evidence, the punishment of defectors is common in laboratory settings.14
Fehr and Gachter 2002.
Orbell, Van de Kragt, and Dawes 1988; Yamagishi 1986; Ostrom, Gardner, and Walker 1994; Price, Cosmides, and Tooby 2002; Fehr and Gachter 2002.
Fehr and Fischbacher 2004.
Field studies support this experimental research. Various forms of costly self-enforcement of cooperative behavior appear to be a widespread custom in communities around the world. From fisheries to irrigation systems to grazing lands to forests and wildlife, decentralized punishment of free-riding and overexploitation is a regular institutional arrangement devised to discourage opportunistic behavior.17
On fisheries see Acheson 1975; Dyer and McGoodwin 1994; Crean and Symes 1996; Leal 1996; Berkes et al. 2001. On irrigation systems see Tang 1992; Ostrom and Gardner 1993; Mabry 1996. On grazing lands see Netting 1981; Ellickson 1991; Anderson 1995. On forests and wildlife see Bromley 1992; Kibreab 2002. See also Ostrom 1990 and Ostrom, Gardner, and Walker 1994 for a comprehensive overview of self-governing commons.
A possible explanation of altruistic punishment is that emotions such as anger are responsible for the seemingly irrational and thus unsustainable behavior. Such an explanation, however, begs a number of questions: If we do have angry emotional responses to defection, where do they come from? If altruistic punishment decreases individual utility, how could such behavioral propensities have possibly evolved? What is so special about altruistic punishment that makes it a widespread communal custom as reflected in the field evidence, and also a strong individual propensity as reflected in the experimental research?
To answer these questions we examine altruistic punishment through three different prisms: rational choice, evolutionary adaptation, and normative evaluation. In brief, altruistic punishment is economically rational if pre-commitment is possible.18
Cf. chain store paradox as in Kreps and Wilson 1982; according to the chain store paradox a monopoly may commit itself to punishment of all entrants into the market in order to build reputation and deter future invasions.
Altruistic Punishment and Tit-For-Tat
Altruistic punishment may also be driven by normative considerations. Altruistic punishment as a mechanism that leads to cooperation must be normatively better than tit-for-tat (TFT), a retaliatory mechanism that entails punishment by defection. Axelrod, Keohane, and other well-known political scientists describe TFT as the main reciprocity mechanism promoting cooperation.19
This has been a standard solution to the tragedy of the commons in political science, which is surprising given the importance of normative issues for political scientists and the major deficiency of TFT as discussed below.
Although tit-for-tat can be an effective strategy leading to cooperation in theory, it is often inappropriate and ineffective in the real world.20
Axelrod and Hamilton 1981.
Terry L. Anderson, personal communication 2003.
Turnbull 1961.
Theoretical treatments of altruistic punishment and closely related behaviors are numerous. In political science and economics, such examples include the theory of strong reciprocity, models of quasi-voluntary compliance, the Norms game, punishment by exit, and some of the literature on sanctions.25
On the theory of strong reciprocity see Sethi and Somanathan 1996; Gintis 2000. On models of quasi-voluntary compliance see Levi 1988. On the Norms game see Axelrod 1986. On punishment by exit see Vanberg and Congleton 1992. For the literature on sanctions see Romer 1984, Shavell 1987, and Nossal 1989.
On the notion of moralistic aggression see Trivers 1971. On punishment in animal societies or negative reciprocity see Clutton-Brock and Parker 1995. On mutual policing see Frank 1995. On repression of selfishness in the context of group selection see Sober and Wilson 1998.
Altruistic punishment as an integrating explanation of the puzzle of human cooperation

The table by no means represents a complete picture, but the selection should give the reader some idea about the scope of interdisciplinary interest in the question. Although altruistic punishment in its various forms receives much attention across disciplines, scholars within each field are typically unaware of advances in other fields and disciplines. The gap is most striking between the studies of altruistic punishment in the social and life sciences: despite addressing exactly the same problem, albeit in different domains, they rarely cite each other's work (interdisciplinary works, not surprisingly, are an exception).
Classical Prisoners' Dilemma Game and Altruistic Punishment
To illustrate the logic of altruistic punishment we turn to a simple game theoretic model. Economic rationality underlying altruistic punishment can be explained through the classic Nash equilibrium solution, while the long-term properties of altruistic punishment are better examined by means of replicator dynamics and the analysis of evolutionary stability of strategies. The purpose of this model (or any model, in fact) is to simplify reality in order to sharpen our intuition and capture non-trivial aspects of the problem. How can altruistic punishment be a solution to the tragedy of the commons? The theoretical output (prediction) that we obtain is further juxtaposed with empirical reality. And, finally, normative inferences follow the discussion of empirical implications of the theoretical model.
Costly self-enforcement is not possible in the well-known Prisoners' Dilemma (PD) game, which has been a standard model predicting the tragic outcome to commons situations. In the two-player version of the game, each player (prisoner) has two available strategies: cooperate (maintain silence) and defect (confess). Confessing to the authorities is a strictly dominant strategy, making mutual defection the only equilibrium. Although the model captures the crux of the problem of collective action, it appears that certain fundamental aspects of real world dilemmas are missing—as suggested here, punishment of defectors is common when it comes to public good games. Even the prisoners in the story are likely to be aware of potential retributions typical in the criminal world for not keeping silent. A simple extension of the classical PD adds another—cooperative—equilibrium to the game (see figure 1).

Prisoners' Dilemma with altruistic punishment
Note: T > R > P > S, 2R > T + S (standard Prisoners' Dilemma assumptions), and X > 0, k > 0, where: T is the temptation payoff to free-rider, R is the reward for mutual cooperation, P is the punishment for mutual defection, S is the payoff of an exploited cooperator, X is the cost of punishment for the altruistic punisher, and kX is the cost of punishment for the defector. Typically, but not necessarily, k > 1. Altruistic punishment is a Nash equilibrium for R > T − kX.
An altruistic punisher is a cooperator who sacrifices some of his utility in order to decrease the utility of a defector. In this game, mutual defection remains a Nash equilibrium: no player has an incentive to deviate from his strategy unilaterally. However, altruistic punishment is also a Nash equilibrium if the cost of punishment for the defector is greater than the extra benefit from free-riding. The mutual defection equilibrium corresponds to the tragedy of open access resources such as the open ocean fisheries.27
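The equilibrium claim can be checked by brute force. The sketch below is illustrative only: the three-strategy payoff function is reconstructed from the verbal definitions in the figure note (a punisher pays X whenever it meets a defector, and the defector then loses kX), and the numerical values are assumptions chosen to satisfy T > R > P > S and 2R > T + S. They are not taken from the article.

```python
from itertools import product

# Illustrative payoffs (assumed): T > R > P > S, 2R > T + S, X > 0, k > 0.
T, R, P, S = 5.0, 3.0, 1.0, 0.0
X, K = 1.0, 3.0

STRATEGIES = ("C", "D", "AP")


def payoff(own, other, k=K):
    """Payoff to `own` against `other`; reconstructed from the figure note."""
    if own == "D":
        if other == "D":
            return P
        # Defecting against a punisher costs kX; against a cooperator it is free.
        return T - k * X if other == "AP" else T
    # `own` is C or AP: both cooperate, but only AP pays X to punish a defector.
    if other == "D":
        return S - X if own == "AP" else S
    return R


def pure_nash_equilibria(k=K):
    """Return all pure-strategy profiles from which neither player gains by deviating."""
    equilibria = []
    for a, b in product(STRATEGIES, repeat=2):
        a_best = all(payoff(a, b, k) >= payoff(alt, b, k) for alt in STRATEGIES)
        b_best = all(payoff(b, a, k) >= payoff(alt, a, k) for alt in STRATEGIES)
        if a_best and b_best:
            equilibria.append((a, b))
    return equilibria


print(pure_nash_equilibria())       # R > T - kX holds (3 > 2)
print(pure_nash_equilibria(k=1.0))  # R > T - kX fails (3 < 4)
```

Under these assumed values the first call finds mutual defection and mutual altruistic punishment, while lowering k so that R < T − kX leaves mutual defection as the only pure equilibrium.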
Meltzer 1994.
Empirical examples of successful long-term management of the commons imply that some communities manage to ensure that mutual cooperation provides greater utility than “punished free-riding”, that is to say, that following the rules of the community is more beneficial than free-riding and then being punished. Yet this may change. Field evidence also demonstrates that the cooperative equilibrium can be disrupted as a result of external factors such as natural cataclysms, infectious diseases, inter-group warfare, refugees, and state intervention. For example, droughts usually increase an individual incentive to free-ride and consume more water than others. The value of temptation is increased, and if stricter rules are not implemented, the cooperative equilibrium is in danger.28
It should be noted, however, that the mere presence of punishment options can have a profound positive effect on the level of human cooperation. Lubell and Scholz show experimentally that in a non-reciprocal environment even small penalties for defection can lead to higher levels of cooperation.29
Lubell and Scholz 2001.
Ellickson 1991.
The cooperative equilibrium—existing due to altruistic punishment—gives both players higher utility than the mutual defection equilibrium. At the same time, altruistic punishment is weakly dominated by cooperation. In terms of individual utility, cooperation is at least as good as altruistic punishment, and sometimes (when defectors are present) even better. Ironically, cooperators in the model are, in fact, the “second-order defectors” since they do not punish defectors, that is, free-ride on the policing effort of altruistic punishers. While cooperation weakly dominates altruistic punishment, it itself is strictly dominated by defection in the absence of altruistic punishment. These dynamics raise a question whether altruistic punishment can be sustained as a social norm in the long run. To examine long-term properties of altruistic punishment we turn to evolutionary game theory.
Evolutionary Stability of Altruistic Punishment
The fact that altruistic punishment is a Nash equilibrium strategy does not imply that it is also an evolutionarily stable strategy (ESS). A strategy, or a type, is evolutionarily stable when all members of the population adopt it and no single individual has an incentive to adopt another strategy.31
Maynard Smith 1982.
Samuelson 1997.
Defection is the only evolutionarily stable strategy in the modified PD game. Nevertheless, there is still a possibility for an extended but temporary coexistence of cooperators and altruistic punishers. Intuition rightfully suggests that such a heterogeneous population cannot exist forever. In the presence of defectors, cooperation provides greater individual fitness since cooperators do not incur the cost of punishment. As a result, the proportion of altruistic punishers relative to the proportion of cooperators will be decreasing over time, even if defection is deterred. Eventually, there will be so few altruistic punishers and so many cooperators that a mutant defector will take over the population. In fact, altruistic punishment is unstable even in the absence of defectors since it is only neutrally stable against mutant cooperators. As a result, the population can randomly drift away from the social norm of altruistic punishment.
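The stability claims above can be verified mechanically with the standard two-condition ESS test: a resident strategy s resists a mutant m if E(s,s) > E(m,s), or if E(s,s) = E(m,s) and E(s,m) > E(m,m). Replacing the last strict inequality with a weak one gives neutral stability. The sketch below is a hedged illustration: the payoff table is reconstructed from the definitions of T, R, P, S, X, and k, and the numbers are assumed for the example.

```python
# Illustrative payoffs (assumed): T > R > P > S, 2R > T + S, R > T - kX.
T, R, P, S = 5.0, 3.0, 1.0, 0.0
X, K = 1.0, 3.0

STRATEGIES = ("C", "D", "AP")

# PAYOFF[(own, other)] is the payoff to `own`; reconstructed, not from the article.
PAYOFF = {
    ("C", "C"): R, ("C", "D"): S, ("C", "AP"): R,
    ("D", "C"): T, ("D", "D"): P, ("D", "AP"): T - K * X,
    ("AP", "C"): R, ("AP", "D"): S - X, ("AP", "AP"): R,
}


def resists(s, m, strict=True):
    """True if resident s resists mutant m (strict: ESS test, else neutral stability)."""
    if PAYOFF[(s, s)] > PAYOFF[(m, s)]:
        return True
    if PAYOFF[(s, s)] == PAYOFF[(m, s)]:
        if strict:
            return PAYOFF[(s, m)] > PAYOFF[(m, m)]
        return PAYOFF[(s, m)] >= PAYOFF[(m, m)]
    return False


def is_ess(s):
    return all(resists(s, m, strict=True) for m in STRATEGIES if m != s)


def is_neutrally_stable(s):
    return all(resists(s, m, strict=False) for m in STRATEGIES if m != s)


for s in STRATEGIES:
    print(s, "ESS:", is_ess(s), "neutrally stable:", is_neutrally_stable(s))
```

With these assumed payoffs only defection passes the strict test. Altruistic punishment passes the weak test because a mutant cooperator earns exactly as much as a punisher when no defectors are present, which is the source of the drift described in the text.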
The analysis of replicator dynamics confirms our intuition (see figure 2). Formally, replicator dynamics are a system of equations representing the growth of types (strategies) in the population given their relative fitness.33

Replicator dynamics of the Prisoners' Dilemma game with altruistic punishment
Note: The corners of the simplex represent homogeneous populations of cooperators (C), defectors (D), and altruistic punishers (AP). In the AP-T region, the population is heterogeneous and consists only of cooperators and altruistic punishers. In this region, mutant defectors die out, yet cooperators have greater fitness than altruistic punishers. Below the threshold T, the proportion of cooperators becomes so large relative to the proportion of altruistic punishers that a mutant defector will have greater fitness than both cooperators and altruistic punishers and, therefore, the population will eventually converge to mutual defection (D).
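A small numerical experiment can reproduce the qualitative picture described in the note. The sketch below integrates the continuous-time replicator equation dx_i/dt = x_i (f_i − f̄) with a simple Euler step. The payoff matrix, parameter values, step size, and starting points are all assumptions made for illustration. The quantity `AP_THRESHOLD` is derived from this assumed matrix (a rare defector out-earns cooperators when the punisher share falls below (T − R)/(kX)) and is only an analogue of the threshold in the figure.

```python
# Illustrative payoffs (assumed): T > R > P > S, 2R > T + S, R > T - kX.
T, R, P, S = 5.0, 3.0, 1.0, 0.0
X, K = 1.0, 3.0

# Rows: own strategy; columns: opponent. Order: C, D, AP.
A = [
    [R, S, R],
    [T, P, T - K * X],
    [R, S - X, R],
]

# Punisher share below which a rare defector out-earns cooperators (derived here).
AP_THRESHOLD = (T - R) / (K * X)


def step(x, dt=0.01):
    """One Euler step of the replicator equation dx_i/dt = x_i * (f_i - mean fitness)."""
    fitness = [sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]
    mean = sum(x[i] * fitness[i] for i in range(3))
    new = [x[i] + dt * x[i] * (fitness[i] - mean) for i in range(3)]
    total = sum(new)  # renormalise to guard against floating-point drift
    return [v / total for v in new]


def simulate(x0, steps=10_000, dt=0.01):
    """Iterate the dynamics from shares x0 = [C, D, AP]; returns the final shares."""
    x = list(x0)
    for _ in range(steps):
        x = step(x, dt)
    return x


many_punishers = simulate([0.05, 0.05, 0.90])
few_punishers = simulate([0.70, 0.10, 0.20])
print("start with 90% AP ->", [round(v, 4) for v in many_punishers])
print("start with 20% AP ->", [round(v, 4) for v in few_punishers])
```

This sketch has no mutation, so once defectors vanish the mix of cooperators and punishers stays fixed. The slow erosion of punishers discussed in the text requires defectors to keep reappearing.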
In the short run, defection can be contained and the two surviving traits will be cooperation and altruistic punishment. Unfortunately for altruistic punishers and cooperators, the region AP-T is not stable since occasional defections will increase the proportion of cooperators at the expense of altruistic punishers. Eventually altruistic punishment will cease to exist as a social norm. Interestingly, the actual fitness difference between cooperators and altruistic punishers is very small. As a result, the speed with which the proportion of cooperators increases relative to the proportion of altruistic punishers is also very low (a tiny fraction of a percentage per generation). At the same time, a number of factors may easily reverse the dynamics. Examples of such factors include conformist pressure to punish defectors, multi-level selection, a “No Play” option, and a spatial structure.34
On conformist pressure to punish defectors see Henrich and Boyd 2001 and Axelrod 1986. On multi-level selection see Sober and Wilson 1998. On “No Play” option see Fowler 2005. On spatial structure see Brandt et al. 2003.
In the group selection model the key difference between cooperation and altruistic punishment becomes apparent, rendering AP an evolutionarily robust social norm. The crux of the group selection argument is that a group of cooperators will be more successful than a group of defectors in direct or indirect competition for resources. In this respect, each individual has an incentive to cooperate to make his group more successful than other groups. On the other hand, each individual also has an incentive to defect to increase his payoff within a group. It is possible to show that in the model of group selection, cooperation can survive if several very strict conditions are met, such as a significant fitness differential between the groups, limited migration and genetic drift, and extinction and formation of new groups. The evolution of cooperation under group selection is possible but not very probable since the specific conditions above must be met simultaneously. The main difficulty is that within each group cooperation is always strictly dominated by defection for all possible parameters of the model. And this is where altruistic punishment is different: altruistic punishment is frequency dependent. The greater the proportion of punishers, the stronger the group is in the intergroup competition and the less beneficial defection is within the group.
Altruistic punishment retains a between-group advantage of cooperation but manages to avoid its within-group disadvantage. A homogeneous group of altruistic punishers is as successful as a homogeneous group of cooperators in the inter-group competition. At the same time, the cooperative group is much more vulnerable to within-group defection. The robustness of altruistic punishment is directly proportional to its frequency. An increase in the proportion of altruistic punishers increases the fitness of such individuals and decreases the fitness of defectors. This is different from cooperation which is frequency independent within each group where it is always dominated by defection. Thus, the groups with the higher proportions of altruistic punishers have an advantage in the between-group competition for resources and are more capable of suppressing within-group defection.
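The within-group asymmetry described above can be illustrated with a short calculation. The sketch compares the fitness of a resident type (cooperators or altruistic punishers) with the fitness of defectors in a group that contains only those two types. The payoffs follow the definitions of T, R, P, S, X, and k, but the numerical values and the function name `advantage_over_defectors` are assumptions introduced for this example.

```python
# Illustrative payoffs (assumed): T > R > P > S, 2R > T + S, R > T - kX.
T, R, P, S = 5.0, 3.0, 1.0, 0.0
X, K = 1.0, 3.0


def advantage_over_defectors(resident, share):
    """Resident fitness minus defector fitness in a group made of `share`
    residents ("C" or "AP") and 1 - share defectors."""
    d = 1.0 - share
    if resident == "C":
        f_res = share * R + d * S
        f_def = share * T + d * P
    elif resident == "AP":
        f_res = share * R + d * (S - X)       # punishers pay X per defector met
        f_def = share * (T - K * X) + d * P   # defectors lose kX per punisher met
    else:
        raise ValueError("resident must be 'C' or 'AP'")
    return f_res - f_def


for share in (0.25, 0.5, 0.75, 1.0):
    print(
        f"share={share:.2f}",
        f"C vs D: {advantage_over_defectors('C', share):+.2f}",
        f"AP vs D: {advantage_over_defectors('AP', share):+.2f}",
    )
```

Under these assumed values cooperators earn less than defectors at every share, whereas the punishers' disadvantage shrinks as their share grows and turns into an advantage once they are sufficiently common.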
Empirical Robustness of Altruistic Punishment
The field evidence shows that altruistic punishment often underlies a solution to the problem of overexploitation of common-property resources, especially in cases when privatization and governmental control are problematic. One form of altruistic punishment is a widespread custom of “self-help”, or self-enforcement of community rules by local means.36
Ellickson 1991.
Ellickson 1991, 57.
Anderson 1995.
Tang 1992.
Norman and Trachtman 2005, 545.
Four conditions appear to be necessary (but not sufficient) for altruistic punishment to be a communal norm: (1) pre-commitment to punish free-riders, (2) successful self-monitoring, (3) a common-property regime as opposed to open access, and (4) favorable legal constraints on self-enforcement.
Pre-commitment to punish defectors
Pre-commitment to punish defectors makes altruistic punishment a deterrence mechanism, not a cycle of violence. The modified Prisoners' Dilemma game that we discuss here assumes that altruistic punishment is a type along with cooperation and defection. In terms of evolutionary psychology this is a disposition triggered by appropriate environmental stimuli, notably, existence of defection. By construction, if a player's type is AP, then this player is pre-committed to punish defectors. An alternative model is a two-stage game in which players first choose whether to cooperate or defect and then choose whether to punish or not.42
In this case, the game is similar to the well-known chain-store paradox, where pre-commitment is also important and economically rational if the game is repeated.43
Kreps and Wilson 1982.
One class of such mechanisms is psychological adaptations, such as emotions.44
Tooby and Cosmides 1992.
Dawes et al. 2007.
Fowler, Johnson, and Smirnov 2005.
Cultural adaptations are another class of effective mechanisms allowing pre-commitment. Communities around the world have moral codes, norms, and customs that encourage not only cooperation but also punishment of unfair behavior. In fact, such practice may take an extreme form: both cooperator and defector become the object of social ostracism if the cooperator fails to punish his offender. In terms of the model, it means that cooperators who face defectors are also punished for not being altruistic punishers.48
Cf. Boyd and Richerson 1992.
Effective self-monitoring
Effective self-monitoring of the commons is also necessary since undetected cheating makes altruistic punishment irrelevant. In terms of the model, the probability of getting away with defection is functionally the same as adding a weight—less than one—to the cost of punishment. In the environment where defection is completely unobservable, altruistic punishment becomes identical to cooperation, and we return to the classical version of the Prisoners' Dilemma game. Trivers, for example, argued that defectors could benefit from “subtle cheating” helping them avoid repercussions for their behavior; in response, however, humans have evolved cheater-detection cognitive adaptations.49
Trivers 1971.
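The weighting argument in the paragraph above can be made concrete. If q denotes the probability that a defection is detected (an assumed symbol, not used in the article), the expected cost of punishment to the defector becomes qkX, and the cooperative equilibrium condition R > T − kX becomes R > T − qkX. The following sketch, with illustrative parameter values, computes the detection probability above which punishment still deters defection.

```python
# Illustrative values (assumed): T > R, X > 0, k > 0.
T, R = 5.0, 3.0
X, K = 1.0, 3.0


def expected_defector_payoff(q):
    """Payoff from defecting against a punisher when detection probability is q."""
    return T - q * K * X


def ap_deters_defection(q):
    """True if cooperating with a punisher beats defecting, given detection probability q."""
    return R > expected_defector_payoff(q)


def min_detection_probability():
    """Detection probability above which defection stops paying: (T - R) / (kX)."""
    return (T - R) / (K * X)


for q in (0.0, 0.5, 1.0):
    print(f"q={q:.1f}", "deterred" if ap_deters_defection(q) else "not deterred")
print("threshold:", min_detection_probability())
```

At q = 0 the punishment term vanishes and the defector's payoff is T, as in the classical Prisoners' Dilemma. Deterrence is possible at all only if kX exceeds T − R, so that the threshold is below one.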
This argument is supported by research in evolutionary psychology.50
Cheater-detection cognitive adaptations indicate that humans are good at spotting free-riding. In particular, experiments on the Wason selection task demonstrate that humans are capable of making complex logical inferences in a social environment where cheating is possible. Such cognitive adaptations, however, are only effective for small groups of individuals (fewer than 100) and in the absence of any sophisticated tools that can be used for cheating. These psychological mechanisms are believed to have evolved during the environment of evolutionary adaptedness (Pleistocene), or 99.9% of the human evolutionary development. In the contemporary environment, different from the Pleistocene and often characterized by big group sizes and new technologies, cheater-detection adaptations may be ineffective. Examination of the collective action problem through the prism of group size is similar to Mancur Olson's classic analysis, but the underlying economic and evolutionary arguments are different.51
Olson 1965.
In response to the natural limits of cognitive cheater-detection, communities around the world have evolved sophisticated monitoring mechanisms designed to prevent cheating.52
Although new technologies make it easier to cheat, they also make it easier to monitor and detect the violation of rules.
Common-property regime vs. open access
Existence of such communal rules is possible and appropriate for common-property regimes as opposed to open access.53
Ciriacy-Wantrup and Bishop 1975.
Stevenson 1991.
Legal constraints
Legal constraints typically promote cooperation through state-based law enforcement. Under many circumstances state intervention proves to be necessary and beneficial. Interestingly, it may also have a surprisingly negative effect on commons management, especially in less-developed countries.56
Max Weber argued that the modern state seeks “to monopolize the legitimate use of force.”57
Weber 1958.
Frohlich and Oppenheimer 1995. I am grateful to John Orbell for bringing this to my attention.
If the state alternatives to self-enforcement are not effective, the cooperative equilibrium is in danger. Field studies document how centralized intervention can fail to preserve local customs and lead to the tragedy of the commons.59
In the Sudan as well as some other developing countries, the state rather than communal ownership is seen as the “major cause of inappropriate land use practices and consequently depletion of [common-pool resources].”60
Kibreab 2002, 403.
voluntary private associations for sharing the cost of a common good—policing—were subsequently undermined by statehood, and the publicly financed local sheriff as the recognized monopoly law enforcement officer. This observation contradicts the myth that a central function of government is to “solve” the free-rider problem in the private provision of public goods.61
Smith 2003, n. 52.
Breaking the communal self-enforcement rules and norms could entail risk whereas re-creation of self-governing commons is a difficult task.62
In short, available empirical evidence suggests that pre-commitment, successful monitoring, common-property regime, and a certain degree of independence are necessary attributes of the successful self-enforcement of cooperation in the commons. An interesting link between the two factors was suggested by Elinor Ostrom, who reports a large number of empirical cases showing that “overexploitation of common-pool resources occurred when open access prevailed either because no set of individuals had property rights or because state property was treated as open-access property.”63
Ostrom 1992, 312.
Conclusion
It has been a folk theorem that punishment can sustain cooperation. In the traditional rational choice models of iterated play, punishment means retaliatory defection. A possibility of continuing cooperation and punishment by other means has been largely ignored by the general political science audience, and by policy makers, because costly punishment can be seen as a public good itself and thus subject to the same problem of individual free-riding. Nevertheless, experimental evidence and field studies suggest that altruistic punishment is common. Individuals are willing to incur a cost in order to punish defectors. Theoretical examinations of the phenomenon can be found in various disciplines: political science, economics, biology, computer science, psychology, and others. The overlap between the study of public goods in the social sciences and altruistic behavior in the life sciences suggests altruistic punishment as a possible solution to the tragedy of the commons. Costly self-enforcement of cooperation appears to be a part of human psychological apparatus as well as communal rules and norms. AP is rational in economic terms if pre-commitment is possible. It is also robust in evolutionary terms, especially in comparison with non-punishing cooperation. Unlike cooperation, altruistic punishment is frequency dependent: the higher the proportion of altruistic punishers in the population, the greater their fitness. Although altruistic punishment is not asymptotically stable within a single group, the problem can be offset by within-group conformist pressure or between-group competition. In the context of group selection, altruistic punishment retains the strength of cooperation in between-group competition and, at the same time, prevents defection from taking over within the group. In normative terms, altruistic punishment is a more appealing norm than punishment by defection.
In the commons, Tit-for-Tat and other trigger strategies not only punish defectors but also harm individuals who cooperate, which may lead to a new wave of defections. Altruistic punishers always cooperate and punish only those who deserve it. Interestingly, a by-product of altruistic punishment is improved group equality since defectors also happen to be the highest earners in public good games.
A possibility of altruistic punishment as a solution to the tragedy of the commons has immediate policy implications. If cheating can be prevented, if there is no emergency, and if the commons are characterized by the common-property regime, the community may be successful without external assistance such as state intervention. On the other hand, if monitoring is costly or impossible, if the commons are characterized by the open access regime, and if external factors such as natural cataclysms are present, external assistance may be necessary. Failure to differentiate between the two cases as well as failure to recognize the importance of self-enforcement of cooperation as a community norm will only accelerate the opportunistic behavior and overexploitation of available resources. The tragedy of the commons often happens when individuals start treating their common-property resources as open access property. Recognizing altruistic punishment as a vital attribute of the common-property management may explain some of the counter-productive policies. When the state monopolizes the use of force it automatically undermines the mechanisms of internal enforcement of cooperation. External enforcement may or may not solve the free-rider problem, but it is certain to bring the commons one step closer to being an open access regime by taking away the social norm of altruistic punishment.
This argument implies that to solve the tragedy of the commons, policy-makers should not necessarily get rid of the commons by means of privatization or centralized coercion. To the contrary, to prevent the tragedy, policy-makers may want to re-create and reinforce the commons as a common-property regime with a certain degree of sovereignty, characterized by its customs and norms. This conclusion may be potentially relevant not only to the governance of common-property regimes, but more broadly to the general issue of evolved local institutions and government control.