
In the lab and the field: Punishment is rare in equilibrium

Published online by Cambridge University Press: 31 January 2012

Simon Gächter
Affiliation:
School of Economics, University of Nottingham, University Park, Nottingham NG7 2RD, United Kingdom. simon.gaechter@nottingham.ac.uk http://www.nottingham.ac.uk/Economics/people/simon.gaechter

Abstract

I argue that field (experimental) studies on (costly) peer punishment in social dilemmas face the problem that in equilibrium punishment will be rare and therefore may be hard to observe in the field. I also argue that the behavioral logic uncovered by lab experiments is not fundamentally different from the behavioral logic of cooperation in the field.

Type
Open Peer Commentary
Copyright
Copyright © Cambridge University Press 2012

Francesco Guala's target article is a valuable contribution to the discussion about the empirical importance of weak and strong reciprocity. In this commentary I focus on one aspect of his call for more field (experimental) evidence on strong negative reciprocity in social dilemmas. I argue that collecting such evidence is welcome but faces the difficulty that theory predicts, and experiments confirm, that punishment will be rare in equilibrium. I also argue that from the equilibrium perspective the behavioral logic uncovered by lab experiments is not fundamentally different from the behavioral logic of cooperation in field situations. Therefore, the distinction between “narrow” and “wide” readings of strong reciprocity and the preoccupation with external validity concerns is somewhat artificial.

I agree with Guala that experiments are a good tool to measure motivations. Previous experimental evidence (surveyed in, e.g., Chaudhuri 2011; Gächter & Herrmann 2009) shows that many people are willing to incur costs to punish freeloaders. People punish in finitely repeated games played by the same set of people (e.g., Fehr & Gächter 2000a), in repeated one-shot experiments where people interact with new group members each time (e.g., Fehr & Gächter 2002), and even in single-shot experiments where group members interact exactly once (e.g., Cubitt et al. 2011). The main purpose of these experiments was to probe whether and under what conditions people are willing to incur costs to punish and whether this influences cooperation levels. For this purpose and for various historical (and logistical) reasons, almost all experiments implemented at most ten rounds of interaction. Although short experiments can show that punishment exists and can have powerful behavioral consequences, these kinds of experiments might not be long enough to allow equilibration to be observed. In equilibrium people will have shared expectations about how others will behave and will adapt their behavior accordingly; punishment should be rare. Short experiments make it difficult to take an equilibrium perspective and may therefore “overstate” the frequency of punishment.

An equilibrium perspective is important, however, if one is interested in field (experimental) evidence on punishment. In many field settings, more or less stable groups of people will interact, and/or people will have more or less settled expectations (through own observation and experience, as well as through social learning) about how others will behave even in one-shot settings. In terms of observing punishment for freeloading in social dilemmas, theory predicts that in equilibrium punishment will be rare, even if some people are prepared to punish freeloaders. In the presence of punishers, freeloaders have an incentive to contribute to the public good to avoid punishment. Thus, if punishment is behaviorally effective and there is no antisocial punishment of cooperators (Herrmann et al. 2008), punishment will rarely be used because there will be only a few selfish transgressions (see also Boyd et al. 2003). As a consequence, the costs of punishment can be low.

The experimental evidence reported in Gächter et al. (2008) supports this reasoning. There are two conditions in their experiment: one without punishment, and one with punishment. In both conditions, groups of three people each receive an endowment of 20 tokens which they can contribute to a public good or keep for themselves. Payoffs are such that people have an incentive to contribute nothing to the public good, although full contribution is the socially optimal decision. In the experiment without punishment, a round ends after everyone has made his or her contribution decision. In the experiment with punishment, a second stage is added where group members are informed of each other's contributions and can then decide to incur their own costs to reduce each other's earnings from the first stage by three money units. Punished group members are only informed of the sum of received punishment, and not about who punished them (in this sense punishment is “coordinated”). Group members interact for 50 periods, which should give plenty of time to allow for equilibration. Seventeen groups each participated in the two conditions (between subjects). Figure 1 reports the most important result for the purpose of this comment.
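The incentive structure described above can be made concrete with a small payoff sketch. It is only an illustration under stated assumptions: the group size (3), the endowment (20 tokens), and the three-unit earnings reduction come from the text, whereas the marginal per capita return `MPCR = 0.5` and the one-token cost per punishment point (`PUNISH_COST`) are assumed values not given here, and the function names are hypothetical.

```python
# Illustrative one-round payoffs in a public goods game with punishment.
# From the text: 3 group members, endowment of 20 tokens, punishment
# reduces the target's earnings by 3 units.
# ASSUMED (not stated in the text): MPCR of 0.5 and a cost of 1 token
# to the punisher per punishment point.

ENDOWMENT = 20
MPCR = 0.5           # assumed marginal per capita return
PUNISH_COST = 1      # assumed cost to the punisher per point
PUNISH_IMPACT = 3    # earnings reduction per point received (from the text)


def stage_one_payoffs(contributions):
    """Earnings after the contribution stage, before any punishment."""
    total = sum(contributions)
    return [ENDOWMENT - c + MPCR * total for c in contributions]


def final_payoffs(contributions, points):
    """Earnings after punishment; points[i][j] is what member i assigns to j."""
    base = stage_one_payoffs(contributions)
    n = len(contributions)
    result = []
    for i in range(n):
        spent = sum(points[i][j] for j in range(n) if j != i)
        received = sum(points[j][i] for j in range(n) if j != i)
        result.append(base[i] - PUNISH_COST * spent - PUNISH_IMPACT * received)
    return result


# Everyone contributes fully: each member earns 30.
print(stage_one_payoffs([20, 20, 20]))

# Member 0 free rides: 40 for the free rider, 20 for each contributor.
print(stage_one_payoffs([0, 20, 20]))

# Both contributors assign 2 points to the free rider.
print(final_payoffs([0, 20, 20], [[0, 0, 0], [2, 0, 0], [2, 0, 0]]))
```

Under these assumed parameters, free riding pays without punishment (40 versus 30 tokens), but four punishment points bring the free rider down to 28 tokens, below the payoff from full cooperation. A freeloader who anticipates this has reason to contribute, so the punishment itself need rarely be carried out.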

Figure 1. Average contributions to the public good of 17 three-person groups each across 50 rounds in the punishment condition (P) and the no punishment condition (N). Both are measured on the left axis. The dashed line depicts the frequency of punishment acts (measured on the right axis). The inset figure illustrates contributions and punishment frequency of the median group with regard to cooperation level. Data are taken from Gächter et al. (2008); analysis and illustration are my own.

In the absence of punishment (labeled “Contributions N”), contributions start at 9.5 tokens on average and decline to 3.7 tokens by period 50. The average contribution in the second half of the experiment is 6.4 tokens. Adding the punishment opportunity has huge consequences for average contributions (labeled “Contributions P”). Contributions increase rapidly to 17.6 tokens on average in the second half and are significantly higher than in the first half (n=17, z=3.39, p=0.0007; Wilcoxon signed ranks test with group average contributions as independent observations).

Most importantly for my present purposes, the dashed line illustrates the average frequency of punishment acts across the 50 periods (measured on the right-hand axis). Because each group consists of three members, each subject in each period has two opportunities to punish other group members. Thus, in each period each group has six punishment opportunities. Because there were 17 groups, the total number of possible punishment acts was 17×6×50=5,100. Across the whole experiment we observe 493 acts of punishment (i.e., in 9.7% of all possible cases). Punishment was significantly more frequent in the first half than in the second half of the experiment (14.2 vs. 5.1%; n=17, z=3.29, p=0.001, Wilcoxon signed ranks test).
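The count of punishment opportunities can be checked with a few lines of arithmetic. The sketch below uses only the figures given in the text (17 groups, 3 members, 50 periods, 493 observed acts); the function and variable names are illustrative.

```python
# Reproduce the punishment-opportunity arithmetic from the text.
GROUPS = 17
MEMBERS = 3
PERIODS = 50
OBSERVED_ACTS = 493


def punishment_opportunities(groups, members, periods):
    """Each member can punish every other member once per period."""
    per_group_per_period = members * (members - 1)  # 3 * 2 = 6
    return groups * per_group_per_period * periods


total = punishment_opportunities(GROUPS, MEMBERS, PERIODS)
share = OBSERVED_ACTS / total

print(total)            # 5100
print(f"{share:.1%}")   # 9.7%
```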

The inset figure illustrates the median group with regard to contributions to the public good in the P experiment. This median group contributed 18.4 tokens on average and punished in 10% of all cases. Punishment occurred exclusively in the beginning of the experiment. From period 19 onwards not a single act of punishment was observed, and all contributions were maximal. Across all 17 groups punishment frequency and contribution level are significantly negatively correlated (Spearman rank, n=17, ρ=−0.75, p=0.0005). In the second half this correlation is even stronger (ρ=−0.93, p<0.0001).

My preferred interpretation of these results in the present context is that the first part of the experiment is a phase where common expectations are established, and behavior has to be coordinated accordingly. Once behavior and expectations are coordinated, equilibration has occurred and punishment is only rarely needed. The initial phase of the experiment might be seen as “artificial,” but since the experiment is a novel environment for participants, this is an inevitable part and the lab analogue of social learning in the field. In the field, social learning and cultural transmission (learning from peers, teachers, parents) teach people what constitutes socially acceptable behavior and what gets punished (e.g., Henrich 2004). Once expectations are formed and behavior is adapted accordingly, the same behavioral logic holds in the lab and the field, despite obvious environmental and complexity differences between them. To criticize strong reciprocity theorists for their “narrow focus on artificial environments” (target article, sect. 15, para. 2) therefore seems to miss the point.

Think of queuing as an example. Orderly queues are socially optimal, but individual incentives are to jump the queue. Many queues are surprisingly orderly and few instances of punishment occur (counterexamples exist, of course). People learn through observation and education how to behave in a queue. Chances are that a queue jumper will be told off and sent to the back of the line (which is an example of peer punishment). Many potential queue jumpers will think about this possibility and refrain from jumping the queue, making punishment a rare event. Thus, the behavioral logic of how to behave in a queue is the same as in the lab experiment. Sometimes queue jumpers go unpunished and queues break down, just as in the lab, where cooperation sometimes also works badly, despite, or because of, punishment (Herrmann et al. 2008).

The relevance of this example extends beyond queuing. Peer punishment as modeled in these experiments can be seen as an expression of social disapproval (Carpenter & Seki 2011; Masclet et al. 2003), which is ubiquitous in social life (think of ridiculing, gossiping, reprimands, social exclusion, etc.). Disapproval will often be costly for the sanctioned individual and in many cases will also be costly to the punisher, in terms of psychic costs (at least some people find it difficult to confront wrongdoers), foregone opportunities, and possible retribution. Therefore, one should not interpret these experiments too narrowly in terms of direct material costs alone. Modeling punishment in material terms is primarily done to control for individual incentives and to allow for exact theoretical predictions (Smith 1982). The behavioral logic of meting out and avoiding punishment in the lab is similar to disapproving of some people's behavior and avoiding disapproval. Peer punishment experiments should therefore be seen at least as much as models of social control or moralistic aggression as of direct material punishment.

The experiment also suggests that the failure to observe punishment in field contexts cannot be taken as evidence that punishment is irrelevant or a sort of lab artifact. The experiment shows that even occasional punishment can have a huge impact on pro-social behavior as compared to a situation where people know for sure that they can get away with selfish behavior. Further experiments that model other important aspects of reality, like the possibility of communication (e.g., Bochet et al. 2006), coordinated punishment (e.g., Casari & Luini 2009), third-party punishment (Fehr & Fischbacher 2004), assortative matching (e.g., Gächter & Thöni 2005), or the simultaneous presence of rewarding strategies (e.g., Rockenbach & Milinski 2006; Ule et al. 2009), also suggest that punishment will be rare in equilibrium and nevertheless have an important behavioral impact.

Finally, the equilibrium perspective suggests it is at least as important to focus on evidence about the punishment people would expect were they to transgress as on actual punishment, and to focus as well on institutions like monitoring that might result in punishment (Rustagi et al. 2010).

References

Bochet, O., Page, T. & Putterman, L. (2006) Communication and punishment in voluntary contribution experiments. Journal of Economic Behavior & Organization 60(1):11–26.
Boyd, R., Gintis, H., Bowles, S. & Richerson, P. (2003) The evolution of altruistic punishment. Proceedings of the National Academy of Sciences USA 100(6):3531–35. Available at: http://www.pnas.org/content/100/6/3531.
Carpenter, J. & Seki, E. (2011) Do social preferences increase productivity? Field experimental evidence from fishermen in Toyama Bay. Economic Inquiry 49(2):612–30.
Casari, M. & Luini, L. (2009) Cooperation under alternative punishment institutions: An experiment. Journal of Economic Behavior & Organization 71(2):273–82. Available at: http://dx.doi.org/10.1016/j.jebo.2009.03.022.
Chaudhuri, A. (2011) Sustaining cooperation in laboratory public goods experiments: A selective survey of the literature. Experimental Economics 14(1):47–83.
Cubitt, R., Drouvelis, M. & Gächter, S. (2011) Framing and free riding: Emotional responses and punishment in social dilemma games. Experimental Economics 14(2):254–72.
Fehr, E. & Fischbacher, U. (2004) Third-party punishment and social norms. Evolution and Human Behavior 25(2):63–87. Available at: http://linkinghub.elsevier.com/retrieve/pii/S1090513804000054.
Fehr, E. & Gächter, S. (2000a) Cooperation and punishment in public goods experiments. American Economic Review 90(4):980–94. Available at: http://www.jstor.org/stable/117319.
Fehr, E. & Gächter, S. (2002) Altruistic punishment in humans. Nature 415(6868):137–40. Available at: http://www.nature.com/nature/journal/v415/n6868/abs/415137.
Gächter, S. & Herrmann, B. (2009) Reciprocity, culture and human cooperation: Previous insights and a new cross-cultural experiment. Philosophical Transactions of the Royal Society B: Biological Sciences 364(1518):791–806. Available at: http://rstb.royalsocietypublishing.org/content/364/1518/791.abstract.
Gächter, S., Renner, E. & Sefton, M. (2008) The long-run benefits of punishment. Science 322(5907):1510. Available at: http://www.sciencemag.org/content/322/5907/1510.
Gächter, S. & Thöni, C. (2005) Social learning and voluntary cooperation among like-minded people. Journal of the European Economic Association 3(2–3):303–14.
Henrich, J. (2004) Cultural group selection, coevolutionary processes and large-scale cooperation. Journal of Economic Behavior & Organization 53(1):3–35.
Herrmann, B., Thöni, C. & Gächter, S. (2008) Antisocial punishment across societies. Science 319(5868):1362–67. Available at: http://www.sciencemag.org/cgi/content/abstract/sci;319/5868/1362.
Masclet, D., Noussair, C., Tucker, S. & Villeval, M.-C. (2003) Monetary and nonmonetary punishment in the voluntary contributions mechanism. American Economic Review 93(1):366–80. Available at: http://www.jstor.org/stable/3132181.
Rockenbach, B. & Milinski, M. (2006) The efficient interaction of indirect reciprocity and costly punishment. Nature 444(7120):718–23. Available at: http://www.nature.com/nature/journal/vaop/ncurrent/full/nature05229.
Rustagi, D., Engel, S. & Kosfeld, M. (2010) Conditional cooperation and costly monitoring explain success in forest commons management. Science 330(6006):961–65. Available at: http://www.sciencemag.org/content/330/6006/961.
Smith, V. L. (1982) Microeconomic systems as an experimental science. American Economic Review 72(5):923–55.
Ule, A., Schram, A., Riedl, A. & Cason, T. N. (2009) Indirect punishment and generosity towards strangers. Science 326(5960):1701–704. Available at: http://www.sciencemag.org/cgi/content/abstract/326/5960/1701.