“Crazy people are not crazy if one accepts their reasoning.”
Gabriel García Márquez, Of Love and Other Demons
Rational choice theory with its well-behaved, self-interested, context-independent preferences provides the ex ante theoretical framework that underpins classic welfare economics, and by extension benefit-cost analysis (BCA). Rational choice is useful to help make predictions about which policy option is likely to imply welfare gains or losses. But over the last two decades, behavioral economics – using psychological insight to sharpen economic principles – has emerged to challenge the rational underpinning of welfare/BCA economics (see Robinson & Hammitt, 2011; Viscusi & Gayer, 2015). People have imperfect rationality, social preferences, and present bias; people are averse to many things: ambiguity, loss, inequality, guilt, lying, disappointment, betrayal, complexity, regret, choice, innovation, envy, and so on (see e.g., the surveys by McFadden, 1999; Metcalfe & Dolan, 2012). From these observations, numerous models with nonstandard preferences now exist to help make predictions about welfare/BCA gains. But given so many options, it is hard to decide which nonstandard preference model, if any, should replace rational choice theory as the predictive guide to welfare/BCA. In his recent Ely Lecture, Chetty (2015, p. 29) makes a similar point: “One of the challenges practitioners face in incorporating behavioral insights is that there are myriad factors to consider, with little guidance about which factors are most important.”
We accept that our straw man – the search for a unique behavioral benchmark – is not explicit in the literature. But it is implicit. Going through the literature, one reads one model after another that typically introduces one aversion or bias at a time. Most papers on ambiguity aversion do not include guilt aversion, and vice versa. Most papers on lying aversion do not include present bias, and vice versa. Most papers on fanning out do not include social preferences, and vice versa. All these nonstandard preferences matter, but they likely do not exist in isolation from each other – they are modeled separately for analytical tractability, which makes sense. Modeling with austerity is a necessary choice. But this modeling strategy raises a question for pragmatic BCA: which nonstandard preferences emerge under which exchange conditions, and why? That known unknown is what we are trying to account for in this commentary on how to integrate behavioral economics into the theoretical underpinnings that guide BCA.
Rather than search for a unique behavioral benchmark, we make the case that the new benchmark will likely have to be an interval of “reasonable” outcomes conditioned on the likely set of exchange institutions that underpin choice and policy. We distinguish between past outcomes, which can be studied and which reflect the combined effects of rational and irrational choices, and the task of predicting future outcomes, which are unknown and depend on the institutional context in which the choice is made. This ex ante theoretical interval would be defined by alternative sets of exchange institutions (markets, missing markets, no markets) that provide people incentives to act either more or less rationally, thereby creating a behavioral confidence range around a “reasonable” upper benchmark. We do not flesh out all details on how to define this new interval behavioral benchmark – rather we make a case for why we think this could be a useful path forward. Instead of attempting to find “the unique” behavioral benchmark given sensitive context-dependent preferences, let us define institutional rules which can create theoretical points of reference that allow us to gauge welfare gains and losses within BCA. The idea is that we have a benefit interval against which one compares costs. If the interval exceeds costs, proceed; if it falls short of costs, stop. If the interval defining welfare gains straddles the costs, we have lost nothing – we return to the political process of decision making.
1 Benchmark behavior
We now make our case that we should explore the path of a theoretical interval behavioral benchmark. Economics uses benefit-cost analysis (BCA) to help sharpen public policy. The goal is to provide a logical framework to create and organize cogent data to help policymakers make decisions that yield more benefits at less cost for social welfare programs, health, education, transportation, and environmental protection. Rational choice theory has been the logical analytical framework underpinning BCA – calm, logical, smart, backward inducting, and forward looking. Economists use rational choice to frame decisions and to measure the consequences of alternative policy options and incentive schemes.
The challenge, however, is that rationality in economics is a social construct based on active market exchange, not an individual construct based on isolated introspection (Arrow, 1987; Becker, 2002). Using rational BCA principles to guide policy is problematic when environmental goods and services lack market exchange to encourage calm and consistent choices (see e.g., Crocker, Shogren & Turner, 1998). Nonstandard preferences can now play a role in choice – emotions, myopia, gaffes, and social preferences now matter, especially if no money pumps exist to punish inconsistent choice. The lack of an active exchange institution helps promote the schism between the nature of the model and the world of nature (see Kahneman & Tversky, 2000).
Behavioral economics and alternative models of choice have emerged as one path to help bridge this gulf. Behavioral economics has a role in the BCA of policy by applying psychological insight to reshape economic principles. Behavioral economics adds humanity to choice theory (see e.g., Thaler & Sunstein, 2008). To run parallel with the familiar idea of market failure (i.e., nonrivalry, non-excludability, asymmetric information), Shogren and Taylor (2008) lump the crowd of deviations from rationality into a catch-all term, behavioral failure. A behavioral failure implies that a person fails to behave as predicted by rational choice theory.
Our definition of rational choice has been purposefully selected to be narrow – rational selfishness. We used this definition as the “straw man” because this mindset has underpinned the vast majority of work within applied BCA. Over the years we have used this straw man as a way to highlight and stress to our colleagues the need to spend more time focusing on the behavioral underpinnings of policy relative to the data-driven econometric exercises based on rational selfishness that dominate our field. Many behavior puzzles can be “explained” with extra ancillary assumptions about what people value without giving up consistent choice. We think this approach, though, helps us make our case a bit stronger because it does point out that one can add degrees of freedom to capture behavior, but that in some cases, an extra degree of freedom captures more than just the behavior you are trying to explain. We still struggle to understand, for example, the exact behavioral difference – if any – between guilt aversion and lying aversion. Both add another degree of freedom to get at a similar underlying emotive feeling, but is one aversion strictly separable from the other aversion? Answering this as economists asks us to move deeper into the realm of psychology, and to address explicitly and directly the interaction of emotions (e.g., can economists define the cross-partial derivative between guilt and lying aversion?).
2 A pragmatic behavioralist
To put our work in context, consider Chetty’s case on how adding behavioral economics into welfare should be based on pragmatism – what actually works to improve welfare in practice, not on philosophical choice – what might work in theory. Chetty then offers two pragmatic notions as to how one might better integrate behavioral economics into BCA and welfare economics: develop new measurement tools (elicit subjective well-being with public opinion surveys or estimate sufficient statistics based on revealed preferences), and produce new theory (build new structural models).
At first glance, one might find it hard to disagree with either point. Better measurement and better theory have always been the goal. But for better or worse, Chetty’s pragmatic solutions cover a lot of the same muddy ground that nonmarket valuation work has trekked over for the last four decades. While fresh eyes re-examining old problems are always welcome, it is also useful to put his observations in context. Research exploring the integration of economics and the environment has invested substantial intellectual effort trying to understand how to reconcile observed behavior with theory and what any gap might mean for the BCA of public policy. And while suggesting that surveys, market data, and structural models offer pragmatic solutions is appropriate, it is worth understanding a bit more about what has happened over the last four decades and where the literature currently sits regarding behavioral economics and BCA.
First, developing new measurement tools to elicit benefits is worthwhile, and has been an ongoing process for decades. In fact, eliciting subjective measures of well-being, happiness, preferences, or benefits using nonmarket valuation methods generated the earliest arguments for and against adding psychological insight into BCA for environmental policy (see for instance the survey in Shogren, 2005). Valuing the benefits of environmental goods and services using stated preference and revealed preference methods has long rested on rational choice theory as the analytical framework. But as the contingent valuation (CV) debates of the 1970s, 80s, and 90s revealed, behavioral economics secured itself a spot in the discussion. For example, in the state of the art CV survey by Cummings, Brookshire and Schulze (1986), Daniel Kahneman warned how ignoring behavioral regularities would bias benefit estimates – like when people use CV bidding to signal generic or surrogate preferences for environmental protection rather than preferences for the specific good in question. Jack Knetsch (1997, p. 209) has also long stressed the cost of ignoring behavioral economics: “in view of the evidence, the seemingly quite deliberate avoidance of any accounting of these [behavioral] findings in the design of environmental policy or in debates over environmental values, does not appear to be the most productive means to improvement.”
The last three decades have witnessed numerous debates over Kahneman and Knetsch’s opinion that one cannot reconcile the idea of rational preferences with the psychological realities observed in the data (e.g., willingness to pay [WTP] versus willingness to accept [WTA], preference reversals, surrogate bidding, anchoring). See, for example, the survey by Carlsson (2010), who explores the importance of understanding the impact of constructed preferences, context dependence, and hypothetical bias on stated preference methods. If we revisit the idea that stated preference methods still serve as the pragmatic guide, what insight does behavioral economics have to offer that has not been explored in this large nonmarket valuation literature? The question of preference stability, standard or nonstandard, matters for theory and public policy because if preferences are “transient artifacts” contingent on context, so are the welfare measures used in BCAs to rationalize or reject regulations to protect health and safety. If preferences and decisions are context-dependent, our measures of benefits and costs will be context-dependent, which implies policy is also context-dependent (see Slovic, 1991; Tversky & Simonson, 1993; Rabin, 1998). The lack of preference stability goes against the notion that economics can establish some rational stable benchmark to judge policy success and failure.
One path is to find a mechanism that can reconcile decision utility (as defined by rational choice theory) with experienced utility (as promoted by behavioral economics, see Chetty, 2015). If decision utility and experienced utility are identical, rational choice theory is sufficient to model behavior (e.g., Cherry, Crocker & Shogren, 2003). Coming from the social psychology literature, one behavioral economic idea is commitment theory (see Joule, Girandola & Bernard, 2007). Commitment theory rests on the premise that we can create real economic commitment in a nonmarket choice through an instrument like a solemn oath. The oath-as-commitment device might be such a path to get people to match up decisions with experience in public opinion surveys – we do not know, but we think our initial experimental work has some potential. Jacquemet et al. (2013, 2016) explore whether an oath can improve demand revelation used in BCA, and they find that it does. They first ask whether people who take an oath to tell the truth bid more sincerely in an incentive compatible auction. This question arises because experimental evidence has provided weak support for sincere bidding at the individual level in demand-revealing auctions. They find that the oath induced better demand revelation in a second-price auction with and without monetary incentives. They next ask whether the oath will improve stated preference methods used to elicit WTP measures of values for nonmarket goods. This question arises because stated preference methods have never shaken the criticism of hypothetical bias – a person typically promises more than he or she can deliver.
The gap between intentions and actions arises either because the hypothetical context violates the budget constraint, inducing people to bid too high, or because the context of real bidding violates the participation constraint, causing people to bid too low to opt out of the auction, or potentially both. Jacquemet et al. find that the oath can work to get people to think seriously about both the budget constraint and the participation constraint – which suggests that it helps a person better align his or her decision and experienced utility. This point is speculative, however, and more data are required to establish this idea.
Second, if the commitment mechanisms cannot align decision and experienced utility, then we have to explain the gap with a new structural model. So we agree with Chetty that constructing new structural models to help economists frame choices is needed. This new structural model will also define new benchmarks in BCA. But our basic question remains – what is the new behavioral benchmark? Is there one new benchmark against which we judge whether a policy is welfare enhancing, or will there be a new benchmark for every context? The beautiful thing about BCA as defined by rational choice theory is a consistent benchmark across studies. If BCA expands to include the realities of bounded rationality, bounded self-interest, bounded willpower, and unbounded emotions in our measures of welfare, how should we define a new behavioral benchmark against which we can judge whether proposed environmental policy options are more or less efficient? If we can construct useful models that presume stable nonstandard preferences, then policy based on BCA can still work. But that would mean rejecting the notion that many psychologists have advanced that preferences are fungible – they are more affected by noneconomic contextual cues than economists have acknowledged or admitted.
For example, the interface between behavioral economics and public policy can be scattered and fragmented when compared to the more monolithic neoclassical literature on revealed preferences. General lessons are hard to come by, given the context-specific nature of theories and observation within behavioral economics. Numerous psychological explanations can be used to explain the same phenomenon. For instance, behavioral economics points to inattentiveness to incentives, over-confidence in future earnings, and present biases toward current consumption – all three can be used to rationalize low savings rates (see Mullainathan, Schwartzstein & Congdon, 2012). Smith and Moore (2010, p. 231) take an even stronger stance: “…the most carefully reasoned analytical arguments within the behavioral economics literature do not as yet have specific insights to offer for practical benefit-cost analysis” (also see Sugden, 2005).
In the environmental policy context, another example is energy efficiency and climate change risk. An “Energy Paradox” is said to exist when people buy less energy conservation than predicted by a present value calculation given, say, a tax on carbon. Several behavioral anomalies have been offered to explain this result – people have a large discount rate, they have trouble calculating expected fuel savings, they lock into the status quo, and they rely on heuristic decision-making strategies rather than optimizing net benefits. But these competing models need to be tested within the same experimental design. They are a collection of ideas. The policymaker who relies on BCA does not know for certain which effect dominates choices of energy conservation, and why this effect (or set of effects) is the key for predicting welfare gains (see Gillingham, Newell & Palmer, 2009). Policy options in such cases are limited to more education, information, and standard setting. If people are not responding consistently to pricing changes, BCA-based policy will not have the intended consequences, either in efficiency or distribution of burden (Galle, 2011).
3 An interval as a welfare guide rail
So where does this all leave us? We agree with Chetty in that BCA will always benefit from better measurement tools. The pragmatic search for more precise estimates of nonmarket preferences is still ongoing, and as much intellectual energy has been invested into this pursuit as any topic in environmental economics. Also the ever-present desire to better match theory, rational or otherwise, with observed behavior is just good science, even if the match implies we need context-dependent preferences. But what context matters most? For 200 years, economists focused a lot of attention on how people were averse to risk. Today, people are assumed to be averse to a long line of emotional adders: ambiguity, loss, inequality, guilt, lying, disappointment, betrayal, complexity, regret, choice, innovation, envy, and the list goes on. Understanding which aversion might dominate individual and aggregate behavior under what nonmarket conditions could prove to be an endless unanswered empirical question. Toss into the mix the observation that some people just do not want to reveal or know the truth about benefits or costs, either in regard to others or themselves. This strategic avoidance of information keeps the search to define a new unique behavioral benchmark a slippery task (see e.g., Thunström et al., 2014, 2016).
In our opinion, the challenges of ongoing measurement, the ever-shifting behavioral benchmark, and strategic self-ignorance return us full circle to a sensible point made by Peter Bohm. Over three decades ago, Bohm (1979, 1984) proposed the interval method in BCA: stop trying to find the exact point estimate – rather design our models and measurement tools to allow for an interval of values based on institutional incentives that trigger certain behavior. Bohm’s interval method captures the notion that rational people or imperfectly rational people have a range of values that emerge due to strategic reasoning based on the institutional setting (e.g., free riding, conditional cooperation).
We present a case for an interval method following Chetty’s (2015) model that examines policy options given a person has both experienced and decision utility. Let $u(c)$ represent a person’s experienced utility in which $c$ represents a vector of choices (e.g., consumption). Recall the idea of experienced utility represents the person’s ex post realized well-being from the choices (i.e., happiness, see Kahneman & Sugden, 2005). Let $v(c)$ represent his or her decision utility, which represents the ex ante objective he or she is maximizing when choosing $c$. Now Chetty allows nonstandard preferences to enter into the ex ante decision utility by assuming that utility is conditioned by external “nudges” from policymakers, $n$, (e.g., opt-in versus opt-out defaults) and exogenous intrinsic factors, $d$, that cannot be manipulated by nudges, such that $v(c\mid n,d)$. These non-nudgeable factors can include nonstandard preferences such as altruism, bounded willpower, guilt aversion, inequality aversion, regret, and so on. Let $p$ represent the pretax price vector on choices, and let $Z$ represent the person’s income.
Here is how we differ from Chetty based on our experience observing the range of standard and nonstandard behavior within the laboratory and field. Based on experimental data, we now allow a set of institutional exchange rules, $r$ – allocation rules, cost rules, sharing rules – to affect which intrinsic preferences come into play for a person, standard or nonstandard (e.g., Smith, 2003). We assume the nature of intrinsic preferences is rule-dependent, such that $v(c\mid n,d,r)$. A person evaluates which set of standard/nonstandard preferences works best for the exchange system in which he or she operates. For example, if one sets up a winner-take-all tournament with nonlinear payoffs, many people easily rationalize assigning 100% of the weight to self-interest and 0% weight to altruism in this environment (see e.g., the cutthroat behavior generated in Shogren, 1997). If the rules are a winner-take-all tournament, $r_{1}$, then we might well assume the preferences are standard self-interest, $d_{1}={\it\phi}$. Now this would suggest that the nudge should be conditioned on $d_{1}$, not a generic $d$. In contrast, if the rules are a type of collective common pool sharing system, $r_{2}$, in which altruism or reciprocity plays a key role to foster cooperation, the intrinsic factors differ, $d=d_{2}$. Also a gift exchange system, $r_{3}$, might define a third set of preferences, $d=d_{3}$. These rules differ from nudges in that the rules define the basic institutional exchange mechanism. These core rules differ from the more transient short-term nudges based on framing or information policy. Here we are capturing the idea that rationality is a social concept affected by the rules of the underlying exchange system. By choosing $r$, a planner affects the composition of $d$.
The exact cause-and-effect relationship between $r$ and $d$ remains to be established, but we do know based on evidence that competitive well-defined exchange institutions can promote more standard behavior, whereas cooperative or missing-market institutions can promote more nonstandard behavior.
The planner’s problem is now to select the set of rules, nudges, and price incentives $(t)$ to maximize experienced utility subject to a revenue requirement, $R$, and an incentive compatibility constraint, in which decision utility is now rule-based, $v(c\mid n,d,r)$,
(1) $\max_{t,n,r}\ u(c)\quad \text{s.t.}$
(2) $t\cdot c=R$;
(3) $c=\arg\max_{c}\{v(c\mid n,d,r)\ \ \text{s.t.}\ \ (p+t)\cdot c=Z\}$.
As Chetty explains, the standard neoclassical model emerges if one imposes restrictions on the ancillary conditions such that [we add (7), implying that preferences are rule-independent]:
(4) $n={\it\phi}$
(5) $d={\it\phi}$
(6) $u=v$
(7) $r={\it\phi}$
Chetty then goes on to argue that one can understand the welfare implications of a policy by relaxing restrictions (4)–(6) [and implicitly (7)] and by framing the problem as a classic Pigovian externality problem. The planner’s new problem would be modified such that the goal is to maximize $v(c\mid n,d,r)+e(c)$, where $e(c)=u(c)-v(c\mid n,d,r)$ is the “internality” created by the gap between experienced and decision utility, so that the objective still equals the experienced utility $u(c)$ in (1).
In BCA, we might be using stated preference methods to measure the implied welfare changes represented by this gap between experienced and decision utility. Stated preference methods are similar to public opinion surveys, and can be used to measure both decision utility and experienced utility. Defining one theoretical benchmark internality $[e(c)]$ would require knowing or presuming the composition of preferences, $d$, and presuming this composition was independent of the underlying exchange rules. Under one exchange rule system, the internality gap and related policy welfare gains/losses might be small; under another system, the gap and gains/losses might be large. What we are suggesting is that rather than trying to guess/assume what the nature of preferences might be, $d$, for any given policy being evaluated by BCA, one can create an interval of benchmark preferences based on manipulating the rules of the exchange institution.
We are speculating: we are asking whether one can use the exchange rules to our advantage by generating preference benchmarks that would likely emerge given the rules. The ideal would be to construct an interval of internalities $[e(c_{1}),\,e(c_{2})]$ based on exchange rules that were selected to induce more standard preferences $(d_{1})$ or more nonstandard preferences $(d_{2})$ in a predictable fashion. If the internality interval is trivial after inducing both standard and nonstandard preferences, the subsequent BCA estimate of welfare gains/losses is more robust. These BCA estimates are then more transferable to other situations as well, given they might differ by exchange institution (see the work on benefits transfer). If the gap is large, however, and implies potentially significantly different estimates of gains/losses under different exchange conditions, it would be useful to know. It would be helpful to understand which institutional conditions exist or are likely to exist in the near future.
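To make the internality interval concrete, the following is a minimal numerical sketch and not part of Chetty's model or our argument. It assumes a single good, a quadratic experienced utility, and a decision utility that differs from it only by a linear shifter standing in for the rule-induced preferences $d$. It also assumes a quasilinear setting in which the person simply pays the price per unit, a simplification of the budget constraint in (3). All functional forms, parameter values, and names (`Rule`, `bias`, `internality_interval`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """An exchange rule r and the (assumed) preference shifter d it induces."""
    name: str
    bias: float  # hypothetical scalar stand-in for d; 0.0 means standard preferences


def experienced_utility(c: float, a: float = 10.0) -> float:
    # Assumed form: u(c) = a*c - c^2/2
    return a * c - 0.5 * c ** 2


def decision_utility(c: float, bias: float, a: float = 10.0) -> float:
    # Assumed form: v(c | d) = (a + d)*c - c^2/2
    return (a + bias) * c - 0.5 * c ** 2


def chosen_consumption(bias: float, price: float, a: float = 10.0) -> float:
    # The person maximizes v(c | d) - price*c; the first-order condition
    # gives c = a + d - price, truncated at zero.
    return max(a + bias - price, 0.0)


def internality(rule: Rule, price: float) -> float:
    # e(c) = u(c) - v(c | d), evaluated at the consumption the person chooses.
    c = chosen_consumption(rule.bias, price)
    return experienced_utility(c) - decision_utility(c, rule.bias)


def internality_interval(rules: list[Rule], price: float) -> tuple[float, float]:
    # The interval spans the internalities induced by the alternative rules.
    values = [internality(rule, price) for rule in rules]
    return min(values), max(values)


rules = [
    Rule("winner-take-all tournament (r1)", bias=0.0),
    Rule("common pool sharing (r2)", bias=2.0),
]
low, high = internality_interval(rules, price=4.0)
print(f"internality interval: [{low}, {high}]")
```

Under these assumed forms the standard-preference rule produces no internality, and the width of the interval is driven entirely by the shifter attached to the nonstandard rule. A narrow interval would correspond to the robust case described above.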
4 Concluding remarks
The array of behavioral failures suggests that such a valuation interval could also arise due to context-specific issues like uncertainty over preference for unfamiliar goods, decisions based on heuristics, or coherent arbitrariness (consistent behavior arising from an arbitrary starting point). In their most recent work, Bernheim (2014, 2016) and Bernheim, Fradkin and Popov (2015) make a similar point: one can fit the broader range of behavior into BCA and welfare economics without knowing the “true” theory of decision making, if one is willing to accept some ambiguity. Bernheim’s (2016) approach allows for observed choices to be based on and driven by numerous underlying, otherwise confounding, behavioral factors: “Instead, we are free to explore the possibility that combinations of variables measuring different types of feelings (joy, satisfaction, anxiety, fear, etc.), or the trajectory of feelings over time, might turn out to yield better predictions.”
If we leave the theoretical safety of the rational choice benchmark for another nonstandard preference theoretical model, we have minimal guidance on how to choose which one is best (also see Viscusi & Gayer, 2016). Without obvious theoretical guidance, we look at the evidence, which strongly suggests that observed behavior is affected by the exchange rules that underpin choice. The interval idea is that we use the exchange rules to guide which theoretical model best reflects the choice at hand. If we use Rule A (winner-take-all market exchange), we can expect that more standard preferences will emerge; if we use Rule B (gift exchange), we can expect that more nonstandard preferences will emerge. Both sets of preferences can be internally consistent, but they imply a bigger or smaller “internality” between the ex ante decision and the ex post experience. We are proposing to use both theories: to incorporate these rules of exchange in understanding what intrinsic preferences are likely to emerge in a stated preference survey, and to use these rules to help define the range of behavior (the interval) that is likely to emerge when estimated. If this entire range exceeds or falls short of the cost estimate in the BCA, we can be more confident in accepting or rejecting the policy. If the interval straddles the costs, we have lost nothing – now the policy returns to the political arena for a tough decision.
Should we presume our benchmark will be an interval of values that reflects whatever underlying bias and strategic behavior exists within the context at hand? One can measure the interval of preferences that emerge from such a confluence. Hanley, Kriström and Shogren (2009) provide one example of the interval approach to estimate the value of beaches in Scotland, and Banerjee and Shogren (2012) explore the efficiency of an incentive compatible second-price auction when people have interval values and can submit an interval bid. Similar to the idea of not trying to isolate and identify “use versus non-use values” in nonmarket valuation, one does not try to identify and isolate each behavioral bias within each context. Again, one estimates a benefit interval against which one compares costs. If the interval exceeds costs, proceed; if it falls short of costs, do not. If the interval straddles costs, the policy falls back into the political arena, a context familiar to any practitioner of BCA. The interval method suggests that the answer is “no” to the question we pose in our title: we should not be searching for a new, unique behavioral benchmark for BCA; rather, we need a better understanding of the range of values that emerges under alternative but unmeasured emotional interactions given institutional contexts and measurement tools.
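The interval decision rule can be written as a short sketch. The function and verdict names are hypothetical, and the treatment of the boundary case is an assumption: a cost exactly equal to an endpoint of the benefit interval is counted here as straddling, a case the text does not address.

```python
from enum import Enum


class Verdict(Enum):
    PROCEED = "proceed"
    STOP = "stop"
    POLITICAL_PROCESS = "return to the political process"


def interval_verdict(benefit_low: float, benefit_high: float, cost: float) -> Verdict:
    """Compare a benefit interval [benefit_low, benefit_high] with a cost estimate."""
    if benefit_low > benefit_high:
        raise ValueError("benefit_low must not exceed benefit_high")
    if benefit_low > cost:
        # The entire interval exceeds the cost.
        return Verdict.PROCEED
    if benefit_high < cost:
        # The entire interval falls short of the cost.
        return Verdict.STOP
    # The interval straddles (or, by assumption, touches) the cost.
    return Verdict.POLITICAL_PROCESS


print(interval_verdict(12.0, 20.0, 10.0).value)
print(interval_verdict(2.0, 8.0, 10.0).value)
print(interval_verdict(8.0, 12.0, 10.0).value)
```

The rule is deliberately coarse: it says nothing about where in the interval the true benefit lies, which is why the straddling case is handed back to the political arena rather than resolved by the analyst.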