1. Introduction
Scientific interest in willpower has grown in recent years.[Footnote 1] It figured prominently in the Victorians' quest for social improvement, but waned during the early twentieth century – perhaps partly because of its lack of precision. “Will” itself gets applied to at least three somewhat independent functions: the initiation of movement, which corresponds to the Cartesian connection of intention with action; the ownership of actions, which gives you the sense that they come from your true self (Wegner, 2002)[Footnote 2]; and the maintenance of resolve against shortsighted choices. When you will your hand to pick up a chocolate, will makes your hand move, and there is a “you” that feels like it's doing the willing, but you may also be failing to exert your will not to eat chocolate (discussed in Ainslie, 2004).
The third usage becomes more specific if converted to “willpower,” but it still means different things to different authors. Internal self-control has been described in many ways over the years. The topic has many clinical implications, so it is often discussed by authors who are not concerned with motivational bookkeeping.[Footnote 3] However, in a model where choice is determined by the competition of internal interests that depend on prospective reward, the possibilities for how one interest survives against more strongly motivated competitors are limited. They can be illustrated by the analogous problem of how one interest in a legislature can keep an unpopular measure from being voted down: It can tack the proposal onto a larger bill that is more popular; or, while it holds the floor, it can avoid recognizing opposing parties. In the news, we see legislators use either or both, and nothing else. Analogously, if we want to model willpower as a phenomenon within the competitive marketplace of reward, we have only two kinds of mechanisms. In keeping yourself from eating the chocolate, you can resolve not to eat it on the basis of larger incentives, and/or suppress urges to eat it to defend a current intention. The present author has always written about willpower as synonymous with resolve, but a great deal of different usage, as well as recent brain imaging, calls for looking at how this mechanism co-exists with suppression.
Resolve is a way of managing motivation to maintain the plan that seems best from a broad perspective in the face of expected temptations – options that might become dominant during future valuations. Because revaluations will inevitably occur, successful resolve must include means to maintain its motivational dominance over time. Suppression is a way of gating out alternatives to a current intention while ignoring their possible value; it is necessarily unstable. Consistent long-term choice depends on resolve, but in recent academic discussions on willpower, resolve has often been replaced by, or confounded with, suppression. Loosely speaking, most philosophy (sect. 3.2.4) and the game-theoretic approach to reward theory (sect. 3.2.1) have equated willpower with resolve, whereas most experimental psychology (sect. 2.1), including brain imaging (sect. 4), has equated willpower with suppression. Economists have recently proposed theories using each model (sects. 2.1 and 3.2.2), and clinically-oriented social psychologists, although less systematic, have described elements of resolve (sect. 3.2.3). Recently, a third phenomenon, habit, has been proposed as a beneficial alternative to willpower (sect. 3.3). This article will propose how the motivational bases of these three processes determine their distinct and sometimes symbiotic operations.
Choices that evoke willpower typically compare options that pay off over different time courses, with poorer but faster paying ones weighed against the better but slower paying. In the laboratory, these options are usually offered as a smaller, sooner (SS) reward versus a larger, later (LL) one, with a fixed lag between the times when they are available and a variable delay before the SS reward. Preference for the fast-paying option is often temporary, arising only when the SS reward is close; this is the familiar phenomenon of temptations, urges, or impulses, against which willpower is marshalled. Conversely, there are temptations to gain sooner relief from aversive experiences that will be worse if delayed, the net effect of which is the same as the choice between SS and LL rewards.[Footnote 4] The consequences of impulsiveness may be trivial, as in preference for fast payouts in video games that reduce your score (Wittman, Lovero, Lane, & Paulus, 2010), in preoccupation with video games themselves (Griffiths, 2008), or in everyday procrastination. But impulsive preference patterns are also evident in such consequential problems as drug addictions, bad health care decisions, unsafe sex, and failures to save for the future. Failures to prepare for the future may include participation in social decisions with shared impact, such as those about climate change (Gollier & Weitzman, 2010), population policy (Keiner, 2006), and social investment (Arrow, 1999). Such problems have made impulse control a major topic in behavioral science, reflected in the many synonyms that imply one sub-agent within the person acting on another: self-control, self-regulation, self-command, self-denial, self-discipline, self-mastery, self-restraint, and self-government.
After reviewing the common explanations for how SS options tend to get chosen over LL options (sect. 2), this article will examine the mechanisms by which internal interests based on LL rewards have been proposed to counteract this tendency: suppression (sect. 3.1), the operational cost of which is often called effort (sect. 3.1.1); resolve (sect. 3.2), for which the mechanism of recursive self-prediction (sect. 3.2.1) has support in behavioral economics (sect. 3.2.2), social psychology (sect. 3.2.3), and philosophy (sect. 3.2.4), and is argued to be observable directly (sect. 3.2.5); and habit (sect. 3.3), in which the routine simplification of action (sect. 3.3.1) is distinguished from the outcome of resolve (sect. 3.3.2) and its failure (sect. 3.3.3). Suppression has at least one apparent correlate in brain activity, and it is argued that future research could show resolve by an inverse of this correlation (sect. 4). A concluding essay relates willpower to the evolutionary problem of achieving consistent choice as foresight increases (sect. 5).
2. Theories about impulses
Arguments persist about the nature of impulsive motives – why willpower is necessary to begin with. Impulsiveness was not contemplated in behavioral and economic models during most of the twentieth century, which depicted all organisms as naturally maximizing their expected utility (“expected utility theory” or “rational choice theory” – Posner, 1998; Samuelson, 1937; Sugden, 1991). It was evident even then that people tend not to do this. Furthermore, the frequent failure of education to produce consistent choice has argued for more than a wandering mind or weakness of intellect, but rather for a robust process of temptation by options that are preferred only temporarily.
Current theories of impulsiveness extrapolate from three kinds of experimental finding.
2.1 Visceral factors
Arousal of emotion or appetite increases preferences for SS rewards. For instance, sexual arousal changes self-reported preference not just for bad sexual choices (Ariely & Loewenstein, 2006) but for SS money rewards as well (Van den Bergh, Dewitte, & Warlop, 2008), one of many “carryover” effects that have been reported (Lerner, Li, Valdesolo, & Kassam, 2015; Luo, Ainslie, & Monterosso, 2014). Such findings have led to general theories that arousal of emotions or appetites – “visceral” processes – is generally responsible for impulsive choices (Loewenstein, 1996; McClure, Laibson, Loewenstein, & Cohen, 2004). Because distinct brain centers are active during arousal of some appetites/emotions, their motivational effect is often proposed to be a separate, “hot” kind of reward that is discounted for delay faster than more rational, “cool” rewards,[Footnote 5] making their evaluation “myopic” (Loewenstein, O'Donoghue, & Bhatia, 2015; Metcalfe & Mischel, 1999; van den Bos & McClure, 2013), and leading to temporary preference for them. Figure 1A depicts the values of an SS and alternative LL reward as the discounted sum of hot and cool values for each, but this depiction may be oversimplified. Data are lacking on the form of value discounting from hot versus cool rewards, including how the duration of the arousal affects them, and how a reward that depends on arousal is evaluated before the arousal happens; other models could account approximately for the arousal effect shown here.
Furthermore, it is now clear that the single-choice comparisons depicted in Figures 1 and 2 are themselves oversimplifications – that people usually try out choices several times mentally before the prospective value of one reaches a threshold for action, a noisy process called drift diffusion (Pedersen, Frank, & Biele, 2017). The figures should, therefore, be understood as a central tendency or median in such clusters of vicarious trials.
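The noisy threshold process just described can be illustrated with a minimal random-walk sketch. This is not the model fitted by Pedersen and colleagues; it is a generic drift diffusion simulation in which evidence for one option accumulates with a constant drift plus Gaussian noise until it crosses one of two bounds. The function name, the parameter values, and the fixed random seed are all assumptions made for illustration.

```python
import math
import random


def drift_diffusion_trial(drift, threshold=1.0, noise=1.0, dt=0.01,
                          rng=None, max_steps=100_000):
    """Accumulate noisy evidence until it crosses +threshold or -threshold.

    Returns "upper", "lower", or None if no bound is reached in max_steps.
    A positive drift favors the upper bound. All values are illustrative.
    """
    rng = rng or random.Random()
    evidence = 0.0
    for _ in range(max_steps):
        # Constant drift plus Gaussian noise scaled to the time step.
        evidence += drift * dt + noise * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        if evidence >= threshold:
            return "upper"
        if evidence <= -threshold:
            return "lower"
    return None


# Repeat the same choice many times: the favored option wins most trials,
# but the weaker option still reaches its bound on some of them.
rng = random.Random(0)
choices = [drift_diffusion_trial(0.5, rng=rng) for _ in range(500)]
upper_share = choices.count("upper") / len(choices)
print(f"share of trials ending at the favored option: {upper_share:.2f}")
```

On this reading, a single curve in a figure stands for the typical outcome of many such trials rather than for any one of them.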
2.2 Hyperbolic delay discounting
Granting a role for visceral factors in some impulses, this category is still too narrow to account for all cases of temporary overvaluation of SS alternatives. This occurs in many situations where differential appetite is not a factor: variously because there is substantial delay before the SS as well as the LL outcome (Green, Myerson, & Macaux, 2005); where there is so little delay before the LL outcome that both occur during arousal (Wittman et al., 2010); or where arousal is not involved – as in simple procrastination (Ainslie, 2010a). Authors have sometimes noted that “near-term impulsivity can be expressed for monetary rewards at delays of several months” (McClure & Bickel, 2014, p. 67), thus recognizing such glacially slow impulsivity that visceral arousal is unlikely to be a factor.
Even without visceral factors, the shape of the discount curve describing how delay affects expected reward predicts a universal tendency toward impulsiveness. Experiments across species have found that the value of various rewards declines with delay in a hyperbolic curve (Green & Myerson, 2013; Johnson & Bickel, 2002; Kirby, 1997; Shapiro, Siller, & Kacelnik, 2008), even over tens of milliseconds (Haith, Reppert, & Shadmehr, 2012); Wulff & van den Bos discuss alternative interpretations (2018). In hyperbolic curves, the value of prospective events is plotted against their delay as an inverse proportion, with an impatience factor (k) in the denominator.[Footnote 6] Hyperbolic discount curves describe the observed changes of preference from LL to SS rewards as the common delay before both options gets shorter (Fig. 1B).
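As a minimal sketch of this preference reversal, the following Python snippet assumes the commonly used hyperbolic form V = A / (1 + kD), where A is the undiscounted amount, D the delay, and k the impatience factor. The reward amounts, the lag, and the value of k are hypothetical and chosen only for illustration, and the function names are not taken from the source.

```python
def hyperbolic_value(amount, delay, k=1.0):
    """Discounted value under the assumed form V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay)


def preferred_option(common_delay, ss_amount=50.0, ll_amount=100.0,
                     lag=1.5, k=1.0):
    """Return "SS" or "LL", whichever has the higher discounted value.

    The SS reward arrives after `common_delay`; the LL reward arrives
    `lag` time units later. All default numbers are illustrative.
    """
    ss_value = hyperbolic_value(ss_amount, common_delay, k)
    ll_value = hyperbolic_value(ll_amount, common_delay + lag, k)
    return "SS" if ss_value > ll_value else "LL"


# Viewed from a distance the LL reward is preferred; once the common
# delay shrinks to zero the preference has shifted to the SS reward.
for common_delay in (10.0, 5.0, 1.0, 0.0):
    print(common_delay, preferred_option(common_delay))
```

With an exponential curve in place of `hyperbolic_value`, the ratio of the two values would be the same at every common delay, so no such reversal would appear.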
The observation of hyperbolic delay discounting in nonhumans and children (Beran, 2018, pp. 121–186; Green, Myerson, & Ostaszewski, 1999; Scheres et al., 2006; Steinberg et al., 2009) suggests that it is an inborn psychophysical tendency.[Footnote 7] It is true that many people learn consistent financial planning, and with many experimental designs grown subjects do not report overvaluation of SS options. When investigators have focused on individual differences in adults' financial preferences, the reports of about half of subjects fit exponential (rational) discount curves better than hyperbolic ones (Harrison, Hofmeyr, Ross, & Swarthout, 2018; Hofmeyr et al., 2017). However, these subjects presumably developed from children who discounted hyperbolically, arguably by developing compensatory techniques rather than by learning to modify directly the inborn mechanism of reward (discussed in Ainslie, 2001, pp. 35–38).
2.3 Habit
According to folk psychology, repeated choices in the same direction gather “force of habit” from repetition alone, and then require willpower if you want to change them. The reports of some addicts, for instance, that they no longer experience a choice about whether to go on consuming have led to speculation that drug “habits” are just that – “trenches … like the ruts carved by rainwater in the garden” through sheer repetition (Lewis, 2017). It has been suggested that drug addicts' particularly entrenched habits are caused not just by the well-recognized cumulative dopaminergic increase in the rewarding power of drugs (Holton & Berridge, 2013; Volkow et al., 2010), but also by drug-induced damage to the brain mechanism that shifts between habitual and deliberate (“model-based”) choice (Ersche, Roiser, Robbins, & Sahakian, 2008; Everitt & Robbins, 2005). In this view, addictive behaviors may no longer even be based on motivation, but are released automatically (or robotically) by stimuli associated with consumption.
3. Theories about impulse control
Authors have described two general kinds of tactics to counteract impulses: forestalling them in advance and acting while they are present. Means of forestalling changes of preference in advance (for instance, Duckworth, Gendler, & Gross, 2016) are straightforward, and are not usually counted as forms of willpower. More puzzling have been means that act simultaneously with the impulse, any of which is apt to be called by that name. Willpower is the process of overcoming a seemingly superior, currently available SS reward to get an LL alternative – tacking against the wind, as it were. Published proposals invoke three kinds of mechanisms:
• While intending to wait for an LL alternative, a person may block or otherwise interfere with revaluation that might lead to change of intention, and continue blocking it while the SS reward would be superior. Call this suppression.
• While evaluating immediate options a person may perceive herself to be facing greater incentives than are literally at stake in the current choice. I will argue that resolve is intention that is stabilized by avoiding a perceived risk to such incentives. Authors have described resolve in many different terms, including broad choice bracketing, self-efficacy, high level construal, implementation intention, non-reconsideration, and cognitive re-framing. Sections 3.2.2–3.2.5 will cover how these proposals are related to this perception of risk, sometimes with an admixture of suppression.
• A person may somehow bypass valuation entirely, as is sometimes supposed to occur in habit.
Of course, the pathway to an impulse control mechanism is itself choosable, and thus must originate through a prediction that its value will exceed its cost in the marketplace of reward.[Footnote 8] The expected value of control can be conceived as the aggregate of amounts of reward by which the LL course of action will eventually exceed the SS course. The cost has two components: short-term loss and operational expense. The short-term loss is the amount by which the discounted value of the SS option temporarily exceeds that of the LL option – usually greatest when the SS reward is close. This temporary SS-over-LL value defines the motivational force of the impulse. The operational expense is the additional amount of reward, if any, that will be lost by trying to counteract this force (Shenhav and colleagues [2017] propose a taxonomy). Both kinds of costs have sometimes been called effort. In models proposed by some economists effort is simply a reflection of the temporary SS-over-LL value – the short-term loss (Fudenberg & Levine, 2006, p. 1455; Gul & Pesendorfer, 2001, 2004). But short-term loss is just the size of the challenge. I will use effort to describe only operational expense – the loss of reward from using a particular mechanism of impulse control per se. This expense varies greatly with the details of the three mechanisms that have been proposed.
3.1 Suppression
It is not clear what keeps the steps of even an ordinary intention together from moment to moment, against continual distractions. Such microscopic continuity seems not to have been analyzed in motivational terms, despite being implicit in the many executive functions studied by psychologists (e.g., Miyake & Friedman, 2012). Although theorists have sometimes imagined that ongoing behavior is revalued continuously, this would be extraordinarily expensive of cognitive capacity, and should prevent the smooth execution of intentions. Excessive revaluation has been blamed for stuttering, for instance (Civier, Tasko, & Guenther, 2010), poor singing technique (Hoch & Lister, 2016, pp. 76–78), and probably other forms of self-conscious awkwardness and pathological doubt. On the contrary, moment-to-moment execution seems to occur routinely without revaluations. If you intend to jump over a puddle on the sidewalk, some kind of editor normally suppresses urges that might distract you – to scratch an itch, to glance behind you, even to revalue your choice too late. This is the function which, if you only half want to jump over the puddle, keeps you from only half jumping over it – whichever side wins stiff-arms the other. Such suppression gives intentions a limited flywheel property, like the power of a chairman to defer votes. It does not depend on further valuation; indeed it may depend on avoiding revaluation.
In recent decades, suppression of impulses has taken an outsized role in theories of willpower, probably because it lends itself to experimental manipulation. The marshmallow-type temptation experiments of Mischel and colleagues elicited a subject's intention to wait for an LL food reward, then observed how she tried to avoid the revaluation that would shift her choice to the SS alternative. Subjects' attempts to avoid arousing appetite (“hot thinking”) and to divert attention have stood up in subsequent research as the two basic pillars of suppression (Mischel et al., 2011). To test the limits of suppression, experimenters have often set subjects a task that entails monotonously repeated actions – for instance, press a button if you see x but not y (many examples in Ackerman, 2011; Hagger, Wood, Stiff, & Chatzisarantis, 2010; Kurzban, Duckworth, Kable, & Myers, 2013). Social psychologists' interest picked up with the finding that subjects' work on an unattractive task apparently reduced how long they performed an unrelated task that required similar behavior (Baumeister, Gailliot, DeWall, & Oaten, 2006; Muraven & Baumeister, 2000). Furthermore, if subjects were put to the same task later, they were reported to perform longer than they did previously, a practice effect. Perhaps sustained suppression was willpower!
These authors hypothesized that will is a discrete faculty like a muscle, but with its own sequestered motivation, specialized in maintaining a preference over repeated choices. Some economists also adopted a separately motivated faculty of will – by analogy, sometimes explicitly, to Baumeister's will-muscle (Benhabib & Bisin, 2005; Fudenberg & Levine, 2006; Gul & Pesendorfer, 2001; Loewenstein et al., 2015; several discussed in Ainslie, 2012, pp. 21–26). These models depict a fuel-like motivation that supplements the otherwise inadequate value of the LL option. The motivation was consumed as it operated, just as glucose is consumed by a flexed muscle. The moment-to-moment depletion of this motivation was said to be what was experienced as effort. However, subsequent research has shown that mere expectation of an impending effortful task has the same effect as completing it: Subjects who expect to be in an effortful situation show the same reduction in self-control as if they had already undergone the effort, which rules out literal exhaustion (or “depletion”) as a mechanism (Muraven, 2006). Furthermore, the attenuation effect itself is now in question: Re-running ego depletion procedures while correcting for various sources of bias has produced evidence that the effect is small or even non-existent (Hagger et al., 2016; Xu et al., 2014), although some methodological dispute remains (Friese, Loschelder, Gieseler, Frankenbach, & Inzlicht, 2019). But to the extent that the attenuation effect is real, it must be simple willingness, not willpower, that is depleted. That is, the suppression task stops being worth the effort.
3.1.1 Why is suppression effortful?
Theorists have struggled to explain the cumulative cost of effort in motivational terms, with the sheer burden of information processing usually found to be an inadequate cause (Shenhav et al., 2017). In the laboratory, effort is often studied by the fatigue it accumulates (Ackerman, 2011). Hockey reviews the long history of fatigue theory, and complains that most authors have been misled by the analogy to engines running out of fuel (2011). He suggests that the process of self-control might become increasingly aversive because of an “effort monitor”:
Maintaining a specific cognitive goal means necessarily suppressing all others … It is argued that the fatigue state has a metacognitive function, interrupting the currently active goal and allowing others into contention (Hockey, 2011, p. 173).
Boksem and Tops suggest a similar mechanism that evaluates whether “energetical costs … exceed perceived rewards of task performance,” and if so generates “a drive to abandon behavior” that can be called fatigue (2008, p. 135). In a more detailed proposal,
many experiences, particularly the more or less unpleasant sensations discussed here (e.g., effort, boredom, fatigue), can be profitably thought of as resulting from (1) monitoring mechanisms that tally opportunity costs, which (2) cause an aversive state that corresponds in magnitude to the cost computed, which (3) enters into decision making, acting as a kind of a “vote,” influencing the decision ultimately taken (Kurzban et al., 2013).
In short, it has been suggested that “mental effort reflects the opportunity costs associated with allocating a valuable but limited resource – the capacity for control” (Shenhav et al., 2017, p. 106).
However, in these models effort and the resulting fatigue are said to be mechanisms that protect long-term reward, and their aversiveness grows as time is wasted. But wastes of time do not typically feel effortful, and often do not fatigue. Nor is it clear why a special mechanism is needed to generate aversion to diminished prospects – what would this concept add to avoiding loss of prospects tout simple? By contrast, I would argue that current mental activity is a source of reward in its own right, based on the game-like properties of imagination (Fox et al., 2018; see Ainslie, 2017), and that its restriction by continuous vigilance against impulses imposes a direct cost. Various much-studied routine tasks are indeed effortful because they occupy your attention, but this simply keeps you from activities that are richer in current reward. Discomfort accumulates while whatever interestingness the task originally had habituates, as shown by its partial relief if some variety is added to the task (Converse & DeShon, 2009; Hockey, 2011).
In any case, suppression entails operational expense. The very experience of asking yourself whether a particular suppression is worth the effort demonstrates the limited stability of this mechanism: Suppression is subject to intermittent revaluation, so it cannot sustain an intention over long periods of time. In close contests, a drift diffusion model of noisy choice (Pedersen et al., 2017) predicts that LL intentions may get random turns on top, and thus repeated chances to renew suppression, perhaps leading to the common impression that a weaker alternative is holding a stronger one at bay. That is, the threshold for calling on suppression may be lower than the relevant threshold for action. However, reliance on suppression is still just a game of keep-away with SS alternatives, and these can use suppression in turn. To be a robust tactic against impulsive choice, suppression must be directed by motivation – which, if it wobbles amid moment-to-moment suppression, must be stiffened by resolve.
3.2 Resolve
In ordinary speech, resolve just means firm intent,[Footnote 9] but what makes one intention firmer than another? The connotation is not “riding on a great wave of motivation,” but rather “standing against contrary waves.” That is, resolve is intent that is maintained by an enforcement mechanism. The classical strategy of achieving stable intentions, of “continence,” has been to recruit a set of similar motives that would stand on the side of the intention in question. Referring to dispositions to choose as “opinions,” Aristotle said, “We may also look to the cause of incontinence scientifically in this way: One opinion is universal, the other concerns particulars …” (ca. 350 B.C.E./1984: Nicomachean Ethics, 1147a, pp. 24–28). It was going by universal “opinions” that made you continent.[Footnote 10] For the experts on will in Victorian times the active ingredient was to “unite … particular actions … under a common rule” (Sully, 1884, p. 663). This was a process of forming resolve by valuation, of bookkeeping in an open marketplace: “Both alternatives are held steadily in view, and in the very act of murdering the vanquished possibility the chooser realizes how much in that instant he is making himself lose” (James, 1890, p. 534). Weakness of will – akrasia – was failure to think categorically, a deficiency still implicated by modern theorists (Heyman, 1996; Read, Loewenstein, & Rabin, 1999). Psychologist Howard Rachlin, for instance, has pointed out that seeing particular choices as part of larger, “molar” patterns may in itself predispose the actor to more LL choices, just as someone would be esthetically deterred from changing single notes in a symphony (1995). But what gives a molar pattern its edge?
These descriptions are agnostic about why someone should have a different preference in a single choice than in a set of similar choices to be made all at once. More importantly, they do not identify what induces – or constrains – a person to view her current choice as part of a larger category, rather than evaluating it by itself.
3.2.1 A behavioral reward model: intertemporal bargaining
A model based on behavioral studies of discounting delayed reward offers an explicit answer to these questions.
Hyperbolic discount curves offer a motivational basis for the two key properties of willpower in classical accounts – that is, of resolve: (1) increased preference for LL alternatives when choosing between whole categories, and (2) incentive to refer individual choices to such categories. (1) In cases where a single SS reward has more present value than an LL alternative, the sum of hyperbolic curves from a whole series (or bundle) of the same SS rewards often has less present value than the summed series of their LL alternatives, even when the first SS reward would be immediate (Fig. 2).[Footnote 11] Therefore, uniting bundles of choices “under a common rule” should indeed result in more patience. Valuation of conventionally (exponentially) discounted choices in bundles would not increase LL choice (see Ainslie, 2001, pp. 81–84 or 2005, pp. 640–641). (2) The obvious limitation of just assembling a bundle of future choices is that a combination of SS reward in the current choice, to be followed by LL rewards ever after, will always have more prospective discounted value than a series of all LL rewards. A plan that permits the current SS reward will always win. Something needs to enforce the common categorization in the face of immediate temptations. The experience is commonplace. Why not eat this piece of chocolate – it will barely show? Why not dip into savings to buy a fancy car – there will still be plenty left? What would be the harm? The harm, of course, would be to the credibility of your diet or mental savings account (see Thaler & Shefrin, 1981), and thus to your expectation of getting their objectives.
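Both properties can be checked numerically in a short sketch. The snippet below assumes the hyperbolic form V = A / (1 + kD) and, for contrast, an exponential form V = A·exp(−rD). The amounts, the lag, the spacing between choices, the bundle size, and the discount parameters are all hypothetical values picked so that the single SS reward is preferred when it is immediate; the function names are likewise illustrative.

```python
import math


def hyperbolic(amount, delay, k=1.0):
    """Assumed hyperbolic form: V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay)


def exponential(amount, delay, r=0.5):
    """Conventional exponential form: V = A * exp(-r * D)."""
    return amount * math.exp(-r * delay)


def bundle_value(discount, amount, first_delay, n=10, period=5.0):
    """Summed present value of n equal rewards spaced `period` apart."""
    return sum(discount(amount, first_delay + i * period) for i in range(n))


SS_AMOUNT, LL_AMOUNT, LAG = 50.0, 100.0, 1.5  # illustrative numbers only

# (1) A single immediate SS reward beats its LL alternative, yet the
# bundle of LL rewards beats the bundle of SS rewards.
single_ss = hyperbolic(SS_AMOUNT, 0.0)
single_ll = hyperbolic(LL_AMOUNT, LAG)
bundle_ss = bundle_value(hyperbolic, SS_AMOUNT, 0.0)
bundle_ll = bundle_value(hyperbolic, LL_AMOUNT, LAG)

# With exponential discounting, bundling leaves the SS preference intact.
exp_bundle_ss = bundle_value(exponential, SS_AMOUNT, 0.0)
exp_bundle_ll = bundle_value(exponential, LL_AMOUNT, LAG)

# (2) Taking SS now and LL ever after is worth more than the all-LL
# bundle, which is why bundling alone cannot enforce the rule.
exception_plan = single_ss + (bundle_ll - single_ll)

print("single: SS > LL", single_ss > single_ll)
print("hyperbolic bundle: LL > SS", bundle_ll > bundle_ss)
print("exponential bundle: SS > LL", exp_bundle_ss > exp_bundle_ll)
print("exception plan > all-LL bundle", exception_plan > bundle_ll)
```

The last comparison holds whenever the current SS reward is worth more than the current LL reward, whatever the other numbers are, which is the gap that the credibility of a personal rule has to close.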
The incentives created by hyperbolic discount curves face you with an intertemporal variant of repeated prisoner's dilemma, with the result that interpretation of your current choice as a test case – as a cooperation or defection – often has more motivational consequence than the outcomes literally at stake (Ainslie, 1992, 2005, 2012).[Footnote 12] This intertemporal bargaining centers on the conflict between valuing the present option just for itself versus also valuing the present choice as evidence for how you will choose in a bundle of similar future choices. It does not matter that the negative effects of some habits, such as smoking, do not come repeatedly and soon after the positive ones, hangover fashion, but only in the far future (as Rick & Loewenstein [2008] have objected). The prospect of future health still forms a stake that is at risk in every choice that the person sees as evidence of her pattern of future choices. Importantly, however, the terms of intertemporal bargains remain fluid, so she can propose changes at the moment of choice – say to allow a cigarette on her birthday – as long as she can distinguish her proposed exceptions from excuses that would be too common (see a discussion of bright lines, Ainslie, 2001, pp. 94–100).
The test cases in such recursive self-prediction may be defined by explicit self-enforcing contracts (see Telser, Reference Telser1980) – personal rules for what choices would be lapses; or they may just emerge from your vague awareness that you are apt to go on doing what you see yourself do this time. You might conceive the stake in such implicit contracts to be self-esteem, good character, pride in grit, a good relationship with God, the approval of a dead relative, or even the obedience to social instructions – to the extent that those no longer carry external sanctions. But the functionality of such concepts is to be a stake against impulses. For instance, the once-common device of oaths fits this description (Ainslie, Reference Ainslie1975, p. 483; citing Lewis, Reference Lewis1838, pp. 4–9). Thus, recursive self-prediction may take a form that is displaced away from any explicit self-knowledge, further muddying the already controversial definition of “metacognition” (Beran, Perner, & Proust, Reference Beran, Perner and Proust2012; Carruthers, Reference Carruthers2009).
The perceived implications of a given kind of test case are apt to grow with experience, making it part of a web of negotiations between impulses and resolutions that may either reduce or magnify a case's effect. For instance, if a student tries to speak up in class but chickens out, this may seem a minor failure among more important issues. Alternatively, it may bode poorly not only for future speaking attempts, but also for facing her shyness about school in general, or her fear of strangers, or still wider fears. She may notice an incentive not to try again, so as not to put her courage in other situations at risk. She is then apt to identify a boundary to her self-testing – “I can't talk in class” – that describes a circumscribed trait or symptom about which her resolve no longer has any credibility that she can put at stake (see Ainslie, Reference Ainslie2001, pp. 148–149). As a person monitors her attempts to control impulses with recursive self-prediction, she creates a history of successful and failed commitments that entangle her. Her cumulative commitments and failures of commitments are precedents that make her rigid in much the way old economies or bureaucracies become rigid (Olson, Reference Olson1982; see Ainslie, Reference Ainslie2015). Where the impulsive reward itself grows stronger, as after repeated use of some addictive substances (Volkow et al., Reference Volkow, Wang, Fowler, Tomasi, Telang and Baler2010), this encapsulation effect will be especially hard to overcome.
The process of recursive self-prediction that underpins resolve is observable in common experience (see sect. 3.2.5), but has been little discussed. It has been hiding in plain sight, just as the game of prisoner's dilemma itself hid until described in so many words in 1950 (Poundstone, Reference Poundstone1992). Nevertheless, many motivational scientists besides behavioral reward theorists have adopted models that are compatible with recursive self-prediction, and a few include this mechanism explicitly.
3.2.2 Behavioral economics
Most economists' interest in willpower has extended beyond models of suppression, beginning with Laibson's golden goose (Reference Laibson1997) and the game-theoretic model of O'Donoghue and Rabin (Reference O'Donoghue and Rabin1999). The “motivated choice bracketing” of Read et al. (Reference Read, Loewenstein and Rabin1999) is a restatement of the principle that choices are apt to be more patient if made between whole categories of outcomes (“bracketed broadly”) than between single pairs (“bracketed narrowly”). An agent may construct goals (Hsiaw, Reference Hsiaw2013) or reference points (Kőszegi & Rabin, Reference Kőszegi and Rabin2009) that represent expectations about her future choices, and that constrain future behavior by the threat of disappointing these expectations. These concepts move toward an enforcement principle for broad bracketing, because their agents are aware, if “sophisticated,” that larger categories of prospective rewards depend on current choices. Read and colleagues actually mention the test-case contingency in their review of possible mechanisms (Reference Read, Loewenstein and Rabin1999, p. 191). Early in economists' discussions of willpower, Bénabou and Tirole accepted the self-enforcing contract model and most of its implications (Reference Bénabou and Tirole2004).Footnote 13 Theirs has been the most complete expression of the recursive self-prediction model in terms of economics, short of wholesale adoption (as in Ross, Sharp, Vuchinich, & Spurrett, Reference Ross, Sharp, Vuchinich and Spurrett2008, pp. 62–75).
3.2.3 Social psychology
In addition to suppression models, there is a vast social psychology literature on willpower, often in other terms – “willpower” was thought until quite recently to be the stuff of self-help books, not a scientific concept. Where willpower has been discussed again, it is mostly described as an “executive function,” strengthened by mental exercises in working memory and inhibitory control, adequate sleep, and mindfulness training (Hofmann & Kotabe, Reference Hofmann and Kotabe2012). Most authors have proposed no mechanisms beyond simple intention, but they sometimes suggest elements that are components of recursive self-prediction. For instance, integrative self-control theory depicts “iterative reprocessing” of valuations when the opportunity for an impulse choice is close, but not the role of the choice as a test case (Kotabe & Hofmann, Reference Kotabe and Hofmann2015). The health belief model includes perceived probability of success as itself a factor in the success of an intention (Brewer & Rimer, Reference Brewer, Rimer, Glanz, Rimer and Viswanath2015); this dynamic was proposed in general terms in Bandura's concept of “self-efficacy” (Reference Bandura1986). The “active self-regulation” of temporal self-regulation theory demands both “inhibition of pre-potent responses” and “enough behavioral precedent” (Hall & Fong, Reference Hall and Fong2015) – arguably suppression and recursive self-prediction, respectively.
One group of social psychologists developed models that imply, more or less specifically, alliance of the current LL option with a set of other LL options. Trope and Liberman depict a person's viewpoint at greater psychological distances as a higher level of “construal,” more abstract and conducive to impulse control (Reference Trope and Liberman2010). “Abstract” implies categorical or more inclusive, and thus a counting together of more examples. Gollwitzer's “implementation intentions” involve simply declaring an if–then intention, the specificity of which creates a tendency to be followed: “The strategic automaticity created by implementation intentions should free cognitive capacity … behavior is directly controlled by situational cues” (Gollwitzer, Fujita, & Oettingen, Reference Gollwitzer, Fujita, Oettingen and Baumeister2004, p. 213; research reviewed in Gollwitzer & Sheeran, Reference Gollwitzer and Sheeran2006). The authors say that the declaration creates an automatic connection – something like a micro-habit – which seems to stand outside of motivation (sect. 3.3.2); but it could be argued that the specificity focuses resolve, so that the choice is liable to be evaluated as a test case. Certainly such a process occurs when subjects are induced to reconstrue a laboratory temptation task as a “test of willpower,” which is reported to increase their patience (Magen & Gross, Reference Magen and Gross2007). Fujita describes recursive self-prediction in so many words:
[A key] factor appears to be whether people identify a behavior as a unique singular act or representative of a broader pattern… When people focus on what is idiosyncratic and distinct about a situation rather than how that situation is similar to and related to others, they are less likely to consider the broader implications of their actions. As a result, they do not code their behavior as a self-control failure… If instead, people understand their behavior in terms of a broader pattern, they are more likely to understand that their behavior represents a self-control failure… (Fujita, Reference Fujita2011, p. 360).
3.2.4 Philosophy
Philosophers have dealt with akrasia since ancient times (sect. 3.2). Much of this discussion has revolved around how an agent can be, or can seem, divided (well critiqued in Stroud & Svirsky, Reference Stroud, Svirsky, Allen, Nodelman and Zalta2019). Dealing with impulse control specifically, the most frequent interpretation invokes Watson's concept of intending not to reconsider what a rational decision-maker “in a cool and non-deceptive moment – articulates as definitive of good, fulfilling, and defensible life” (Watson, Reference Watson and Watson2004, p. 25; for instance, Bratman, Reference Bratman1999; McClennen, Reference McClennen, Montero and White2007). However, in Richard Holton's view resolution operates through a sophisticated form of suppression, really precommitment: If you know the pathways by which your revaluation makes comparisons, you should flag dangerous pathways and avoid them early, much as Ignatius Loyola said you should avoid imagining sinful acts (Holton, Reference Holton, Stroud and Tappolet2003, Reference Holton2009, p. 421; cf. Duckworth et al., Reference Duckworth, Gendler and Gross2016). Some authors would include a power to reconsider such resolutions during temptation: Peterson and Vallentyne would allow reconsideration on the basis of rational rules:
Rational resoluteness … is a kind of conditional resoluteness … the disposition to comply with adopted plans when (1) it was rationally permissible to adopt the plan at the time of adoption, and (2) the agent has acquired no new unanticipated information that, if available to the agent at the time of the plan's adoption, would have undermined the rationality of adopting that plan (Ferrero, Reference Ferrero2010; Peterson & Vallentyne, Reference Peterson and Vallentyne2018 argue similarly).
Bratman has revised his earlier advice of non-reconsideration (Reference Bratman1999) to allow resoluteness subject to redefinition – or rationalization – constrained by the fear of regret (Reference Bratman2014), which might include perceived threat to one's ability to use resolve.
Some philosophical writing has depicted impulse control as a procedure rather than a logical judgment. When dealing with the practicality of “synchronic” (simultaneous) self-control, authors have categorized proposed methods as actional and non-actional. Actional methods include such descriptors as blocking, direct inhibition, and distancing, roughly the category of suppression (Sripada, Reference Sripada2014). Non-actional methods entail cognitively re-framing the categories of outcomes that could motivate resolve, for instance, mentally grouping a beckoning temptation with threats rather than with pleasures (Kennett & Smith, Reference Kennett and Smith1997). But as Sripada has pointed out, neither method by itself contains the motivation to be initiated while under the influence of the temptation (Reference Sripada2014). Generally, this motivation has been supposed to be something like “rational pressure in favor of constancy” (Bratman, Reference Bratman2017), which might imply seeing rational rules for self-control themselves to be at stake. Many philosophers have considered a recursive self-prediction model, and some have found their mechanisms compatible with it (e.g., Elster, Reference Elster2015, pp. 270–281; Hanson, Reference Hanson2009, pp. 13–73; McClennen, Reference McClennen and Verbeek2016Footnote 14; Mele, Reference Mele1996; Ross, Reference Ross, Spurrett, Ross, Kincaid and Stephens2007).
3.2.5 Evidence for recursive self-prediction
Authors still complain, “we are unaware of recent empirical research on personal rules as a self-control strategy for students” (Duckworth, Taxer, Eskreis-Winkler, Galla, & Gross, Reference Duckworth, Taxer, Eskreis-Winkler, Galla and Gross2019) – or indeed for anyone. Recursive self-observation is always going to be a challenge for the laboratory, although familiar in common experience.Footnote 15 Where momentary self-prediction is touch-and-go, it will be hard for an outside observer to record. Russell provides an example:
I suspect that I may be getting seasick so I follow someone's advice to “keep your eyes on the horizon” … The effort to look at the horizon will fail if it amounts to a token made in a spirit of desperation … I must look at it in the way one would for reasons other than those of getting over nausea … not with the despair of “I must look at the horizon or else I shall be sick!” To become well I must pretend I am well (Reference Russell1978, pp. 27–28).
Many marginally voluntary processes are modulated recursively by self-observation. Anger, panic, nausea, sleep (in insomniacs), urination (in men with prostatic hypertrophy), and even recalling an elusive memory are promoted by signs that they are already happening, a phenomenon first described by Darwin, James, and Lange. Where the problematic urge is subject to deliberate control, as with a temptation to waste money or take drugs, obeying it is apt to be accompanied by an awareness that “I must expect to go on choosing this,” which may recruit enough motivation to reverse the choice. Experimental examples of this reversal when choosing between bundles have been reported, but the results with short series using relatively small rewards have not been dramatic (Hofmeyr, Ainslie, Charlton, & Ross, Reference Hofmeyr, Ainslie, Charlton and Ross2010; Kirby & Guastello, Reference Kirby and Guastello2001).
The dependence of resolve on a stake of self-expectations is most obvious in the case of relapse into addiction. When someone gives in after a period of successfully resisting temptations, she experiences a sudden, dramatic fall in her perceived ability to resist the next ones, an experience that has been called the abstinence violation effect (for alcoholics, see Curry, Marlatt, & Gordon, Reference Curry, Marlatt and Gordon1987; for dieters, see Polivy & Herman, Reference Polivy and Herman1985; for binge eaters, see Grilo & Shiffman, Reference Grilo and Shiffman1994; for child molesters, see Hudson, Ward, & France, Reference Hudson, Ward and France1992; for smokers, see Shiffman et al., Reference Shiffman, Hickcox, Paty, Gnys, Kassel and Richards1997). True, recovering alcoholics have long believed that they have a biological susceptibility that causes a single drink to lead to irresistible craving; but it has been shown experimentally that it is the belief that they have had a drink of alcohol, not the alcohol itself, that is followed by craving (Maisto, Lauerman, & Adesso, Reference Maisto, Lauerman and Adesso1977).
The reader can verify the specific role of self-expectation by thought experiments such as Monterosso's problemFootnote 16:
Consider a smoker who is trying to quit, but who craves a cigarette. Suppose that an angel whispers to her that, regardless of whether or not she smokes the desired cigarette, she is destined to smoke a pack a day from tomorrow on. Given this certainty, she would have no incentive to turn down the cigarette – the effort would seem pointless. What if the angel whispers instead that she is destined never to smoke again after today, regardless of her current choice? Here, too, there seems to be little incentive to turn down the cigarette – it would be harmless. Fixing future smoking choices in either direction (or anywhere in between) evidently makes smoking the dominant current choice. Only if future smoking is in doubt does a current abstention seem worth the effort. But the importance of her current choice cannot come from any physical consequences for future choices; hence the conclusion that it matters as a precedent (Monterosso & Ainslie, Reference Monterosso and Ainslie1999).
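The logic of this thought experiment can be put in a few lines of arithmetic. The following is a minimal sketch under stated assumptions, not a model taken from Monterosso and Ainslie: the current cigarette has some immediate value, the prospect of an abstinent future is worth more than the prospect of a smoking future, and the only thing the current choice can change is the smoker's estimate of the probability that she will abstain in future. All names and numbers are hypothetical.

```python
# Illustrative sketch of Monterosso's problem: the current choice matters only
# as evidence about future choices. All values are hypothetical.

def choice_values(smoke_now, future_if_abstinent, future_if_smoking,
                  p_abstain_after_smoking, p_abstain_after_abstaining):
    """Return the prospective value of smoking vs. abstaining right now.

    p_abstain_after_*: the smoker's estimated probability of future
    abstinence, given what she sees herself do in the current choice.
    """
    def expected_future(p_abstain):
        return (p_abstain * future_if_abstinent
                + (1.0 - p_abstain) * future_if_smoking)

    return {
        "smoke": smoke_now + expected_future(p_abstain_after_smoking),
        "abstain": expected_future(p_abstain_after_abstaining),
    }

def preferred(values):
    """Name of the option with the higher prospective value."""
    return max(values, key=values.get)

if __name__ == "__main__":
    # The angel fixes the future: the estimate is the same after either choice.
    for fixed_p in (0.0, 0.5, 1.0):
        print(fixed_p, preferred(choice_values(3.0, 10.0, 0.0, fixed_p, fixed_p)))
    # The future is in doubt: the current choice is read as a precedent.
    print(preferred(choice_values(3.0, 10.0, 0.0, 0.2, 0.8)))
```

When the two probabilities are equal, the expected future cancels out of the comparison and smoking wins by exactly the value of the cigarette, whatever the fixed probability is. Abstaining can win only when the current choice shifts the estimate, that is, when the gap between the two probabilities multiplied by the stake exceeds the value of the cigarette.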
The difficulty of observing recursive self-prediction experimentally is illustrated by a recent attempt in an economics laboratory: Two hundred adult subjects performed long, boring tasks on two successive occasions a week apart. Two days before each task they were asked to say how much of it they intended to perform, and to guess how much they actually would perform. Before the first task the subjects were asked how much they expected to correct their intentions/estimates for the second task between tasks, expectations that would show an awareness of “the autocorrelation of intertemporal decisions” (Yaouanq & Schwardmann, Reference Yaouanq and Schwardmann2019). They showed little of this awareness. However, subjects were not told to control themselves, so they probably did not see their reported intentions as resolutions. In any case, this has been the only experiment so far to try making self-prediction externally visible.
3.2.6 Resolve may entail a different kind of effort
Effort is the operational expense of impulse control. Resolve becomes effortless to the extent that you are confident of maintaining it. Even with temptations that arouse an appetite, the unambiguous belief that you will never give in can make impulse control easy. In natural experiments, Dar and colleagues have found that Orthodox Jews who never smoke on the Sabbath and flight attendants who never smoke during flights have no urge to smoke during those times, while still having strong urges at other times (Dar, Stronguin, Marouani, Krupsky, & Frenk, Reference Dar, Stronguin, Marouani, Krupsky and Frenk2005; Dar, Rosen-Korakin, Shapira, Gottlieb, & Frenk, Reference Dar, Rosen-Korakin, Shapira, Gottlieb and Frenk2010). Such examples elucidate willpower-as-resolve – the perception of incentives that commit you, even when, as with the religious, you choose and maintain the incentive structure yourself. People high in the self-reported trait of self-control have reported fewer problematic desires in their everyday lives, and they make conscious use of self-control less often (Tangney, Baumeister, & Boone, Reference Tangney, Baumeister and Boone2004).
On the other hand, marginally permissible temptations create an operational cost for resolve – the stress of managing the risk to a broader category of expected reward implied by a current choice. This stress occurs to the extent that the membership of your current choice in a bundle of SS/LL choices is open to doubt – that is, where the SS option in a current choice is a somewhat credible exception to your rule. In that case you face a legalistic task, the cost of which is not only the attention demanded by the required argument but also facing the danger that you may lower your prospect of a bundle of LL reward, if you claim an exception and later find that you have fooled yourself. Then you are at risk of an abstinence violation effect, or perhaps just a lower prospect of getting your long-term reward. This loss may provoke regret or guilt. Therefore, negotiation with competing options is sometimes also called an effort – for instance, in William James' famous discussion of a drunkard's excuses for drinking: “The effort by which he succeeds in keeping the right name [‘being a drunkard’] unwaveringly present to his mind proves to be his saving moral act” (Reference James1890, p. 565, his emphasis; see also Hockey, Reference Hockey and Ackerman2011, pp. 174–177).
We have many SS/LL conflicts that do not rise to awareness during the average day. “The lion's share of our everyday desires does not stand in conflict with our values and self-regulatory goals” (Hofmann & Kotabe, Reference Hofmann and Kotabe2012). However, even prosaic choices often have conflictual histories. The truce lines of old battles (sect. 3.2.1) become unremarkable, even when large incentives are at stake – life savings, the risk of cancer, beliefs about personal identity (Berkman, Livingston, & Kahn, Reference Berkman, Livingston and Kahn2017). Hundreds of small intertemporal conflicts are similarly avoided mindlessly: Someone may variously wait until after dinner to eat dessert, do the more boring of two tasks first, put on a condom, pick up a fallen object as soon as it falls, and make other categorical responses that were once formed to combat the pervasive incentive to procrastinate. Their status as intertemporal bargains is evidenced only by the unease that comes from not performing them, which can be attributed, in turn, to the asymmetrical damage done to prospective impulse control.Footnote 17 Similarly, Fujita points out that with successful reconstruals and implementation intentions, “no temptation impulse should be experienced” (Reference Fujita2011, p. 359). Significantly, he alludes at several points to self-control occurring “without conscious deliberation” (p. 355). These patterns sound like habits, a word that is coming back into vogue.
3.3 Habit is an outcome, not a mechanism
“Habit” has been put forward in the recent self-control literature as the most successful impulse control strategy (Carden & Wood, Reference Carden and Wood2018; Gillebaart & Adriaanse, 2017; Neal, Wood, & Drolet, Reference Neal, Wood and Drolet2013). However, this usage is misleading. To discuss the role of habit in self-control, we should first distinguish three kinds: routine habits, good habits, and bad habits.
3.3.1 Routine habits
These are subroutines that you learn for navigating familiar paths to reward with a minimum of attention. Repeatedly rewarded behaviors get more and more efficient and require less and less attention. We use many of these to get dressed and drive to work while thinking of something else. Engagement in a habit is accompanied by a shift of neural activity in midbrain striatal areas from “planning” or “voluntary” to “habitual” systems, which has been suggested to imply a committing effect (Everitt & Robbins, Reference Everitt and Robbins2013). A similar shift has been described from “goal-directed” or “model-based” to “model-free” systems (Voon et al., Reference Voon, Derbyshire, Rück, Irvine, Worbe, Enander and Bullmore2015).Footnote 18 However, the habitual or model-free system does not hold the process of choice captive. Brain imaging shows flexible transitions between these processes (Gershman, Markman, & Otto, Reference Gershman, Markman and Otto2014; Kool, Cushman, & Gershman, Reference Kool, Cushman, Gershman, Morris, Bornstein and Shenhav2018; Otto, Gershman, Markman, & Daw, Reference Otto, Gershman, Markman and Daw2013), and there is electroencephalography (EEG) evidence that these systems stay in operation simultaneously (Sambrook, Hardwick, Wills, & Goslin, Reference Sambrook, Hardwick, Wills and Goslin2018). Most importantly, multiple attempts to make human subjects resistant to new learning through sheer repetition have overwhelmingly failed (de Wit et al., Reference de Wit, Kindt, Knot, Verhoeven, Robbins, Gasull-Camos and Gillan2018). In normal subjects, any contrary incentive restores the model-based system – you can easily put on clothes in a different order or take a different route to work if you just pay attention. Although routinely habitual behaviors are sometimes called automatic or robotic, “mindless” would better characterize their persistence without having momentum.
Some authors have proposed that brain damage from addiction may make routine habit resistant to change, thus preserving it as an explanation for why addictions persist in the face of contrary incentives (e.g., Everitt & Robbins, Reference Everitt and Robbins2005, Reference Everitt and Robbins2013). In making frequent choices to get small amounts of money in the laboratory, addicts have been observed to show more model-free behavior than non-addicts (Voon et al., Reference Voon, Derbyshire, Rück, Irvine, Worbe, Enander and Bullmore2015). However, this difference has been small, as it has been even in patients with gross lesions in the brain centers active during choice (Fellows & Farah, Reference Fellows and Farah2005). A recent review of the literature about inflexible (“stimulus-bound”) habit in humans found small increases in subjects with several kinds of psychopathology, but could not distinguish in those subjects between “excessive habit formation [and] weak goal-directed control” (Watson & de Wit, Reference Watson and de Wit2018, p. 35).Footnote 19 We might wonder whether slightly decreased flexibility of choice between small rewards in the laboratory reflects inability to weigh the major consequences of addiction.
3.3.2 Good habits
These are those behavior patterns preserved by resolve – keeping a diary every night, jogging every day, or getting out of bed when the clock radio plays a certain theme every morning. The resolve need not be deliberate, perhaps just a sense that you won't go on making a particular choice if you don't do it this time. You can tell that a habit is good rather than routine when a very few choices in the contrary direction are sufficient to change it. Because of this, you sense that you need an excuse to skip it on a particular day, lest it be harder to begin again. Accordingly, you feel a rush of pleasure when an external circumstance prevents you from doing it today. This rush of pleasure is evidence that the habit is not something you simply prefer; nevertheless, abandoning or “breaking” the habit feels like a loss. Of course, when you do not expect much benefit from the habit, the pleasure or the loss will be small. The habits that subjects choose casually for an experiment do not elicit the amounts of differential motivation at play in addictions or tests of character. Habits such as always drinking a bottle of water with lunch or eating fruit (as in Lally, Van Jaarsveld, Potts, & Wardle, Reference Lally, Van Jaarsveld, Potts and Wardle2010) shade into routine habits such as always dressing in a particular order or taking a particular route to work, the benefit being just not having to stop and choose. Sobriety may be a routine habit for someone who is not tempted to drink too much, but a good habit of great significance for a recovering alcoholic.
Some authors lump even good habits of great consequence together with routine habits. They point out that a person who habitually resists a temptation in a particular circumstance stops feeling tempted there – never thinks of smoking during a flight, for instance. Therefore, highly credible resolve does engender a routine habit of sorts, to avoid considering rewards that will never happen. But this habit will persist routinely only as long as the tempting reward indeed does not happen.Footnote 20 The important question is how such abstention is achieved to begin with.
In a recent review, Duckworth and colleagues comment, “the conceptual parallels between plans, personal rules, and habits may belie antagonistic underlying processes” (Reference Duckworth, Taxer, Eskreis-Winkler, Galla and Gross2019). However, the argument I have presented is that good habits require intertemporal bargains – the motivation for an LL choice in their specific context by fear of breaking up the pattern of LL choices on which a larger reward is seen to depend. “If I don't study (or go running, or …) at eight o'clock today, I'll be less likely to do it tomorrow.” Admittedly, when this logic has prevailed for some time a person will stop going through it, and behave mindlessly. It can even look as though “action control is transferred to environmental stimuli” (Lally, Wardle, & Gardner, Reference Lally, Wardle and Gardner2011). But the crucial factor in a good habit is not the frequency of repetitions but the infrequency of lapses – instances of non-performance without an excuse. The notion of excuses is meaningless for routine habits, but is at the very heart of the intertemporal bargain in good habits. If I don't go running when it's stormy, or don't study when I have to supervise my sister, the strength of my good habit shouldn't be affected. But if I just don't feel like doing it, or reach too far for excuses, my motivation will soon come down to the whim of the day, even if I've run or studied a great number of times before. When I have lost the protection of confident resolve – perhaps experienced as “ingrained” habit – I will pass into a middle ground: Impulse control now takes effort (Galla & Duckworth, Reference Galla and Duckworth2015), in the sense either of tenuous intertemporal bargaining or increased use of suppression or both in tandem. Or I may abandon the good habit altogether.
The asymmetrical vulnerability of good habits to lapses has long been known. “Every gain on the wrong side undoes the effect of many conquests on the right” (Bain, Reference Bain1859/1886, p. 440). To extinguish your weighing of alternatives you have to choose consistently over many trials, or, rarely, discover a radically new way of evaluating your rewards – reported sometimes by addicts who quit overnight (Heyman, Reference Heyman2009; Miller & C'de Baca, Reference Miller and C'de Baca2001; Premack, Reference Premack and Hunt1970, p. 115). Before a good habit starts to feel routine, there is usually a long period where temptations arise but are deterred by a recognition that they are test cases – that is, by resolve. Therefore, the good habits that have been recently proposed as an effortless alternative to willpower (Carden & Wood, Reference Carden and Wood2018; Duckworth et al., Reference Duckworth, Taxer, Eskreis-Winkler, Galla and Gross2019; Gillebaart & Adriaanse, 2017; Neal et al., Reference Neal, Wood and Drolet2013) are actually a form of willpower, and are effortless only when unchallenged – either by an unusually strong temptation or by ordinary temptations that come with middling-good excuses.
3.3.3 Bad habits
These are just impulsive behaviors that occur repeatedly. Although someone may call an activity that she actually prefers a bad habit – cracking her knuckles or putting her feet on the furniture, or even drinking too much or smoking – the term has motivational meaning only when she would prefer at a distance to avoid the behavior. She may never have tried to control it, or may have come to terms long ago with failing to do so. However, a new failure may endanger her resolve in other areas, as described in section 3.2.1. This risk is apt to deter attempts at breaking bad habits. Too many failures may snowball into lost credibility for almost any resolve, as in some cases of addiction, a bankruptcy that in combination with the cumulative dopaminergic potentiation of addictive reward (Volkow et al., Reference Volkow, Wang, Fowler, Tomasi, Telang and Baler2010) might fairly be called a disease (discussed in Ainslie, Reference Ainslie, Poland and Graham2011).
3.4 The functional relationship of resolve, suppression, and habit
In earlier writings where I described recursive self-prediction and its consequent intertemporal bargaining, I imagined resolve to be synonymous with willpower (Ainslie, Reference Ainslie1975, Reference Ainslie1992, Reference Ainslie2001, Reference Ainslie2017), so I made no attempt to relate it to suppression or habit. Recent proposals about habit and recent reports of brain imaging have suggested a way to integrate the three phenomena. Essentially, habit reflects bargains between impulses and resolutions that are no longer contested, and suppression is not only an ad hoc device but also a tool to help implement resolve. That is, resolve and suppression are symbiotic, in that suppression has only local effect without resolve, and the implementation of resolutions can be augmented against momentary urges by suppression. The only one of these strategies that is intrinsically effortful is suppression, but intertemporal bargaining may sometimes become effortful either by costing a great deal of attention or by evoking fear for your larger expectations of self-control. What brain imaging has been done on willpower is consistent with this view, and to some extent actually suggests it.
4. Evidence from brain imaging
Some aspects of impulse control have become visible to functional magnetic resonance imaging (fMRI) and EEG in humans, and to microelectrode recording in primates. The hyperbolic shape of the underlying delay discount curve seems to be well supported by fMRI of reward areas, not just when subjects choose money at delays of weeks (Kable & Glimcher, 2007), but also when they choose small amounts of money at delays of seconds – periods so short as to suggest the prizes are not just secondary rewards but primary, game-created prizes (Wittman et al., 2010).
A subject's awareness of SS/LL choice seems to induce reduction of relative SS value even when no outcome depends on it. At least, young American adults choose LL rewards more than would be expected from activity observed in brain reward centers when the same outcomes are anticipated singly (Luo, Giragosian, Ainslie, & Monterosso, 2009). This finding suggests a readiness to counter impulsiveness in the presence of intertemporal contingencies per se, but does not reveal a mechanism.Footnote 21 Actual trials of willpower evoke suppression, as was pointed out above (sect. 3.1). They are attended by increased activity in particular centers, especially in the dorsolateral prefrontal cortex (dlPFC; Figner et al., 2010; Hall & Fong, 2015; Kober et al., 2010; Luo, Ainslie, Pollini, Giragosian, & Monterosso, 2012). In a primate study minutely monitoring attention during a food-getting task, dlPFC activity was observed to accompany suppression of distracting stimuli (Suzuki & Gottlieb, 2013). In humans, transcranial magnetic stimulation of the dlPFC in real time increases LL choice (Cho et al., 2010), and its disruption increases SS choice (Figner et al., 2010). The observation that subjects' valuations of the alternatives stayed the same during the procedures in the latter two studies implies that dlPFC activity need not change valuations to be effective (but see Hare, Camerer, & Rangel, 2009); rather, a direct self-control process may be occurring (see Scheres, De Water, & Mies, 2013).
In humans, EEG that allows tracking over milliseconds has shown two specific steps in suppression: A food-temptation experiment shows LL choices to begin with “attention filtering,” followed, still within half a second, by “value modulation” – suppression of reward center activity – both of which are moderated by the dlPFC as located electronically by distributed Bayesian source reconstruction (Harris, Hare, & Rangel, 2013). The short latency of both kinds of responses from the presentation of the options indicates that they are part of the decision itself. A step-by-step description of a subject's choice would thus be: (1) intention to exert control at a given moment, then (2) filtering attention, (3) inhibiting appetite, and (4) behavioral response.
Moving beyond mere localization, it is now possible to detect the functional connectivity of the dlPFC with reward-related centers as subjects resist temptations in real time. dlPFC activity is accompanied by reduction of activity in the ventromedial (vmPFC) and orbital PFCs (Hare et al., 2009; Hare, Hakimi, & Rangel, 2014; Lim et al., 2016; Monterosso & Luo, 2010). In a recent example of smokers who were trying to quit, only those whose brains showed connectivity between the dlPFC and the insula during an actual chance to smoke were able to resist it (Zelle, Gates, Fiez, Sayette, & Wilson, 2017). Clinically minded experimenters have even begun to use a newly developed biofeedback technique based on fMRI to teach increased functional connectivity between the dlPFC and vmPFC (Spetter et al., 2017); they report that it reduces high-calorie food choices.Footnote 22
Because resolve is a matter of framing and monitoring choices, it might not be accompanied by measurable brain activity any more than other semantic content is. However, to the extent that resolve permits a given amount of LL choice to be made with less suppression, its operation should be reflected in reduced activity in the dlPFC and other centers that filter attention or inhibit appetite. Certainly, such a reduction occurs with physical commitment to LL choice: Male subjects who could choose higher-valued erotic images after delays of up to 10 seconds versus less-valued images immediately, in one condition could choose to commit themselves to wait, and in another condition had both options continuously open (Crockett, Braams, Clark, Tobler, & Robbins, 2013). Counting only the trials that resulted in LL choice, the authors found less dlPFC activity both while a subject chose commitment and afterward.
Another finding from the same experiment points to where active choice of impulse control may be observable: Subjects showed increased activity in the frontal cortical pole specifically while they were choosing the commitment option. In a similar temptation experiment, stimulation of the frontal pole by transcranial direct current (tDCS) increased subjects' choice of the commitment option, while having no effect on choice rates when uncommitted (Soutschek et al., 2017). The frontal pole has been implicated in the highest levels of abstraction (Smith, Monterosso, Wakslak, Bechara, & Read, 2018). The foregoing experiments suggest that it is active in planning impulse control but not in suppression, and thus might be a candidate for formulating and monitoring the intertemporal bargains that form resolve. Scenarios created in episodic memory areas might also serve this function. They are widely reported to be involved in counteracting the overvaluation of the near future (Benoit, Gilbert, & Burgess, 2011; Bulley, Henry, & Suddendorf, 2016; Peters & Büchel, 2010; Schuck, Cai, Wilson, & Niv, 2016). The question that needs follow-up is whether internal commitment by intertemporal bargains has the same reducing effect on dlPFC activity as external commitment has.
Some recent experiments are steps in this direction. In intertemporal choices of cash, reframing subjects' options just by showing each zero-paying alternative reduced dlPFC activity during LL choice while also increasing occurrence of this choice (Magen, Kim, Dweck, Gross, & McClure, 2014). The authors called the frames that listed the zero-pay events “sequences,” even though the same two single outcomes were being compared. The authors' original concept was to evoke people's well-known preference for improving sequences of outcomes (Magen, Dweck, & Gross, 2008), but it seems more likely that the framing suggested abstract and perhaps budgetary decision bases. In any case, just listing the zero-pay outcomes has been confirmed to increase LL choice, to increase activity in “imagination centers,” and to decrease activity in the dlPFC and caudal ACC during LL choice (Jenkins & Hsu, 2017). These experiments tested just preference, not impulse control, but they suggest how re-framing can reduce the role of the dlPFC while increasing LL choice.
These results support separate roles for valuation and suppression in impulse control. Next, we need to look at brain imaging specifically during resolve: testing whether frontal pole and/or imagination center activity is high, and dlPFC activity low, during internal commitment, that is, during commitment by an intertemporal bargain. Such testing would first require comparison of stand-alone SS/LL reward choices versus actual bundles of these choices. If LL choice was greater in the bundle condition, we could then measure brain activity during an SS/LL choice that the subject was apt to see as a test case for a larger bundle, and compare it with activity in a condition where she would not take this view. Suggesting such a view with respect to arbitrary bundles, as in Kirby and Guastello (2001) and Hofmeyr et al. (2010), would probably again produce a small difference; but it would be difficult in the laboratory to call on a subject's real-life test cases, such as the moral and characterological choices envisioned by Bodner and Prelec (“self-signaling”; 2003). A creative experimenter might look for examples where an existing strongly held rule was time-dependent – not to smoke on the Sabbath or eat meat on a Friday – and measure a subject's PFC activity when confronted with temptations on the different days.
5. Conclusions, in evolutionary context
In human evolution, the influence of future expectations on current preference has been at least as great an advance as speech, tool use, or theory of mind, and it is ultimately a resource for all of those. Until the emergence of foresight, contingencies that were at all remote shaped behavior only by the natural selection of inborn instincts, for instance those that attached present reward to the necessary components of migrating, nesting, and reproducing. The role of foresight was limited by organisms' capacity to detect contingencies – associations of events – spanning more than seconds to minutes.Footnote 23 Bigger brains meant more foresight, but even the great apes still show signs of looking ahead for no more than a few hours, for instance in anticipating the use of a tool (Mulcahy & Call, 2006; Osvath, 2009).
It once seemed that long-term choice was simply a quantitative development: the evolution of more powerful predictive ability that could detect reward differentials when they were attenuated by longer delays. However, adaptation to increases of time scale turns out to need more than an increase of predictive power. As with so many evolutionary metrics – wing span, leg strength, heat dissipation – a vast increase in scale has introduced at least one qualitatively different problem. To the extent that an organism replaces instinctive preference with foresight, effective reward-getting demands consistent preference over time. The inherited process by which delayed prospects attract vertebrates' preferences does not itself produce this consistency. Data from a range of species show that the internal market value of a delayed prospect is discounted in inverse proportion to that delay – hyperbolically – as if this function had been simply copied from other psychophysical functions for assessing quantities such as weight, brightness, and temperature (Gibbon, 1977).
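The inconsistency that hyperbolic discounting produces can be illustrated numerically. The following is a minimal sketch, assuming the commonly used hyperbolic form V = A / (1 + kD) for the value V of a reward of amount A at delay D. The function names, the reward amounts, the lag between rewards, and the discount parameter k = 1 are hypothetical choices made only for illustration; they are not data from any study cited here.

```python
# Illustrative sketch only: all amounts, delays, and k are hypothetical.

def hyperbolic_value(amount: float, delay: float, k: float = 1.0) -> float:
    """Present value of a reward, assuming V = amount / (1 + k * delay)."""
    if delay < 0:
        raise ValueError("delay must be non-negative")
    return amount / (1.0 + k * delay)


# Hypothetical rewards: a smaller-sooner (SS) reward of 50 units and a
# larger-later (LL) reward of 100 units that arrives 5 time units after SS.
SS_AMOUNT = 50.0
LL_AMOUNT = 100.0
LL_LAG = 5.0


def preferred(delay_to_ss: float, k: float = 1.0) -> str:
    """Return which reward has the higher discounted value at this distance."""
    v_ss = hyperbolic_value(SS_AMOUNT, delay_to_ss, k)
    v_ll = hyperbolic_value(LL_AMOUNT, delay_to_ss + LL_LAG, k)
    return "LL" if v_ll > v_ss else "SS"


if __name__ == "__main__":
    # Walk toward the SS reward and report the preference at each distance.
    for delay in (10.0, 5.0, 3.0, 0.0):
        print(f"delay to SS = {delay:4.1f}: prefers {preferred(delay)}")
```

With these illustrative numbers the LL reward is valued more highly at a distance, but the SS reward overtakes it once it becomes imminent, which is the kind of temporary preference the text describes.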
Orthodox theory holds that hyperbolic discount functions are maladaptive on their face and thus should have been selected out in evolution. However, the fact remains that nonhuman animals regularly show preference for SS over LL rewards, temporarily. They are often motivated to suppress this imminent preference (as at arrow in Fig. 1B): A dog waiting for a fetch signal or a rat facing shock on the path to food can be seen straining against urges. A pigeon rewarded with grain for not pecking a key over a few seconds can be seen pecking at the wall next to the key or turning around during that time (Ainslie, 1974), behaviors similar to Mischel's 4-year-olds trying not to eat the marshmallow (Mischel & Ebbeson, 1970). These are clearly effortful behaviors, and even pigeons can learn to increase them up to a point (Ainslie, 1982), but such mechanisms do not offer even moderately long-term stability. They are easy to study in the laboratory and are reliably accompanied by dlPFC activity, making suppression the experimental paradigm of impulse control. But suppression is only one route to willpower.
Philosophical opinion from Aristotle on down has been that impulses are best managed when the current choice appears inseparable from a larger category of choices. Hyperbolic discount curves describe both temporary preferences (Fig. 1B) and the potential effectiveness of discerning test cases for series of similar choices (Fig. 2), the use of which is here argued to be resolve. The logic of intertemporal bargaining also determines how effortful resolve will be. When a person sees that a rule defines a clearly dominant strategy, choice should become regular and effort should not arise. But where the rule can be argued various ways, the resulting doubt and attempts to overcome it create a cost that could also be called effort, although of a different kind than that of suppression. A successful bargain will come to be experienced as an effortless habit, but habit is not itself a mechanism of consistency.
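The potential effect of treating a current choice as a test case for a series of similar choices can be sketched in the same way. The example below is a rough illustration, assuming the hyperbolic form V = A / (1 + kD) and assuming that the discounted values of the rewards in a bundle simply add. The function names, amounts, spacing between choices, and k = 1 are hypothetical and are not taken from any cited experiment.

```python
# Illustrative sketch only: additive bundling and all numbers are assumptions.

def hyperbolic_value(amount: float, delay: float, k: float = 1.0) -> float:
    """Present value of a reward, assuming V = amount / (1 + k * delay)."""
    return amount / (1.0 + k * delay)


def bundle_value(amount: float, first_delay: float, n_choices: int,
                 spacing: float, k: float = 1.0) -> float:
    """Summed present value of n_choices identical rewards, one per period."""
    return sum(
        hyperbolic_value(amount, first_delay + i * spacing, k)
        for i in range(n_choices)
    )


# Hypothetical series: each period offers SS = 50 units or LL = 100 units,
# with LL arriving 5 time units after SS, and periods spaced 10 units apart.
SS_AMOUNT = 50.0
LL_AMOUNT = 100.0
LL_LAG = 5.0
SPACING = 10.0


def preferred_bundle(delay_to_first_ss: float, n_choices: int,
                     k: float = 1.0) -> str:
    """Compare always choosing SS with always choosing LL over the series."""
    v_ss = bundle_value(SS_AMOUNT, delay_to_first_ss, n_choices, SPACING, k)
    v_ll = bundle_value(LL_AMOUNT, delay_to_first_ss + LL_LAG,
                        n_choices, SPACING, k)
    return "LL" if v_ll > v_ss else "SS"


if __name__ == "__main__":
    # Same moment of choice, evaluated alone and as a bundle of five.
    for n in (1, 5):
        print(f"choices bundled = {n}: prefers {preferred_bundle(2.0, n)}")
```

With these numbers, a single choice made 2 time units before the SS reward favors SS, whereas the same choice evaluated as the first of five bundled choices favors LL. The result depends on the assumed parameters; when the first SS reward is immediate (delay 0), for example, this particular bundle of five does not reverse the preference.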
Brain imaging is well adapted for tracking suppression, but is just starting to suggest processes accompanying resolve. In SS/LL choice experiments, subjects' choices of precommitment are accompanied by reduced dlPFC activity, but this has been studied only in the case of external precommitment. Resolve is hard to study in the laboratory, not least because it has implications for the whole web of an individual's intertemporal bargains. However, reports of frontal polar and default area activity during choice of precommitment suggest that these areas may also take part in resolve.
Acknowledgments
I thank John Monterosso, Jon Elster, and André Hofmeyr for their comments on earlier drafts of the text.
Financial support
This paper is the result of work supported with resources and the use of facilities at the Department of Veterans Affairs Medical Center, Coatesville, PA, USA. The opinions expressed are not those of the Department of Veterans Affairs or of the US Government.
Conflict of interest
I have no conflicts of interest.