In January 2018 the United States suspended security assistance to Pakistan “until the Pakistani Government takes decisive action” against the Afghan Taliban and the Haqqani Network whose members “continue to find sanctuary inside Pakistan as they plot to destabilize Afghanistan and also attack US and allied personnel” (Nauert 2018). Commenting on the mounting tensions, former ambassador to Afghanistan and Pakistan Ryan Crocker explained Pakistan’s perspective on US–Pakistan relations:
The Pakistanis have their own narrative about the relationship—that once the Soviets were defeated in Afghanistan at the end of the 1980s, we [the Pakistanis] went from being their most allied of allies [of the Americans] to their most sanctioned of adversaries. So they tend to be very, very defensive and very worried that the US will walk out on them again… To put it as briefly as I can, it’s—well, we’re glad you’re back, you Americans. We’re going to take what we can get as long as you’ll give it. But we know you’re not going to stay the course. So if you expect us to go in full throttle turning the Taliban into an enemy and then leave us with an existential threat, you’re nuts.Footnote 1
A crucial question at the heart of these statements is: To what extent can the United States induce Pakistan to act in ways that further American interests? Casting the question more broadly, what factors affect the ability of one state to influence the behavior of another often much weaker state?
Recent work on counter-insurgency, client states, foreign aid, and proxy wars has adopted a principal–agent framework to study this question.Footnote 2 The principal, often the United States, is typically trying to induce an agent to take steps to mitigate a threat to the principal. Biddle, MacDonald, and Baker, for example, describe “security force assistance” in which the United States provides training, advising, and equipment to allied militaries as a “classic” principal–agent problem. “[T]he United States is the principal, the ally receiving the aid is the agent, and the principal’s aim is to meet a threat to American security more cheaply than by sending a large US ground force to do the job directly” (2017, 128). More generally, Padró-i-Miquel and Yared (2012) and Berman and Lake (2019) use a principal–agent approach to study the political strategy of indirect control, i.e., “getting local leaders to act in sometimes costly ways” to advance another state’s interests (Berman et al. 2019, 3).
The principal–agent framework highlights three important factors that affect the ability of the principal to get the agent to act on the principal’s behalf: the degree to which the principal’s goals and interests diverge from the agent’s; the severity of the moral hazard problem and the extent to which the principal can monitor the agent’s actions; and the limited ability of the principal and agent to make credible commitments.Footnote 3 The greater the divergence, the more costly it will be for the agent to act as the principal wants. As a result, the principal will have to offer higher rewards or threaten harsher punishment if it is to induce the agent to act (Berman et al. 2019). The less able the principal is to monitor the agent’s actions, the more difficult it will be to reward “good” behavior, punish “bad” behavior, and influence what the agent does.
Finally, credible rewards and punishments are at the core of the principal–agent relationship. In weakly institutionalized settings like the international system or in states where the rule of law is weak, the latent contract between the principal and agent must be self-enforcing. Enforcement typically relies on implicit threats of future punishment. If one party fails to uphold its part of the agreement, the other party will “punish” it severely enough to make the long-run cost of reneging outweigh the short-run gains. This punishment may take the form of the punisher’s not complying with its part of the agreement for a period of time or of harsher measures like sanctions or military action. This of course is the basic mechanism underlying the cooperative equilibrium of the repeated prisoner’s dilemma and of ongoing interactions more generally. Indeed, Padró-i-Miquel and Yared (2012) explicitly model the problem of indirect control as a repeated moral-hazard game.
This paper examines a related but fundamentally different incentive problem that undermines the ability of a principal to induce an agent to exert significant effort on its behalf even if there is perfect monitoring and complete information. It is the incentive problem at the core of Crocker’s comments. The repeated-game’s enforcement mechanism tends to break down if the principal is trying to get the agent to resolve a problem that, if resolved, (i) creates or considerably exacerbates a problem for the agent that results in an ongoing cost and that (ii) simultaneously eliminates or significantly reduces the agent’s ability to impose future costs on the principal.
The first condition, incurring an ongoing cost, means that the agent’s total cost of doing what the principal wants, e.g., turning the Taliban into an enemy as in Crocker’s account, may be very high when the actors are forward-looking and patient. The principal has to cover these costs if it is to induce the agent to act. But paying the agent in advance runs the risk that the agent will pocket the payment but not follow through on its part.
By limiting the amount of punishment the agent can impose on the principal for non-payment, condition (ii) undermines the principal’s ability to pay the agent off after the agent resolves the principal’s problem. The Pakistanis, in Crocker’s summary, cannot prevent the United States from leaving them to face the existential threat they have just created. The net result of (i) and (ii) is that the principal cannot induce the agent to exert much effort in solving the problem, and the problem persists.
The incentive problem studied here goes beyond the case of Pakistan. For example, it frequently plagues American efforts to use security force assistance to strengthen an ally’s military capacity to deal with transnational threats. Biddle et al. describe the issue:
[T]he kind of powerful, politically independent, technically proficient, non-corrupt military the US seeks is often seen by the partner state as a far greater threat to their self-interest than foreign invasion or terrorist infiltration. Increased military capability destabilizes the internal balance of power; diminished cronyism and corruption weaken the regime’s ability to control the empowered officers (2017, 100).Footnote 4
In short, doing what the United States wants creates a problem for the other state. Moreover, that state must be concerned that once it “has reformed as the principal wanted and has accepted the associated internal risks, the apparently indifferent Americans may pocket the benefits to US interests but then walk away and withhold critical assistance in the event of internal crisis” (Biddle, MacDonald, and Baker 2017, 130).
More generally, the incentive problem at issue here impedes any attempt to induce an incumbent regime to effect changes that weaken that regime’s hold on power or reduce its rents from holding office. Examples include using aid to get authoritarian leaders to undertake democratic reform (Wright 2009). A closely related incentive problem exists in arms control agreements aimed at creating very long breakout periods, as was the case with the Joint Comprehensive Plan of Action (JCPOA) limiting Iran’s nuclear program and with the United States’ long-standing goal of the “complete, verifiable, irreversible dismantlement” of North Korea’s nuclear program.Footnote 5
We study this incentive problem in an infinite-horizon game between a principal and an agent in which the principal is trying to resolve a problem that imposes costs on it for as long as the problem persists. The agent, perhaps because of superior local knowledge, can deal with the principal’s problem at a lower cost. However, resolving the principal’s problem creates an ongoing cost for the agent. When the efficient outcome is for the principal to pay the agent to solve the problem, commitment problems may nevertheless make it difficult for the actors to realize the efficiency gains.
At the start of each round of the game, the principal can decide to deal directly with the problem by taking matters into its own hands. Or, the principal can choose to deal with the problem indirectly by making a transfer to the agent in an attempt to induce the agent to deal with the problem. If the principal makes a transfer, the agent pockets the transfer and then decides how much effort to exert. Formally, the agent chooses the probability of resolving the problem. The game ends if the principal decides to deal with the matter directly or if the agent’s efforts are successful. If the agent’s efforts are unsuccessful, the next round begins with the principal again deciding whether to deal with the problem directly or indirectly, and so on.
The analysis yields three main results. First, as the actors become more patient, the ongoing cost the agent incurs by resolving the principal’s problem increases. As a result, the maximal effort the principal can induce the agent to exert is very small and goes to zero in the limit as the actors become very patient. Second, the expected duration of the principal’s problem, which is inversely related to the level of effort, becomes very long and goes to infinity in the limit. In short, the problem persists. Third, the principal nevertheless strictly prefers to work through the agent rather than deal with the problem directly.
The next section discusses related work. The model is introduced and analyzed in the subsequent sections. There follows a brief discussion of some empirical cases and extensions: US–Pakistan relations, foreign aid and democratization, and arms control.
RELATED WORK
The present study is most directly related to two existing bodies of literature. The first uses a principal–agent approach to analyze patron–client relations, some aspects of foreign aid, and proxy wars.Footnote 6 The second centers on commitment problems and inefficiently costly conflict (Fearon 1995; Powell 2006).
The main focus of the former is on the efforts of a dominant actor, like the United States, to get a much less powerful actor, usually a weaker state’s incumbent regime, to undertake actions furthering the dominant actor’s interests.Footnote 7 These analyses typically emphasize the divergence between the principal’s and the agent’s goals, the moral hazard problem associated with the principal’s limited ability to monitor the agent’s actions, and the actors’ limited ability to commit to abiding by the implicit contract between them. Broadly, the greater the divergence of interests, the more costly it is for the principal to induce the agent to act in ways that benefit the principal (Berman et al. 2019, 4–5). A common point of divergence is military or economic reform. Such reforms often appear to the principal, frequently the United States, to be important steps needed to counter an insurgency. By contrast, the agent may put “a premium on continuing the domestic social and economic arrangements that benefit its core supporters, even if these same measures are driving support for the insurgency” (Ladwig 2016, 103). The model formalizes this conflict in terms of the cost the agent incurs if it resolves the principal’s problem. The higher the agent’s cost, the greater the divergence of interests.
A key point of much of the qualitative work employing the principal–agent framework is that, as historical experience shows, the principal in many cases cannot monitor the agent’s actions very well and has to rely on imperfect or noisy indicators. This moral-hazard problem gives the agent scope for turning the principal’s aid to its own ends. This in turn raises the cost and limits the ability of the principal to influence the agent (e.g., Biddle, MacDonald, and Baker 2017). For example, American officials in Islamabad worried in 2008 that as much as seventy percent of American aid had been misspent (Walsh 2008; Coll 2018, 151–2).
The two most closely related formal principal–agent analyses are by Nicholson (2018) and Padró-i-Miquel and Yared (2012). Nicholson studies the moral hazard problem in an infinite-horizon model in which an agent can exert high or low levels of effort against an insurgency in each period. A high level of effort does two things. As in a standard principal–agent model, higher effort reduces the probability of a successful attack more than lower effort. But higher effort in Nicholson’s model may also succeed in eliminating the insurgents. The principal only has a noisy indicator of the effort the agent exerts in any period, and, as a result, the agent receives an informational rent for as long as the insurgency lasts. The agent loses these rents once the insurgents are eliminated as the principal no longer needs to induce any further effort. Nicholson assumes that the actors can commit to one-period-long contracts and solves for the Markov perfect equilibria of the game. The potential loss of the agent’s informational rents undermines the principal’s attempt to get the agent to exert a high level of effort.Footnote 8
Padró-i-Miquel and Yared focus on the actors’ limited ability to commit to the implicit contract between them and consequently on the need for that contract to be self-enforcing. The baseline principal–agent problem is a one-shot game in which the principal offers a contract to the agent. The contract specifies how the agent’s payments will vary with its actions, e.g., on the amount of effort the agent exerts. In the case of imperfect monitoring, the contract specifies how the payments will vary with some observable noisy measure of the agent’s actions. The agent then decides what to do and is paid accordingly. This baseline setup assumes that the principal can somehow commit to honoring the contract and paying the agent.
The assumption that either actor can commit to future actions is problematic for weakly institutionalized settings, and Padró-i-Miquel and Yared relax it. They study a repeated moral-hazard problem in which the principal tries to induce the agent to exert effort and the principal gets a noisy signal about how much effort the agent actually exerted. Neither actor can commit to future actions. Unlike most analyses that limit attention to Markov perfect equilibria, Padró-i-Miquel and Yared characterize the efficient subgame perfect equilibria. A striking result is that these equilibria exhibit cycles of punishment along the equilibrium path.
The present study complements existing work in three ways. First, the principal can perfectly monitor the agent’s effort in the model developed below. Assuming away the moral-hazard problem greatly simplifies the analysis. More importantly, it shows that the incentive problem highlighted here is fundamentally different from the contracting problems due to moral hazard or adverse selection. The simplification also makes it easier to see that this incentive problem is present in a range of substantively different settings.
Second, unlike in Nicholson’s model, the probability of resolving the principal’s problem is fully endogenous here. The agent can choose any probability of resolving the problem, not just the exogenous probability associated with high effort as in Nicholson. Full endogeneity allows a more general treatment of the factors affecting effort and duration.
Third, the actors’ inability to commit and the role of future threats in supporting cooperation in weakly institutionalized settings are central issues here as they are in Padró-i-Miquel and Yared. Like Padró-i-Miquel and Yared, the present analysis focuses on efficient equilibria and not Markov perfect equilibria.Footnote 9 The key difference is that they study a repeated moral-hazard problem for which a repeated-game framework is appropriate. However, the interaction at issue here is not well modeled as a repeated game. An exit option is needed to capture the idea that the actors can take actions that fundamentally alter the nature of their future interaction, e.g., the principal can decide to deal with the problem directly, or the agent can succeed in dealing with the principal’s problem.
This paper is also related to a second body of literature, namely, the work on commitment problems and inefficiently costly conflict (Fearon 1995; Powell 2006). Commitment problems typically arise when acting efficiently adversely affects an actor’s bargaining power.Footnote 10 Most simply, an actor’s decision to make the pie larger also weakens its bargaining position and hence reduces its expected share of the pie. This creates a tradeoff. Acting efficiently yields a smaller share of a larger pie; acting inefficiently brings a larger share of a smaller pie. If the adverse shift in power swamps the efficiency gains, the actor will not expand the pie and everyone will be worse off than they could have been had they been able to commit to a division of the pie. In the present setup, solving the principal’s problem reduces the agent’s bargaining power by limiting the future punishment the agent can impose on the principal. More specifically, the game satisfies the inefficiency condition in Powell (2004), which ensures that no efficient equilibria exist. The focus here, however, is on characterizing the least inefficient equilibria.
THE MODEL
The model formalizes the incentive problem facing a principal trying to get an agent to resolve a problem for the principal that, if resolved, creates an ongoing cost for the agent. At the start of each period, the principal can make a transfer to the agent in order to induce the agent to exert effort on the principal’s behalf. The agent pockets the transfer and then decides how much effort to exert. Neither actor can commit to future actions.
The principal pays a per-period cost of u for as long as the problem remains unresolved. In the case of terrorism, u would be the expected per-period cost from terrorist attacks. In the case of promoting democratic reform, u would be a state’s or donor’s disutility of the nondemocratic state remaining so. For arms control, it would be the cost to the principal of the agent’s pursuing an unconstrained nuclear weapons program.
Resolving the problem relieves the principal of these costs but imposes an ongoing per-period cost of a > 0 on the agent. There are several ways to interpret the reduced-form parameter a. If, for example, the agent’s payoff to dealing with the problem (or not dealing with it) as it sees fit is normalized to zero, e.g., Pakistan’s cost of dealing with the Taliban in its preferred way is normalized to zero, then a is the additional cost the agent incurs if it deals with the problem in the principal’s preferred way. This would be the cost of creating an existential threat in the narrative quoted above. Another way to interpret a in the case of regime reform is as the reduction in the rent from holding office resulting from having democratized or reformed the military. In an arms control context, a is the disutility a state gets when abiding by constraints on its nuclear program. The larger a, the more divergent the actors’ interests.
More formally, the game is an infinite-horizon stochastic game in which the principal gets a flow of benefits of y in every period and pays u for as long as the problem is unresolved. If the problem remains unresolved at the start of round t = 0, 1, 2, …, the principal decides whether to deal with the problem directly or to try to work indirectly through the agent (see Figure 1). Both actors share a common discount factor of β ∈ [0, 1).Footnote 11
FIGURE 1. The Stage Game at Time t
The game effectively ends if the principal takes matters into its own hands. The principal gets a payoff of (y − p)/(1 − β), where p > 0 is the (time-average) cost of dealing with the problem directly. The agent gets −d/(1 − β), where d is the agent’s cost if the principal deals directly with the problem without any effort or cooperation from the agent.Footnote 12
If the principal decides to work through the agent, it does so by making a transfer of ${x_t} \in \left[ {0,\hat{x}} \right]$ to the agent. We assume that the principal cannot transfer more than its per-period flow of benefits, i.e.,
$\hat{x} \le y$.Footnote 13 The agent pockets the gain and then chooses a level of effort e t ∈ [0, 1], which is the probability that the problem will be resolved in that round. By assumption, the agent can exert this effort costlessly.Footnote 14 With probability e t, the principal’s problem is resolved, the game ends, and the principal and agent get y/(1 − β) − x t and x t − a/(1 − β) respectively. These payoffs model in a reduced-form way the idea that the principal stops making transfers once the problem is resolved. Since the principal is no longer paying u and not making transfers, its per-period payoff is y. The agent, however, starts incurring a per-period cost of a. The problem remains unresolved with probability 1 − e t, and play moves on to period t + 1. The principal’s and agent’s payoffs for period t are y − x t − u and x t respectively.
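To make the timing of the stage game concrete, here is a minimal simulation sketch in Python of play along a stationary path in which the principal transfers x and the agent exerts effort e every round until the problem is resolved. It is only an illustration of the payoffs described above; the function name and the parameter values are hypothetical, not taken from the paper.

```python
import random

def simulate_stationary_path(x, e, y, u, a, beta, n_runs=20_000, max_t=1_000, seed=0):
    """Monte Carlo estimate of the principal's and agent's expected discounted payoffs
    along a stationary path s = (x, e): the principal transfers x at the start of every
    round and the agent exerts effort e (the per-round probability of resolution)."""
    rng = random.Random(seed)
    vp_sum = va_sum = 0.0
    for _ in range(n_runs):
        vp = va = 0.0
        disc = 1.0
        for _ in range(max_t):
            if rng.random() < e:
                # Resolved this round: the principal gets y forever net of this round's
                # transfer; the agent pockets x but bears the ongoing cost a.
                vp += disc * (y / (1 - beta) - x)
                va += disc * (x - a / (1 - beta))
                break
            # Unresolved: the principal pays the transfer and the cost u; the agent pockets x.
            vp += disc * (y - x - u)
            va += disc * x
            disc *= beta
        vp_sum += vp
        va_sum += va
    return vp_sum / n_runs, va_sum / n_runs

# Hypothetical parameter values chosen only to satisfy a < min(p + d, u) and p < u.
y, u, p, a, d, beta = 10.0, 4.0, 3.0, 1.0, 0.5, 0.9
print(simulate_stationary_path(x=2.0, e=0.2, y=y, u=u, a=a, beta=beta))
```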
As for the actors’ preference orderings, we assume a > d ≥ 0. That is, the worst outcome with the highest cost for the agent is for the agent to do the principal’s bidding by resolving the problem. The best outcome (with a normalized payoff of zero) is for the principal not to intervene and the agent not to exert any effort on the principal’s behalf. Direct intervention is between these alternatives.
The best outcome for the principal is for the agent to resolve the problem; the costs of taking matters into its own hands and of living with the problem are both positive, i.e., min{p, u} > 0. If p < u, the principal prefers direct intervention to living with the problem. If u < p, the principal prefers the opposite. Both cases are analyzed below.
This setup differs from a standard principal–agent model in two important ways. First and foremost, the incentive problem at issue here arises in weakly institutionalized environments in which the actors cannot commit to future actions. The transfer at the start of a round is not contractually contingent—or, at least, not in any enforceable way—on the agent’s subsequent effort. The agent gets the transfer regardless of the effort it exerts. As a result, the agent exerts zero effort in the unique subgame perfect equilibrium of the stage game. The only incentive to exert effort is the promise of future transfers which are contingent. But these promises must be self-enforcing given the actors’ inability to commit.
The second difference is that although moral hazard and adverse selection are likely to complicate the principal’s ability to induce the agent to exert effort, the fundamental incentive problem at issue here exists even if neither of these contracting problems is present. To make this point and keep things simple, we abstract away from these problems. The principal knows the agent’s type and observes how much effort it exerts.
Assuming away the contracting frictions of moral hazard and adverse selection attenuates the link between the present setup and the standard contracting literature (see Bolton and Dewatripont 2005 for an overview). It is, nevertheless, useful to frame the stage game in terms of the interaction between a principal and an agent in which the former is trying to induce the latter to exert effort on its behalf. This framing facilitates a comparison of the present analysis with the growing literature using a principal–agent approach.
EFFICIENCY, COMMITMENT, AND THE BARGAINING SURPLUS
The analysis focuses on the case in which the efficient outcome is for the agent to resolve the problem. But resolving the principal’s problem is not in the agent’s own self-interest absent sufficient inducement from the principal. The principal, however, cannot commit to rewarding the agent in the future.
Assumption 1 (Efficiency). The least costly course of action is for the agent to resolve the problem, a < min{p + d, u}.
Taking a < p + d ensures that if the actors take the problem on, the lowest-cost way of doing so is for the agent to resolve it. The inequality u > a means that the cost of living with the problem is higher than the cost of having the agent resolve it.
Assumption 1 implies that the efficient outcome is for the agent to resolve the problem at the outset of the game by exerting effort e 0 = 1. This minimizes the deadweight loss and thereby maximizes the “pie” to be divided. The total loss when e 0 = 1 is a/(1 − β) and the total benefits are (y − a)/(1 − β).
Were it possible for the principal and agent to commit to an agreement, they would agree to maximize the total benefits by having the agent solve the problem as quickly as possible (e 0 = 1) and settle on some division of the maximized benefits. Since the principal can assure itself of a payoff of (y − p)/(1 − β) by dealing with the problem directly, it would never accept an agreement giving it less than this. Similarly, the agent would never agree to a settlement offering less than −d/(1 − β), because it can assure itself of at least this amount by exerting no effort. The bargaining surplus is the total that can be realized less the sum of what each actor can assure itself or (y − a)/(1 − β) − [(y − p)/(1 − β) − d/(1 − β)] = (p + d − a)/(1 − β) > 0.
The equilibrium analysis below shows that the actors’ inability to commit prevents them from realizing all of the surplus and that the problem persists for a very long time when the actors are patient. There are two cases to consider, as the principal may or may not prefer dealing with the problem directly to living with it. We first analyze the case in which the principal prefers dealing with the problem to living with it.
Assumption 2. It is less costly for the principal to deal with the problem directly than to let it continue indefinitely, p < u.
THE PAYOFF-MAXIMIZING EQUILIBRIA
This section characterizes the subgame perfect equilibria that maximize the actors’ total payoff subject to the actors’ limited ability to commit. We focus on payoff-maximizing equilibria for two reasons. First, maximizing the total payoff is equivalent to maximizing the bargaining surplus. Second, the efficiency loss due to limited commitment in these equilibria provides a lower bound on the efficiency loss in any equilibrium.
We simplify matters by focusing on pure-strategy stationary equilibrium paths along which the principal transfers the same amount in every period and the agent exerts the same level of effort in every period for as long as the problem remains unresolved.Footnote 15 It is important to emphasize that while the paths are stationary, the equilibria will generally not be. Deviations from the path can be punished. Indeed the threat of future punishment is essential to inducing any effort at all. Absent a future threat, the agent would simply pocket the transfer. Anticipating this, the principal would deal directly with the problem instead of working through the agent. More formally, the unique Markov perfect equilibrium given the actors’ inability to commit is for the agent to exert zero effort in every round and the principal to deal directly with the problem.
The actors’ payoffs and the duration of the principal’s problem along a stationary path are easy to describe. Let s = (x, e) denote a stationary path, where x is the transfer to the agent and e is the effort the agent exerts in each period for as long as the problem remains unresolved. Then the expected duration of the problem is just D(s) = 1/e. The higher the effort e, the faster (in expectation) the principal’s problem is resolved.
As for the payoffs, let V P(s) and V A(s) be the principal’s and agent’s payoffs. These continuation payoffs satisfy simple recursive relations:
$$V_A(s) = e\left[ x - \frac{a}{1-\beta} \right] + (1-e)\left[ x + \beta V_A(s) \right],$$

$$V_P(s) = e\left[ \frac{y}{1-\beta} - x \right] + (1-e)\left[ y - x - u + \beta V_P(s) \right].$$
The first term in the agent’s payoff is its payoff if it is “successful” weighted by the probability of success e. The second term is the agent’s payoff to getting the transfer plus the discounted payoff to following s weighted by the probability that the agent does not resolve the problem. Similarly, the first term of the principal’s payoff is its payoff if the agent is successful weighted by the probability of success. The second term is the per-period payoff if the problem is not resolved plus the discounted payoff of continuing to follow s. Solving for the payoffs gives:
$$V_A(s) = \frac{x - e\,a/(1-\beta)}{1 - \beta(1-e)}, \qquad (1)$$

$$V_P(s) = \frac{y}{1-\beta} - \frac{x + (1-e)u}{1 - \beta(1-e)}. \qquad (2)$$
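As a quick numerical check on these closed forms (which are reconstructed here from the recursive relations rather than copied from the original), the sketch below verifies that they satisfy the recursions; the parameter values are arbitrary.

```python
def V_A(x, e, a, beta):
    """Agent's continuation payoff along a stationary path s = (x, e); eq. (1) as reconstructed above."""
    return (x - e * a / (1 - beta)) / (1 - beta * (1 - e))

def V_P(x, e, y, u, beta):
    """Principal's continuation payoff along s = (x, e); eq. (2) as reconstructed above."""
    return y / (1 - beta) - (x + (1 - e) * u) / (1 - beta * (1 - e))

# Check that the closed forms satisfy the recursive relations
# V = e * (terminal payoff) + (1 - e) * (per-period payoff + beta * V).
y, u, a, beta, x, e = 10.0, 4.0, 1.0, 0.9, 2.0, 0.2   # hypothetical values
va, vp = V_A(x, e, a, beta), V_P(x, e, y, u, beta)
assert abs(va - (e * (x - a / (1 - beta)) + (1 - e) * (x + beta * va))) < 1e-9
assert abs(vp - (e * (y / (1 - beta) - x) + (1 - e) * (y - x - u + beta * vp))) < 1e-9
```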
The first step in finding the payoff-maximizing equilibria is describing the set of feasible subgame perfect equilibrium paths. Necessary and sufficient conditions for s to be an equilibrium path follow directly from the expressions for V A(s) and V P(s) and three additional observations. First, the agent can hold the principal down to (y − p)/(1 − β) by exerting zero effort in every period. The principal can hold the agent down to −d/(1 − β) by dealing directly with the problem. These are the actors’ minmax payoffs for the infinite-horizon stochastic game. The principal and agent must therefore do at least this well in any equilibrium. If, for example, V P(s) < (y − p)/(1 − β), then the principal could profitably deviate from s by taking matters into its own hands. The existence of a profitable deviation means that s cannot be an equilibrium path.
The second observation is that the strategy profile in which the principal takes matters into its own hands in every round and the agent exerts zero effort in every round is a subgame perfect equilibrium. To establish this, note that dealing with the problem directly is clearly a best response to the agent’s exerting zero effort in the current and all future rounds given that the principal prefers dealing with the problem directly to letting it continue (p < u). As for the agent, exerting zero effort in every round is a best response if the agent expects the principal to deal with the problem directly. Call this equilibrium M (for minmax) and note that each actor’s equilibrium payoff is its minmax payoff. It follows that M holds each actor down to its lowest possible subgame perfect equilibrium payoff because each actor must do at least as well as its minmax payoff in any subgame perfect equilibrium.
Third, because M holds each actor down to its lowest possible subgame perfect equilibrium payoff, the harshest possible punishment that can be imposed on an actor that deviates from a particular path is that both players will immediately switch to playing M. This leads to the following implication, which is a direct result of Abreu’s (1988) analysis.
Observation 1. The path s is an equilibrium path if and only if neither the principal nor the agent can profitably deviate from s when a deviation would trigger an immediate switch to playing M. Footnote 16
To determine the set of feasible equilibrium paths, i.e., those from which there is no profitable deviation, observe that if the principal were to deviate from offering x in some round, the agent would then start playing M and exert zero effort in that and all future rounds. It follows that the most profitable way for the principal to deviate is to transfer zero and deal with the problem directly. This effectively ends the game and gives the principal a payoff of (y − p)/(1 − β). As a result, sticking with the path does at least as well as deviating whenever the principal’s incentive constraint V P(s) ≥ (y − p)/(1 − β) is satisfied. Substituting the expression above for V P(s) and solving for effort gives
$$e \,\ge\, {\underline{e}}(x) \equiv \frac{(1-\beta)\left( x + u - p \right)}{\beta p + (1-\beta)u}.$$
In words, the principal is willing to offer x only if it induces effort of at least ${\underline{e}}\left( x \right)$. Otherwise it prefers to transfer zero and deal with the problem directly.
Were the agent to deviate, it would simply pocket that period’s transfer and exert no effort in the current and all future rounds. This triggers the principal to take matters into its own hands in the next round and results in a payoff of x − βd/(1 − β) for the agent. Sticking with the path does at least as well as deviating when the agent’s incentive constraint V A(s) ≥ x − βd/(1 − β) holds. Solving for effort yields
$$e \,\le\, \bar{e}(x) \equiv \frac{\beta(1-\beta)\left( x + d \right)}{\beta(1-\beta)x + a - \beta^2 d}.$$
In brief, the agent is willing to exert at most effort $\bar{e}\left( x \right)$ in return for a transfer of x.
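A minimal sketch of the two incentive constraints, using the constraint forms reconstructed above; the function names and numbers are illustrative, not from the paper.

```python
def e_lower(x, u, p, beta):
    """Minimum effort that makes transferring x at least as good for the principal as
    acting directly (from V_P(s) >= (y - p)/(1 - beta); reconstructed form)."""
    return (1 - beta) * (x + u - p) / (beta * p + (1 - beta) * u)

def e_upper(x, a, d, beta):
    """Maximum effort the agent will exert in return for a transfer x
    (from V_A(s) >= x - beta*d/(1 - beta); reconstructed form)."""
    return beta * (1 - beta) * (x + d) / (beta * (1 - beta) * x + a - beta**2 * d)

def is_feasible(x, e, u, p, a, d, beta, x_hat):
    """A stationary path (x, e) is an equilibrium path iff both incentive
    constraints and the transfer constraint hold."""
    return x <= x_hat and e_lower(x, u, p, beta) <= e <= e_upper(x, a, d, beta)

# Hypothetical values; the cap x_hat is set to the per-period benefit y.
u, p, a, d, beta, x_hat = 4.0, 3.0, 1.0, 0.5, 0.9, 10.0
print(is_feasible(x=2.0, e=0.2, u=u, p=p, a=a, d=d, beta=beta, x_hat=x_hat))   # True
```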
A stationary equilibrium must satisfy the two incentive constraints ${\underline{e}}\left( x \right) \le \,e \le \,\bar{e}\left( x \right)$ as well as the transfer constraint
$x \,\le \,\hat{x}$. These constraints and the stationary paths satisfying them are illustrated in Figure 2. There are two cases. Let (x +, e +) denote the high-effort intersection of the incentive constraints. (The expressions for x + and e + are derived below.) Then the transfer constraint is slack when
${x^ + } \,\le \,\hat{x}$ as in Figure 2a. It binds when
${x^ + } > \hat{x}$ as shown in Figure 2b.
FIGURE 2. The Feasible Stationary Equilibrium Paths
The unique payoff-maximizing stationary path follows immediately. The total payoff along s is
$$V(s) \equiv V_A(s) + V_P(s) = \frac{y}{1-\beta} - \frac{e\,a/(1-\beta) + (1-e)u}{1 - \beta(1-e)}. \qquad (3)$$
This payoff is independent of the transfer and increasing in effort. Intuitively, the transfers have no direct effect on the total payoff whereas higher effort means that the problem is resolved more quickly (in expectation) and thus at lower expected cost. Because the total payoff is increasing in effort, the payoff-maximizing path s* = (x*, e*) is the feasible path with the highest effort. This is s* = (x +, e +) when the transfer constraint is slack and ${s^*} = \left( {\hat{x},\bar{e}\left( {\hat{x}} \right)} \right)$ when it binds.Footnote 17 More compactly,
${x^*} = \min \left\{ {{x^ + },\hat{x}} \right\}$ and
${e^*} = \min \left\{ {{e^ + },\hat{e}\left( {\hat{x}} \right)} \right\}$, where
$\hat{e} \,\equiv \,\bar{e}\left( {\hat{x}} \right)$.Footnote 18
Observe that the agent’s incentive constraint is sure to bind at s*, i.e., V A(s*) = x* − βd/(1 − β) or equivalently $e = \bar{e}\left( {{x^*}} \right)$. The key intuition is that this constraint defines the maximal effort that the agent can be induced to exert in return for a transfer of a given size, and maximizing the agent’s effort maximizes the total payoff. As for the principal, the more it can credibly promise to transfer, the more effort it can induce the agent to exert. The principal, however, cannot credibly promise the agent so much that the principal’s payoff V P(s) falls below its minmax payoff. It follows that either the principal’s incentive constraint must bind at s* as in Figure 2a or the transfer constraint must bind as in Figure 2b.
To obtain explicit expressions for x + and e +, use equations (1) and (2) and the fact that the principal’s and agent’s incentive constraints bind at s + = (x +, e +). Solving V A(s) = x − βd/(1 − β) and V P(s) = (y − p)/(1 − β) for the transfer gives a quadratic in x. The larger root and corresponding expression for e are:
$$x^+ = \frac{\beta p + \beta^2 d - a + \sqrt{\left( \beta p + \beta^2 d - a \right)^2 - 4\beta(1-\beta)\left[ a(u-p) - \beta u d \right]}}{2\beta(1-\beta)}, \qquad (4)$$

$$e^+ = {\underline{e}}(x^+) = \frac{(1-\beta)\left( x^+ + u - p \right)}{\beta p + (1-\beta)u}. \qquad (5)$$
The transfer x + is sure to be a positive real number and e + a well-defined probability between zero and one when β is high enough.
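The high-effort intersection can be computed directly from the binding constraints. The sketch below implements the larger root of the quadratic as reconstructed in equations (4) and (5); the helper names and parameter values are hypothetical. As β approaches one, e+ should approach 1 − (a − d)/p, which provides a simple consistency check.

```python
import math

def x_plus(u, p, a, d, beta):
    """Larger root of the quadratic implied by the two binding incentive constraints
    (the reconstruction behind eq. (4)); hypothetical helper name."""
    A = beta * (1 - beta)
    B = a - beta * p - beta**2 * d
    C = a * (u - p) - beta * u * d
    return (-B + math.sqrt(B * B - 4 * A * C)) / (2 * A)

def e_plus(u, p, a, d, beta):
    """Effort at the high-effort intersection (eq. (5)): e+ = e_lower(x+)."""
    xp = x_plus(u, p, a, d, beta)
    return (1 - beta) * (xp + u - p) / (beta * p + (1 - beta) * u)

# Hypothetical parameters satisfying a < min(p + d, u) and p < u.
u, p, a, d = 4.0, 3.0, 1.0, 0.5
for beta in (0.9, 0.99, 0.999):
    print(beta, round(x_plus(u, p, a, d, beta), 2), round(e_plus(u, p, a, d, beta), 4))
# e_plus should approach 1 - (a - d)/p (about 0.833 here) as beta -> 1,
# while x_plus grows without bound, so the transfer constraint eventually binds.
```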
PERSISTENCE
The three main results follow immediately. First, the more the actors care about the future, the higher the cost to the agent of resolving the principal’s problem, i.e., the higher a/(1 − β). As this cost rises, the principal has to transfer more and more to the agent in order to induce the agent to exert any given level of effort. As a result, the unconstrained payoff-maximizing transfer x + goes to infinity as the discount factor goes to one, and the transfer constraint eventually binds with ${x^*} = \hat{x}$ and
${e^*} = \hat{e}$. As the actors become still more patient and the agent’s cost grows still larger, the amount of effort the agent is willing to exert in return for the maximal transfer goes down. As the actors become very patient, the maximal effort the principal can induce the agent to exert goes to zero,
$\mathop {\lim }\nolimits_{\beta \to 1} \hat{e} = 0$.
Second, the principal’s problem becomes intractable as the actors become very patient. Recall that e* is an upper bound on the level of effort along any stationary equilibrium path. As a result, the expected duration along s*, ${1 / {\hat{e}}}$, is a lower bound on the expected duration of the principal’s problem along any stationary-path equilibrium. This lower bound goes to infinity as the discount factor goes to one.
Third, even though the agent becomes less and less likely to resolve the principal’s problem, the principal still prefers working through the agent rather than dealing with the problem directly. To see this, note that the principal’s incentive constraint is slack at s* once ${x^ + } > \hat{x}$. This means that the principal’s payoff to working through the agent is strictly greater than its payoff to dealing with the problem directly.Footnote 19 Proposition 1 summarizes these results.
Proposition 1 (Persistence). As the principal and agent become increasingly patient, the maximal effort the principal can induce the agent to exert goes to zero and the expected duration of the principal’s problem goes to infinity: $\mathop {\lim }\nolimits_{\beta \to 1} {e^*} = 0$ and
$\mathop {\lim }\nolimits_{\beta \to 1} {1 / {{e^*}}} = \infty$. Nevertheless, the principal strictly prefers working through the agent to dealing with the problem directly. Footnote 20
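The persistence result is easy to see numerically. When the transfer constraint binds, the maximal inducible effort is ê = ē(x̂); the sketch below (with hypothetical numbers and the reconstructed agent constraint) shows it vanishing, and the expected duration 1/ê exploding, as β → 1.

```python
def e_hat(x_hat, a, d, beta):
    """Maximal inducible effort when the transfer constraint binds:
    e_hat = e_upper(x_hat) under the reconstructed agent constraint."""
    return beta * (1 - beta) * (x_hat + d) / (beta * (1 - beta) * x_hat + a - beta**2 * d)

# Hypothetical numbers: x_hat caps the per-period transfer.
a, d, x_hat = 1.0, 0.5, 10.0
for beta in (0.9, 0.99, 0.999, 0.9999):
    e = e_hat(x_hat, a, d, beta)
    print(f"beta={beta}: max effort ~ {e:.4f}, expected duration ~ {1 / e:,.0f} rounds")
```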
Other comparative statics follow from the expressions for x +, e +, and $\hat{e}$. As the agent’s cost a of dealing with the problem increases, the agent is less willing to exert effort in return for a given transfer and
$\bar{e}\left( x \right)$ shifts down in Figure 2. Both the transfer x + and the effort e + decrease. By contrast, the higher the agent’s cost if the principal acts directly, the more effort the agent is willing to exert in return for a given transfer. The maximal effort
$\bar{e}\left( x \right)$ shifts up, and both the transfer x + and the effort e + increase as d increases.
An increase in the principal’s direct cost p lowers the principal’s minmax payoff and increases the amount the principal can credibly promise to the agent. This shifts ${\underline{e}}\left( x \right)$ down and leads to a higher transfer and more effort (∂x +/∂p > 0 and ∂e +/∂p > 0). It is clear from the expressions for x + and e + that an increase in the principal’s cost of living with the problem u leads to a smaller transfer and less effort.
Focusing on the expected duration of the principal’s problem: The higher the agent’s cost to resolving the principal’s problem, the harder it is for the principal to induce effort and the longer the problem persists. By contrast, the higher the agent’s cost if the principal acts directly, the more effort the agent is willing to exert and the faster the problem is resolved. The higher the principal’s cost of living with the problem, the (weakly) longer it takes to resolve the problem (∂e +/∂u < 0 and $\left. {{{\partial \hat{e}} / {\partial u}} = 0} \right)$. The higher the principal’s cost to dealing with the problem directly, the (weakly) faster the problem will be resolved (∂e +/∂p > 0 and
$\left. {{{\partial \hat{e}} / {\partial p}} = 0} \right)$.
LIVING WITH THE PROBLEM
The analysis and results are similar when the principal prefers living with the problem to dealing with it directly. Replace Assumption 2 with.
Assumption 2′. The principal prefers living with the problem to dealing with it directly, u < p.
The principal’s minmax payoff is now the payoff to living with the problem (y − u)/(1 − β) rather than the payoff to dealing directly with the problem (y − p)/(1 − β). Indeed, intervening directly is strictly dominated by living with the problem. It follows that the principal cannot intervene directly with positive probability at any information set in a subgame perfect equilibrium. As a result, the agent’s smallest subgame perfect equilibrium payoff is zero.
Observe further that the strategy profile in which the principal always transfers zero and the agent always exerts zero effort is subgame perfect and yields these payoffs. Call this equilibrium M′. Then we have a parallel to Observation 1: When the principal prefers living with the problem, a path s is a subgame perfect equilibrium path if and only if neither the principal nor the agent can profitably deviate from it if deviation triggers an immediate switch to M′.
When the principal prefers living with the problem, the incentive constraints for not being able to profitably deviate are V P(s) ≥ (y − u)/(1 − β) and V A(s) ≥ x. Repeating the derivation above gives
$$e \,\ge\, {\underline{e}}(x) = \frac{(1-\beta)x}{u} \qquad {\rm and} \qquad e \,\le\, \bar{e}(x) = \frac{\beta(1-\beta)x}{a + \beta(1-\beta)x},$$
with the payoff-maximizing stationary path s* = (x*, e*), where ${x^*} = \min \left\{ {x^+, \hat{x}} \right\}$ and
${e^*} = \min \left\{ {{e^ + },\hat{e}} \right\}$.
As before, the transfer x + becomes unboundedly large as the actors become very patient. This implies that the transfer constraint eventually binds and that the payoff-maximizing path is $\left( {\hat{x},\hat{e}} \right)$. The maximal effort again goes to zero as the actors become very patient (
$\hat{e} \to 0$ as β → 1), and the problem becomes intractable.
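For completeness, here is the same kind of sketch for the case in which the principal prefers living with the problem (u < p), using the constraints as reconstructed above; again the numbers are hypothetical.

```python
def e_lower_lwp(x, u, beta):
    """Principal's constraint when it prefers living with the problem (u < p):
    from V_P(s) >= (y - u)/(1 - beta); reconstructed form."""
    return (1 - beta) * x / u

def e_upper_lwp(x, a, beta):
    """Agent's constraint when the only punishment is a cutoff of transfers:
    from V_A(s) >= x; reconstructed form."""
    return beta * (1 - beta) * x / (a + beta * (1 - beta) * x)

# The maximal inducible effort e_upper_lwp(x_hat) again vanishes as beta -> 1
# (hypothetical numbers).
a, u, x_hat = 1.0, 4.0, 10.0
for beta in (0.9, 0.99, 0.999):
    print(beta, round(e_lower_lwp(x_hat, u, beta), 4), round(e_upper_lwp(x_hat, a, beta), 4))
```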
BOUNDED TRANSFERS AND COMMITMENT PROBLEMS
The model assumes that the amount the principal can transfer to the agent is bounded, which would seem to be the most natural assumption.Footnote 21 This assumption is also closely related to the commitment problems at issue here and in other analyses of costly conflict. This section first describes the results when there is no transfer constraint and then discusses the related commitment issues.
If there is no transfer constraint, the payoff-maximizing path is s* = (x +, e +). As observed above, the transfer x + becomes unboundedly large as the actors become very patient. Even so, the principal still cannot induce the agent to exert maximal effort of e + = 1 when there is no limit to what the principal can offer. The maximal effort e + increases and converges to 1 − (a − d)/p as the actors become very patient (see equation (5)). Nevertheless, the efficiency loss (as a fraction of total benefits) goes to zero, and the actors capture essentially all of the bargaining surplus.
More formally, the efficiency loss is the maximum possible payoff less the actual payoff or L ≡ (y − a)/(1 − β) − V(s*). Recalling that both incentive constraints bind at (x +, e +) gives V(s*) = V A(s*) + V P(s*) = x + + (y − p − βd)/(1 − β). Assuming the principal prefers direct action to living with the problem forever (p < u), substituting the expression in equation (4) for x +, and some algebra show that L as a fraction of total benefits (y − a)/(1 − β) goes to zero as the actors become very patient.
The simplicity of the present model highlights the role that the transfer constraint plays in the efficiency loss. While perhaps less explicit, an upper bound on the size of the transfer is also essential to many other models of costly conflict with complete information. Examples include Acemoglu and Robinson’s (2000, 2001, 2006) models of democratization, Fearon’s (2004) model of long civil wars, and Powell’s (2004, 2006) analysis of commitment problems.Footnote 22 Revolution and war and the associated efficiency losses in these models are the result of a “liquidity” problem (Powell 2004, 2012). Because an actor can only transfer a limited amount to its adversary, it must rely on promises of future payment in order to be able to offer enough to an adversary to induce it not to fight. But shifting power undermines the credibility of these future payments. Were there no upper bound, the actor would be able to offer enough to remain in power and avoid fighting.
Bounded transfers and limited commitment power complement each other in these models. The actors could avoid inefficient conflict if they were able to commit to future payments or if they could make sufficiently large transfers. Note that the latter effectively substitutes one commitment issue for another. The actor making a transfer exceeding today’s “pie” presumably has to borrow from some (unmodeled) lenders and somehow credibly commit to repaying them. In the weakly institutionalized settings studied here, the actors have limited commitment power and cannot make unboundedly large transfers.
CASES AND EXTENSIONS
Relations between Pakistan and the United States, especially after the attacks of September 11, 2001, illustrate the incentive problem analyzed above in two ways.Footnote 23 First, American aid has been highly contingent, as transfers are in the model. Aid flowed when the United States needed Pakistan’s support and stopped when it did not. Figure 3 shows the pattern of American aid.Footnote 24
FIGURE 3. US Aid to Pakistan, 1970–2016
Aid to Pakistan decreased during the 1970s as the United States became increasingly wary of Pakistan’s nuclear ambitions. In early 1979, the United States concluded that Pakistan was covertly attempting to develop a nuclear weapon through enriched uranium (Thornton 1982; Kux 2001, 238–42). Aid for that year, consisting almost entirely of economic assistance, fell to a low of $141 million.
The situation changed “overnight, literally” when Soviet forces moved into Afghanistan in December 1979. “Pakistan, now a front-line state, became an essential line of defense and an indispensable element of any strategy that sought to punish the Soviets for their action” (Thornton 1982, 969). In return for substantial American economic and military aid, peaking at almost $1.3 billion in 1988, Pakistan provided the primary channel for funneling support to the mujahideen resistance.
On October 1, 1990, less than a year after the Soviet Union withdrew from Afghanistan, the United States suspended aid to Pakistan as concerns about Pakistan’s nuclear program again came to the fore. The “glue of the Cold War and common struggle against the Soviet occupation of Afghanistan no longer cemented US–Pakistani ties… Pakistan had not only lost any strategic importance but had become a nuclear troublemaker and source of regional instability” (Kux 2001, 320). Economic assistance already in the pipeline continued but subsequently fell to under $35 million in 1992 as the pipeline ran dry.
Aid remained at very low levels until 9/11, when Pakistan again became critical to American efforts to deal with a pressing problem. Aid once again began to flow and averaged over $2.3 billion a year for the rest of the decade.
In brief, the pattern of US aid to Pakistan over the last four decades illustrates the instrumental role that it played. It flowed when there was a pressing problem with which the United States needed help. It stopped when the problem was resolved. Pakistan was an “occasional transactional partner” of the United States (Haqqani 2013, 339). More bluntly, Pakistani ambassador to the United States Abida Hussein is reported to have said shortly after taking up her post in 1992 that the United States “had about as much interest in Pakistan as Pakistan had in the Maldives.”Footnote 25 This pattern is in keeping with Morgenthau’s broad characterization of the nature of much foreign assistance: “The transfer of money and services from one government to another performs here the function of a price paid for political services rendered or to be rendered” (1962, 302).
The second way that US–Pakistani relations illustrate the incentive problem is that the United States was generally better able to induce Pakistan to exert high levels of effort in the 1980s, when American and Pakistani goals were less divergent. Their primary shared goal then was supporting the mujahideen resistance and compelling the Soviet Union to withdraw its forces. The United States, by contrast, was much less able to induce Pakistan “to go in full throttle against the Taliban” when American and Pakistani goals were more divergent.
The United States and Pakistan both wanted the Soviet Union out of Afghanistan. To that end, both wanted to “grow the war,” as Director of Central Intelligence William Casey put it to the Islamabad station chief in 1981 (Kux 2001, 261–2). Pakistan did, however, insist on controlling the distribution of American support, and Pakistan’s Inter-Services Intelligence Directorate (ISI) used that control to channel most of the aid to its preferred fundamentalist factions of the mujahideen (Kux 2001; Markey 2013, 92–3). Even so, substantial aid was funneled to the mujahideen. The mujahideen in turn proved to be an effective force against the Soviets, especially after the United States began supplying Stinger portable anti-aircraft missiles to use against Soviet attack helicopters (Cordovez and Harrison 1995).
By contrast, there was a significant divergence of interests regarding Afghanistan in the years after the 9/11 attacks, especially after the Taliban regrouped and American attention began to shift back from Iraq to Afghanistan. Washington by that time had come to see the Taliban’s safe havens in Pakistan as a significant problem and pressed Pakistan, with little success, to go after them. In 2011, Chairman of the Joint Chiefs of Staff Admiral Mike Mullen called the Haqqani network “a veritable arm of Pakistan’s Inter-Services Intelligence agency” (Bumiller and Perlez 2011). Markey’s summary of the situation echoes Crocker’s portrayal:
At the core of the dispute was Pakistan’s approach to territories like North Waziristan along the border with Afghanistan, where Taliban insurgent leaders continued to find safe haven after years of war. Washington wanted Pakistan to cut off the head of the snake that was biting NATO and Afghan forces, but Pakistan was unwilling to sever ties with the Haqqani network or Mullah Omar’s Afghan Taliban… Sooner or later, the Pakistanis figured, whatever fragile edifice Washington constructed in Afghanistan would collapse. If Afghanistan fell apart after America’s withdrawal and Islamabad had already turned against the Afghan Taliban, what friends (and more importantly, what influence) would Pakistan have left there (2013, 163)?
In short, dealing with the American problem would likely create a larger ongoing problem for Pakistan. In keeping with the model, the United States was unable to induce Pakistan to exert high levels of effort to deal with the American problem.
The incentive problem highlighted here also seems likely to plague the use of development aid to promote democracy or, more broadly, regime reform. Empirical efforts to assess the extent to which foreign aid can be used to promote democracy have reached different conclusions (Kersting and Kilby 2014). But Wright (2009) finds significant support for the hypothesis that the promise of future aid contingent on democratization is more likely to induce authoritarian leaders to democratize when they are more likely to prevail in future elections.
[D]ictators who stand little chance of surviving liberalization will not be swayed by promises of aid, but dictators who are likely to remain in power even if they liberalize may view the promise of future aid as an incentive to democratize. The effect of aid on democratization, therefore, will vary by factors that increase the chances of a dictator surviving political liberalization intact (Wright 2009, 552).
Democracy assistance can pose the kind of incentive problem studied here. The agent is the authoritarian regime, and the principal is a potential donor trying to induce the agent to democratize. Solving the donor’s “problem” by democratizing creates an ongoing problem for the authoritarian, namely, the potential loss of future rents from holding office. In terms of the model, the donor suffers a disutility of u as long as the state remains authoritarian. There is also no meaningful way for the donor to take matters into its own hands, so u < p. The authoritarian’s rents from holding office are normalized to zero, and the expected loss from democratizing is a. As is to be expected from the model, it is difficult to induce significant reform in these circumstances.
A similar incentive problem may bedevil negotiating nuclear arms control agreements with long breakout periods. The breakout time is how long it would take a state to produce enough fissile material (plutonium or highly enriched uranium) to make a nuclear bomb. The longer the period, the more time there is to detect the attempted breakout and react to it. Extending Iran’s breakout time from an estimated 2–3 months to a year was a key American objective in the negotiations that ultimately led to the Joint Comprehensive Plan of Action (JCPOA).Footnote 26
The United States has also sought the “complete, verifiable, irreversible dismantlement” of North Korea’s nuclear program since President George W. Bush’s administration. It is not completely clear how a state could actually irreversibly dismantle its nuclear program. But one can think of it conceptually as creating a very long or infinite breakout period.Footnote 27
At their most basic level, nuclear arms control agreements generally entail reciprocal concessions buttressed by implicit threats. Should either party fail to live up to its part of the agreement, the other party will not uphold its part. Agreements are feasible when the long-run costs of reneging outweigh the short-run benefits. Long breakout periods limit a state’s ability to impose costs on its counterparty should the latter renege on its part of the agreement.
To formalize this issue, consider a modified version of the game. As before, the principal starts every round by making a transfer to the agent. The transfer is a net gain that could take the form of an actual transfer of, say, oil, as under the 1994 Agreed Framework with North Korea, or of sanctions relief, as with the JCPOA. The agent has only three alternatives. It can continue its nuclear program (e t = 0), which imposes cost u on the principal. It can irreversibly dismantle its nuclear program (e t = 1), which ends the game with the agent paying a/(1 − β), the foregone benefits of having its nuclear program. Or the agent can suspend its program, which relieves the principal of cost u but imposes cost a on the agent. The modified stage game is depicted in Figure 4, where we assume that the option of direct action, i.e., attacking the agent’s nuclear installations and risking a larger war, is too costly (p > u).Footnote 28
FIGURE 4. The Arms Control Stage Game
It is straightforward to show that the principal cannot induce the agent to dismantle its program because dismantling eliminates the agent's ability to induce any further transfers. Formally, the agent's payoff to dismantling at t is x t − a/(1 − β), whereas the agent can assure itself of x t in the current round and at least zero in all future rounds by continuing its program (e t+k = 0 for k ≥ 0). Although dismantlement is not an equilibrium outcome, an ongoing suspension is. The principal's making ongoing transfers of x in return for the agent's ongoing suspension of its nuclear program is an equilibrium path for any x ∈ [a/β, u].Footnote 29
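As a rough numerical check of this equilibrium condition, the sketch below uses illustrative parameter values (not taken from the paper) and the deviation logic described above: the agent can always pocket the transfer and keep its program, and the principal can always stop paying and live with the cost u. Both one-shot deviations should be unprofitable exactly when x lies in [a/β, u].

```python
# Minimal numerical check of the ongoing-suspension equilibrium.
# Parameter values are purely illustrative.
beta = 0.9   # common discount factor
a = 1.0      # agent's per-round cost of keeping its program suspended
u = 2.0      # principal's per-round cost of an active program
y = 10.0     # principal's per-round baseline payoff
TOL = 1e-9   # tolerance for the boundary cases x = a/beta and x = u

def agent_prefers_suspension(x):
    follow = (x - a) / (1 - beta)   # receive x and pay a in every round
    deviate = x                     # pocket x, restart the program; transfers stop
    return follow >= deviate - TOL  # holds whenever x >= a / beta

def principal_prefers_paying(x):
    follow = (y - x) / (1 - beta)   # pay x every round but avoid the cost u
    deviate = (y - u) / (1 - beta)  # stop paying; the agent restarts, so pay u forever
    return follow >= deviate - TOL  # holds whenever x <= u

for x in (a / beta, (a / beta + u) / 2, u):
    print(f"x = {x:.3f}: agent IC {agent_prefers_suspension(x)}, "
          f"principal IC {principal_prefers_paying(x)}")
```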
Limited breakout periods can also be sustained in equilibrium. To see how long a breakout period can be supported as part of an agreement, suppose that the game does not end if the agent exerts effort e t = 1. Instead, a breakout period of B rounds is created. That is, the principal and agent start playing a new stage game. As before, the principal decides how much to transfer to the agent at the start of each round. The agent then decides whether to restart its program or not. If the agent does not restart, the principal and agent respectively get y − x t and x t − a for that round, and play moves on to the next round. Note that the principal's only cost is its transfer since the agent is unable to impose any costs on the principal. If the agent decides to restart its program, the principal and agent get y and −a for B rounds, after which play moves to the original stage game in Figure 4.
The principal may be tempted to renege on the agreement during the breakout period because of the agent's limited ability to impose costs on the principal. Sticking with the agreement will still be in the principal's interest as long as
$$\frac{y - x}{1 - \beta} \;\ge\; \frac{\left(1 - \beta^{B}\right)y}{1 - \beta} \;+\; \frac{\beta^{B}\left(y - u\right)}{1 - \beta}.$$
The expression on the left is the principal's payoff to continuing to transfer x to the agent in every round. The first term on the right is the principal's payoff if it stops making transfers and pays no cost during the breakout period. The second term is the principal's payoff once the breakout period ends and it begins paying u. Solving for the breakout period gives B ≤ (ln x − ln u)/ln β. The longest sustainable breakout period is thus increasing in the principal's cost u and decreasing in the agreed transfer x: the larger the transfer, the greater the temptation to renege. Since the transfer must be at least a/β, the maximum sustainable breakout period is (ln a − ln u)/ln β − 1.
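To get a feel for the magnitudes, the short sketch below (again with illustrative numbers) computes the longest sustainable breakout period implied by B ≤ (ln x − ln u)/ln β at the smallest sustainable transfer x = a/β and checks it against the principal's no-reneging condition reconstructed above.

```python
import math

# Illustrative parameter values (not from the paper).
beta = 0.95   # discount factor
a = 1.0       # agent's per-round cost of keeping its program suspended
u = 2.0       # principal's per-round cost of an active program
y = 10.0      # principal's per-round baseline payoff

x = a / beta                                        # smallest sustainable transfer
B_max = (math.log(x) - math.log(u)) / math.log(beta)
# With x = a/beta this equals (ln a - ln u)/ln beta - 1, as in the text.
assert abs(B_max - ((math.log(a) - math.log(u)) / math.log(beta) - 1)) < 1e-9

B = math.floor(B_max)                               # longest whole-round breakout period
keep = (y - x) / (1 - beta)                         # keep transferring x forever
renege = (1 - beta**B) * y / (1 - beta) \
         + beta**B * (y - u) / (1 - beta)           # pocket y for B rounds, then pay u
print(f"B_max = {B_max:.2f}, keep = {keep:.2f}, renege = {renege:.2f}")
assert keep >= renege                               # no incentive to renege at B <= B_max
```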
CONCLUSION
Recent work on counter-insurgency, client states, foreign aid, and proxy wars has adopted a principal–agent perspective to study the principal’s ability to induce an agent to exert effort on the principal’s behalf. This work has broadly emphasized three factors: the extent to which the principal’s goals and interests diverge from the agent’s; the severity of the moral hazard problem; and the limited ability of the principal and agent to make credible commitments. Credibility generally depends on implicit threats of future punishment. If one party fails to uphold its part of the agreement, the other party will “punish” it severely enough to make the long-run cost of reneging outweigh the short-run gains.
This paper shows that this enforcement mechanism tends to break down if the principal is trying to get the agent to resolve a problem that, if resolved, (i) creates or considerably exacerbates a problem for the agent that results in an ongoing cost and that (ii) simultaneously shifts the bargaining power in favor of the principal by eliminating or significantly reducing the agent’s ability to impose future costs on the principal. The level of effort the principal can induce the agent to exert is very small and the expected duration of the principal’s problem is very long when the actors are very patient. The principal nevertheless continues to prefer to work through the agent rather than deal with the problem directly. In the case of arms control, the analysis establishes an upper bound on the length of feasible breakout periods.
APPENDIX
This appendix shows that s* is a payoff-maximizing equilibrium path and generalizes Proposition 1 to non-stationary paths. To state the first claim more precisely, let $\pi = \left\{ {\left( {{x_t},{e_t}} \right)} \right\}_{t = 0}^\infty$ denote a path through the game, where x t ≥ 0 is the transfer to the agent in round t and e t is the effort the agent exerts. The path π need not be stationary.Footnote 30 Take V P(π|t) and V A(π|t) to be the principal's and agent's payoffs to following π starting in round t. These continuation payoffs satisfy the recursive relations:
[Recursive relation for V P(π|t)]
[Recursive relation for V A(π|t)]
Observation 1 implies that π is a subgame perfect equilibrium path if and only if neither the agent nor the principal can profitably deviate from π at any time when deviation immediately triggers M. That is, π must satisfy V A(π|t) ≥ x t − βd/(1 − β) and V P(π|t) ≥ (y − p)/(1 − β) for all t ≥ 0. Let Π be the set of feasible equilibrium paths and take $\bar{V}$ to be the maximum total payoff
$\bar{V} \,\equiv\, {\sup _{\pi \in \Pi }}\left\{ {V_A\left( {\pi {\rm{|}}0} \right) + V_P\left( {\pi |0} \right)} \right\}$.Footnote 31 The path π ∈ Π is surplus maximizing if $\bar{V} = V_A\left( \pi \right) + V_P\left( \pi \right)$. Then:
Proposition A1. The stationary path s* is payoff maximizing: $V\left( {{s^*}} \right) = \bar{V}$.
Proof: There are two cases to consider: (i) when the transfer constraint is slack and (ii) when it binds.
Case i: ${x^ + } \le \hat{x}$. Let Π′ be the set of all paths that satisfy the two incentive constraints but not necessarily the transfer constraint and S ⊂ Π′ be the set of stationary paths satisfying the two incentive constraints. These are paths which can be sustained by the threat to play equilibrium M as described in Observation 1. Take
$\bar{V}\prime$ to be the maximum total payoff
$\bar{V}\prime \,\equiv\, {\sup _{\pi \in \Pi \prime }}\left\{ {V_A\left( \pi \right) + V_P\left( \pi \right)} \right\}$.
We show that $V\left( {{s^ + }} \right) = \bar{V}\prime$. Because the transfer constraint is slack in this case, s + also satisfies it, which implies that s + maximizes the total payoff over the set of paths satisfying the incentive constraints and the transfer constraint. We start by characterizing e +.
Recall that the agent’s incentive constraint $e \,\le \,\bar{e}\left( x \right)$ and the principal’s incentive constraint
$e \,\ge \,{\underline{e}}\left( x \right)$ bind at s + = (x +, e +), where s + uniquely maximizes the surplus over the set of stationary feasible paths, i.e., arg maxs∈S V(s) is the singleton s +. It follows that s + satisfies equation (3) which can be written as:
[Equation (A1): restatement of equation (3) at s + = (x +, e +)]
The fact that the agent’s incentive constraint binds at s + also gives:
[Equation (A2): the agent's binding incentive constraint at s +]
where the left side of the first equality is the agent’s payoff to deviating by pocketing x + and exerting no effort and the right side is the payoff to following s +. We also have V P(s +) = (y − p)/(1 − β) because the principal’s incentive constraint binds. This leaves V A(s +) = V(s +) − (y − p)/(1 − β). Substituting this into (A2) gives
[Equation (A3)]
Combining (A1) and (A3) yields a quadratic equation with the larger root corresponding to e +. (If there are no real roots, there are no feasible stationary equilibrium paths along which the agent exerts positive effort.)
We now demonstrate that e + is an upper bound on the effort exerted along any feasible path π ∈ Π′. To establish this, suppose that e t is exerted along some $\hat{\pi } \in \Pi \prime$. Then
[Incentive inequality for the agent along $\hat{\pi }$ at round t]
where the left side is the agent’s payoff to following $\hat{\pi }$ and the right side is the agent’s payoff to absconding with x t and thereby triggering a switch to playing M.
It follows that e t is bounded above by
[Equation (A4): upper bound on e t]
The expression on the right is increasing in $V_A\left( {\hat{\pi }{\rm{|}}t + 1} \right)$. Observe further that $V_A\left( {\hat{\pi }{\rm{|}}t + 1} \right) \,\le\, \bar{V}\prime - {{\left( {y - p} \right)} / {\left( {1 - \beta } \right)}}$ because $V_A\left( {\hat{\pi }{\rm{|}}t + 1} \right) + V_P\left( {\hat{\pi }{\rm{|}}t + 1} \right) \,\le\, \bar{V}\prime$ and $V_P\left( {\hat{\pi }{\rm{|}}t + 1} \right) \,\ge \,{{\left( {y - p} \right)} / {\left( {1 - \beta } \right)}}$. Hence,
[Upper bound on e t as a function ε of $\bar{V}\prime$]
where the upper bound is defined as a function of $\bar{V}\prime$. Note that ε is increasing and bounded above by one.
Define ${\varepsilon _1} = \varepsilon \left( {\bar{V}\prime } \right)$ and let V 1 be the total payoff along the path in which the agent exerts ε 1 in every period. (This path may or may not be a feasible equilibrium path.) Using the expression for the total payoff along a stationary path,
[Expression for V 1 = v(ε 1), the total payoff along the stationary path with effort ε 1]
where v is increasing.
Now define ε 2 = ε(V 1) and V 2 to be the total payoff if the agent exerts ε 2 in every round, i.e., V 2 = v(ε 2). Since ${V_1} \ge \bar{V}\prime$, ε 2 ≥ ε 1, which in turn implies V 2 ≥ V 1. Continuing in this way, define the sequence $\left\{ {{\varepsilon _j}} \right\}_{j = 1}^\infty$, where ${\varepsilon _1} = \varepsilon \left( {\bar{V}\prime } \right)$ and ε j = ε∘v(ε j−1) for j ≥ 2.
The sequence ε j is nondecreasing and bounded above, so it converges to some ε + ≥ ε j for all j ≥ 1. The limit point ε + is also a fixed point of ε∘v and therefore satisfies
[First fixed-point equation for ε +]
where
[Second fixed-point equation for ε +]
These equations are equivalent to equations (A1) and (A3). It follows that ε + is one of the two roots of these equations and is therefore bounded above by the larger root. That is, ε + ≤ e +. In words, the level of effort exerted in every round along s + is at least as large as the level of effort exerted along any feasible path π ∈ Π′.
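The convergence argument can be mimicked numerically. In the sketch below, eps() and v() are purely hypothetical increasing functions standing in for ε and v (the paper's actual expressions depend on the equations not reproduced above); the point is only to illustrate that iterating ε∘v from ε 1 produces a nondecreasing, bounded sequence that settles at a fixed point.

```python
import math

# Hypothetical stand-ins: eps is increasing and bounded above by one,
# v is increasing. These are NOT the paper's functional forms.
def eps(V):
    return 1.0 - math.exp(-max(V, 0.0) / 10.0)

def v(e):
    return 5.0 + 20.0 * e

V_bar_prime = 8.0            # hypothetical value of the supremum
e = eps(V_bar_prime)         # epsilon_1 = eps(V_bar_prime)
for j in range(1000):
    e_next = eps(v(e))       # epsilon_{j+1} = eps(v(epsilon_j))
    if abs(e_next - e) < 1e-12:
        break
    e = e_next
print(f"converged after {j + 1} iterations to a fixed point of eps(v(.)): {e:.6f}")
```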
It follows that the total payoff along s + must be at least as large as that along π. This implies $V\left( {{s^ + }} \right) \,\ge \,{\sup _{\pi \in \Pi \prime }}\,V\left( \pi \right) = \bar{V}\prime$. But
$\bar{V}\prime$ is by definition the largest feasible total payoff which means
$\bar{V}\prime \,\ge\, V\left( {{s^ + }} \right)$ since s + ∈ S ⊂ Π′. Hence,
$V\left( {{s^ + }} \right) = \bar{V}\prime$.
Case ii: ${x^ + } > \hat{x}$. Let Π″ denote the set of paths satisfying the agent’s incentive constraint and the transfer constraint. These are the paths π such that
${x_t} \,\le\, \hat{x}$ for all t ≥ 0 and from which the agent cannot profitably deviate when deviation triggers M. Take
$\bar{V}''$ to be the maximum total payoff, i.e.,
$\bar{V}'' \,\equiv\, {\sup _{\pi \in \Pi ''}}\left\{ {V_A\left( \pi \right) + V_P\left( \pi \right)} \right\}$. To simplify matters and avoid tedious limit arguments, we assume that there is a maximizer, i.e., that there is a path μ such that
$V\left( \mu \right) = \bar{V}''$.
It suffices to show that $V\left( {\hat{s}} \right) = \bar{V}''$, where recall
$\hat{s} = \left( {\hat{x},\bar{e}\left( {\hat{x}} \right)} \right)$. To see why, observe that
$\hat{s}$ also satisfies the principal’s incentive constraints since
$\hat{e} = \bar{e}\left( {\hat{x}} \right) > {\underline{e}}\left( {\hat{x}} \right)$ when
$x^+ \,\gt\, \hat{x}$. Hence,
$\hat{s}$ maximizes the total payoff over the set of paths satisfying both actors’ incentive constraints and the transfer constraint.
The first step in establishing $V\left( {\hat{s}} \right) = \bar{V}''$ is demonstrating that V(π) is increasing in e 0. To establish this, write
[Expansion of V(π) isolating the coefficient on e 0]
V(π|t = 1) is bounded above by (y − a)/(1 − β), so the coefficient on e 0 is positive.
It follows that the agent’s incentive constraint
[Inequality (A5): the agent's incentive constraint bounding e 0]
must bind on μ. Intuitively, increasing e 0 as much as possible increases the total payoff at t = 0 and has no adverse incentive effects at any later date. That is, V(π|t) and V A(π|t) are independent of e 0 at all t ≥ 1.
Now note that all of the transfer constraints save for t = 0 must bind along μ. To see that ${x_j} = \hat{x}$ for all j ≥ 1 along μ, assume the contrary. Then
${x_j} \,\lt\, \hat{x}$ for some j. Take π′ to be identical to μ except that
$x_j \prime = \hat{x}$ and that
$e_0 \prime$ may differ from e 0 because we continue to assume that the agent’s incentive constraint binds at t = 0 along π′. This construction yields the contradiction that V(π′) > V(μ).
To show that V(π′) > V(μ), note that the upper bound on e 0 in the agent’s incentive constraint in inequality (A5) is increasing in V A(π|t = 1) which in turn is increasing in x k for k ≥ 1. This and the fact that the agent’s incentive constraint binds at t = 0 along both μ and π′ imply $e_0\prime > {e_0}$ and, consequently, V(π′) > V(μ). The path π′ also satisfies the agent’s incentive constraints at all t because increasing x j weakly relaxes all of the agent’s incentive constraints.
Now consider any potential maximizer, i.e., any path z with ${x_k} = \hat{x}$ for all k ≥ 1. The total payoff can be written as
[Expression for the total payoff V(z)]
Similarly,
[Equation (A6): corresponding expression for V A(z)]
These expressions show that maximizing V(z) subject to the agent’s incentive constraints and ${x_0} \in [0,\hat{x}]$ is equivalent to minimizing V A(z) − x 0 subject to those same constraints.
Satisfying the agent’s incentive constraint at t = 0 requires V A(z) ≥ x 0 − βd/(1 − β). It follows that if the agent’s incentive constraint binds at t = 0 on z, i.e., that the previous inequality holds with equality, then z will minimize V A(z) − x 0. This implies that z is feasible and maximizes V(π) if ${x_0} \in \left[ {0,\hat{x}} \right]$ on z, the agent’s incentive constraints along z are satisfied at t ≥ 0, and the agent’s incentive constraint binds at t = 0. The path
$\hat{s}$ satisfies these conditions and therefore maximizes V(π). □
Extension of Proposition 1 to non-stationary paths. Observation 1 applies to any path, not just stationary paths. That is, a path $\pi = \left\{ {\left( {{x_t},{e_t}} \right)} \right\}_{t = 0}^\infty$ is an equilibrium path if and only if neither the principal nor the agent can profitably deviate at any time when deviation immediately triggers a switch to playing equilibrium M. Now let
$\left\{ {\pi \left( {{\beta _m}} \right)} \right\}_{m = 0}^\infty$ be a sequence of equilibrium paths along which the discount factor goes to one, i.e.,
$\mathop {\lim }\nolimits_{m \to \infty } {\beta _m} = 1$. These paths may or may not be stationary. An example of such a sequence is
$\left\{ {\hat{s}\left( {{\beta _m}} \right)} \right\}_{m = 0}^\infty$, where
$\hat{s}\left( {{\beta _m}} \right) = \left( {\hat{x},\hat{e}\left( {{\beta _m}} \right)} \right)$ and
$\hat{e}\left( {{\beta _m}} \right) = {{{\beta _m}\left( {\hat{x} + d} \right)} / {\left[ {{\beta _m}\left( {1 - {\beta _m}} \right)\left( {\hat{x} + d} \right) + a - \beta_m d} \right]}}$.
In the case of non-stationary paths, the effort at a given time need not go to zero as the actors become very patient. For example, the effort at the outset, e 0(β m), may not go to zero. But inducing the agent to exert higher effort at t = 0 requires that the agent's continuation value if the problem is not resolved, namely, V A(π(β m)|t = 1), be higher. This continuation value is decreasing in e t for t ≥ 1, so higher effort at t = 0 means lower effort in the future. As a result, even though e t(β m) may not go to zero at all times along a sequence of non-stationary paths, the expected duration of the principal's problem becomes unboundedly long as the actors become very patient. The toy calculation below illustrates the trade-off; Proposition A2 then formalizes it.
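The function below computes the expected waiting time for a sequence of per-round resolution probabilities, which is the natural reading of the duration D(π) discussed in the text (the paper's exact expression is not reproduced here); the effort profiles are purely illustrative. Front-loading effort at t = 0 does little to keep the expected duration short when effort after t = 0 is small.

```python
# Toy calculation: expected duration for front-loaded effort profiles.
# The waiting-time formula is the standard one for per-round resolution
# probabilities; parameter values are illustrative only.
def expected_duration(effort_at, horizon=200_000):
    surviving, total = 1.0, 0.0
    for t in range(horizon):
        total += surviving            # problem still unresolved entering round t
        surviving *= 1.0 - effort_at(t)
    return total

for e_future in (0.10, 0.01, 0.001):
    D = expected_duration(lambda t, ef=e_future: 0.5 if t == 0 else ef)
    print(f"e_0 = 0.5, e_t = {e_future} for t >= 1  ->  expected duration of {D:.1f} rounds")
```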
Proposition A2. Let $\left\{ {\pi \left( {{\beta _m}} \right)} \right\}_{m = 0}^\infty$ be any sequence of equilibrium paths along which the discount factor goes to one. Then
$\mathop {\lim }\nolimits_{m \to \infty } D\left( {\pi \left( {{\beta _m}} \right)} \right) = \infty$. If
$\left\{ {\pi \left( {{\beta _m}} \right)} \right\}_{m = 0}^\infty$ is a sequence of payoff-maximizing equilibrium paths, then the principal strictly prefers working through the agent along the sequence: V P(π(β m)) > (y − p)/(1 − β m) for all m.
Proof: Observe first that we can write the expected duration of the principal’s problem along any path as
[Expression for the expected duration D(π) of the principal's problem]
In light of equation (A6), the agent’s payoff to path $\pi \left( {{\beta _m}} \right) = \left\{ {{x_t}\left( {{\beta _m}} \right),{e_t}\left( {{\beta _m}} \right)} \right\}_{t = 0}^\infty$ can be bounded as follows:
[Bound on the agent's payoff along π(β m)]
Noting that $\sum\limits_{j = 0}^\infty {\beta _m^j} \prod\limits_{k = 0}^j {\left( {1 - {e_k}\left( {{\beta _m}} \right)} \right)} \lt \sum\limits_{j = 0}^\infty {\prod\limits_{k = 0}^j {\left( {1 - {e_k}\left( {{\beta _m}} \right)} \right)} }$ gives
[Resulting bound on the agent's payoff involving D(π(β m))]
This leaves
[Inequality (A7)]
Arguing by contradiction, assume that $\mathop {\lim }\nolimits_{m \to \infty } D\left( {\pi \left( {{\beta _m}} \right)} \right)$ does not become unboundedly large. Then the right side of inequality (A7) goes to negative infinity. This contradiction establishes the claim
$\mathop {\lim }\nolimits_{m \to \infty } D\left( {\pi \left( {{\beta _m}} \right)} \right) = \infty$.
Now assume that $\left\{ {\pi \left( {{\beta _m}} \right)} \right\}_{m = 0}^\infty$ is a sequence of payoff-maximizing equilibrium paths. By definition, V P(π(β m)) = V(π(β m)) − V A(π(β m)). We also have
$V\left( {\pi \left( {{\beta _m}} \right)} \right) = V\left( {\hat{s}\left( {{\beta _m}} \right)} \right)$ by Proposition A1 which yields
${V\!_{\it{P}}}\left( {\pi \left( {{\beta _m}} \right)} \right) = V\left( {\hat{s}\left( {{\beta _m}} \right)} \right) - {V\!_{\it{A}}}\left( {\pi \left( {{\beta _m}} \right)} \right)$. The proof of Proposition A1 also shows that the agent’s incentive constraint must bind at t = 0 which implies V A(π(β m)) = x 0(β m) − β md/(1 − β m). Using the fact that
${x_0}\left( {{\beta _m}} \right) \,\le\, \hat{x}$ gives
${V\!_{\it{P}}}\left( {\pi \left( {{\beta _m}} \right)} \right) \ge V\left( {\hat{s}\left( {{\beta _m}} \right)} \right) - \hat{x} + {{{\beta _m}d} / {\left( {1 - {\beta _m}} \right)}}$. But
${V\!_{\it{A}}}\left( {\hat{s}\left( {{\beta _m}} \right)} \right) = \hat{x} - {{{\beta _m}d} / {\left( {1 - {\beta _m}} \right)}}$, so
${V\!_{\it{P}}}\left( {\pi \left( {{\beta _m}} \right)} \right) \,\ge \,V\left( {\hat{s}\left( {{\beta _m}} \right)} \right) - {V\!_{\it{A}}}\left( {\hat{s}\left( {{\beta _m}} \right)} \right) = {V\!_{\it{P}}}\left( {\hat{s}\left( {{\beta _m}} \right)} \right)$. Finally,
$\hat{e}\left( {{\beta _m}} \right) > {\underline{e}}\left( {\hat{x},{\beta _m}} \right)$ when
$\hat{x} \,\lt \,{x^ + }$, and
$\hat{e}\left( {{\beta _m}} \right) > {\underline{e}}\left( {\hat{x},{\beta _m}} \right)$ means that the principal’s incentive constraint is slack at
$\hat{s}\left( {{\beta _m}} \right)$ (see Figure 2). A slack incentive constraint ensures
${V\!_{\it{P}}}\left( {\pi \left( {{\beta _m}} \right)} \right) \,\ge\, {V\!_{\it{P}}}\left( {\hat{s}\left( {{\beta _m}} \right)} \right) > {{\left( {y - p} \right)} / {\left( {1 - {\beta _m}} \right)}}$. In words, the principal strictly prefers to work through the agent. □