1. INTRODUCTION AND PROBLEM FORMULATION
We consider a controlled finite-capacity Markovian queue with m types of customer, where m = 1,2,… . Type i customers arrive according to a Poisson process with the intensity λi, i = 1,…, m, and these m arrival Poisson processes are independent. When a customer arrives, its type becomes known. When there are N customers in the system, the system is full and new arrivals are lost. If the system is not full, upon an arrival of a new customer, a decision to accept or reject this customer is made. A positive reward ri is collected for serving an accepted type i customer. A nonnegative cost vector Ci = (C1,i,C2,i,…,CK,i)τ is incurred due to the rejection or loss of an arriving type i customer, where K is the number of constraints in this problem. We assume that the service time of a customer does not depend on the customer type. When there are n customers in the queue, the departure rate is μn, n = 1,…, N. The numbers μn, n = 1,…, N, satisfy the condition μn−1 ≤ μn, where μ0 = 0 and μ1 > 0. For example, for an M/M/c/N queue, for some μ > 0,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xffm001.gif?pub-status=live)
Note that, unless otherwise specified, we do not assume that r1 ≥ r2 ≥ ··· ≥ rm.
Our goal is to maximize the average rewards per unit time, subject to multiple constraints on the average costs per unit time. This research is motivated by the following question: What is the structure of optimal policies for the problem when the blocking probabilities for some of the customer types are not allowed to exceed given numbers? The answer to this question is given in Corollary 2.4. Additionally, Corollary 2.5 provides an answer to a more general problem when different types of customer can be pooled into groups with common blocking probability constraints on each group. Previously, Fan-Orzechowski and Feinberg [3] solved such a problem with a single constraint on the blocking probability of one type of customer or one group. Feinberg and Reiman [7] solved a more particular problem when all of the rewards are different and the constraint on the blocking probability is applied to the most profitable type of customer.
Consider [Kscr ] = 0,1,…, m − 1. A [Kscr ]-randomized trunk reservation policy φ is defined by m numbers Miφ, 0 ≤ Miφ ≤ N − 1, i = 1,…, m, called the thresholds. Of these thresholds, at most [Kscr ] numbers are noninteger and at least one number equals N − 1. For a number M we denote the integer part of M by [lfloor ]M[rfloor ]. If the system is controlled by the policy φ, a type i arrival will be admitted with probability 1 if it sees no more than [lfloor ]Miφ[rfloor ] customers in the system, it will be rejected if the number of customers it sees in the system exceeds [lfloor ]Miφ[rfloor ] + 1, and it will be accepted with the probability (Miφ − [lfloor ]Miφ[rfloor ]) and rejected with the probability 1 − (Miφ − [lfloor ]Miφ[rfloor ]) if it sees exactly [lfloor ]Miφ[rfloor ] + 1 customers in the system at the arrival epoch. In particular, if the number Miφ is an integer, a type i arrival will be admitted if and only if it sees no more that Miφ customers in the system. Thus, Miφ = N − 1 means that a type i arrival is admitted whenever the system is not full. A randomized trunk reservation policy φ is called consistent with a function r′ defined on the set {1,…, m} if ri′ > rj′ implies Miφ ≥ Mjφ for i,j = 1,…, m. If all of the thresholds are integers, the randomized trunk reservation policy is called a trunk reservation policy. We sometimes write Mi instead of Miφ for the thresholds when there is only one policy in the context and no confusion will occur.
In this article we allow the possibility that the number of constraints K is not necessarily less than m and we introduce Km = min{K,m − 1}. We prove that if the problem is feasible, there exists a Km-randomized trunk reservation policy consistent with the reward function
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xfrm001.gif?pub-status=live)
where uk ≥ 0 is the Lagrange multiplier with respect to the kth constraint of the linear programming problem formulated in this article. Additionally, Theorem 2.2 shows that any Km-randomized stationary optimal policy is a Km-randomized trunk reservation policy consistent with r′.
In Feinberg and Reiman [7, Sects.6 and 7], several other types of optimal policy and optimal nonrandomized strategy were constructed by transforming an optimal randomized trunk reservation policy. Similar results can be obtained for the more general problem considered in this article. In fact, these constructions hold as long as the optimality of randomized trunk reservation policies is established.
Miller [13] studied a one-criterion problem for an M/M/c/loss queue when r1 > r2 > ··· > rm. In this case, there exists an optimal nonrandomized trunk reservation policy consistent with r. In other words, all of the thresholds Mi are integers and N − 1 = M1 ≥ M2 ≥ ··· ≥ Mm. Feinberg and Reiman [7] studied a constrained problem with r1 > r2 > ··· > rm, where the goal was to maximize average rewards per unit time subject to the constraint that the blocking probability for type 1 customers does not exceed a given level. Feinberg and Reiman [7] proved the existence of an optimal 1-randomized trunk reservation policy with N − 1 = M1 ≥ M2 ≥ ··· ≥ Mm. Fan-Orzechowski and Feinberg [3] considered a problem with a single constraint on the average costs. The goal was to maximize the average rewards per unit time subject to this constraint. They proved the existence of a 1-randomized trunk reservation policy consistent with the reward function
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xffm005.gif?pub-status=live)
where u1 ≥ 0 is the Lagrange multiplier with respect to the first constraint of the linear programming problem formulated in that article. In particular, Fan-Orzechowski and Feinberg [3] solved the problem with one constraint on the blocking probability for type k customers, k = 1,…, m.
In addition to Miller's [13] classical problem formulation, various versions and generalizations of the admission problem have been studied in the literature. See the references found in Fan-Orzechowski and Feinberg [3]. Recent research in this area includes Lewis, Ayhan, and Foley [9,10], Lewis [8], Lin and Ross [11,12], Altman, Jimenez, and Koole [1], and Altman [2]. If the service times depend on customer types or different types of customer require different numbers of servers, the problem becomes NP-hard and trunk reservation might not be optimal; see Ross [14, p.137] and Altman et al. [1].
The rest of this article is organized as follows. Section 2 contains the problem formulation, introduces a linear program (LP) that identifies an optimal policy, and states the main result—the optimality of randomized trunk reservation policies and its corollaries. Section 3 introduces the Lagrangian relaxation, which reduces the number of constraints, and provides the proofs.
2. SEMI-MARKOV DECISION MODEL AND MAIN RESULTS
Following Feinberg and Reiman [7] and Fan-Orzechowski and Feinberg [3], we model the problem via a semi-Markov decision process (SMDP). Since the sojourn times between actions are exponentially distributed, this problem is an exponential semi-Markov decision process (ESMDP); see Feinberg [6] for details. Notice that this problem can also be formulated as a continuous-time Markov decision process (CMDP). The extra technical difficulty in using CMDP is to prove that the controlled process has no absorbing states; see Feinberg [5]. We can then reach the same preliminary result as by using the SMDP model in this article; when the problem is feasible, there exists a randomized stationary optimal policy that uses a randomization procedure in at most Km states.
In the framework of an SMDP model, we define the state space I = {0,1,…,N − 1} ∪ ({0,1,…, N} × {1,…, m}), which represents the system state at the departure and arrival epochs. If the state of the system is n = 0,…, N − 1, this means that a departing customer leaves n customers in the system. The state (n,i) means that an arrival of type i sees n customers in the system.
The action set A = {0,1}. For n = 0,…, N − 1 and i = 1,…, m, we set A(n,i) = A = {0,1} and A(N,i) = {0}, where the action 0 means that the type i arrival should be rejected or is lost and the action 1 means that it should be accepted. In any state n = 0…,N − 1, we set A(n) = {0}.
Let τ(s,a) denote the average time that the system spends in a state s ∈ I if action a ∈ A(s) is chosen in this state. Let p(s,s′,a) be the transition probability from the state s to s′ if action a ∈ A(s) is chosen. Note that p(s,s′,a) does not depend on the sojourn time in state s under action a and this sojourn time has an exponential distribution with the average τ(s,a) . For notational convenience, we write τ(n) and p(n,s′) instead of τ(n,0) and p(n,s′,0), respectively, for n = 0,…, N − 1, s′ ∈ I. Denote
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xffm006.gif?pub-status=live)
.
We have τ(n) = (μn + Λ)−1, where n = 0,…, N − 1. Also, for i = 1,…, m,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xffm007.gif?pub-status=live)
For n = 0,…, N − 1 and i = 1,…, m,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xffm008.gif?pub-status=live)
and
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xffm009.gif?pub-status=live)
Suppose that the reward is accrued upon acceptance of a customer and costs are incurred upon rejection. The reward function is
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xffm010.gif?pub-status=live)
and for k = 1,…, K, the kth cost function is
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xffm011.gif?pub-status=live)
In addition, r(n,0) = ck(n,0) = 0 for all n = 0,…, N − 1 and for all k = 1,…, K.
We define the long-run average rewards earned by the system under strategy π as
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xffm012.gif?pub-status=live)
and the long-run average costs of the system under strategy π as
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xffm013.gif?pub-status=live)
where z is the initial state, xn is the state at epoch tn,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xffm014.gif?pub-status=live)
is the expectation operator for the initial state z and strategy π, and N(t) = max{n : tn ≤ t} is the number of jumps by time t.
A strategy is called a randomized stationary policy if assigned actions an depend only on the current state xn. Additionally, if an is a deterministic function of xn, the corresponding strategy is called a stationary policy. Since the action sets are nonsingletons only at the arrival epochs, a randomized stationary policy φ for our problem is defined by φ(n,i), n = 0,…, N − 1, i = 1,…, m, the probability of accepting an arrival of type i when the arrival sees n customers in the system. A randomized stationary policy φ is called k-randomized stationary, k = 0,1,…, if the number of states (n,i) such that 0 < φ(n,i) < 1 is less than or equal to k. The notions of stationary and 0-randomized stationary policies coincide.
Notice that the unichain condition (any stationary policy defines a Markov chain with one recurrent class) holds for this model. If an SMDP satisfies the unichain condition, then, according to Feinberg [4, Thm.9.2], for a feasible problem under the average rewards per unit time criterion with K constraints, there exists an optimal K-randomized stationary policy. Therefore, similar to Fan-Orzechowski and Feinberg [3], we model our problem as follows:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xfrm002.gif?pub-status=live)
where a randomized stationary policy φ is the variable and Gk are real numbers.
Consider the following LP with variables (x,P), where x = {x(n,i) : n = 0,…, N − 1,i = 1,…, m} and P = (P0,…, PN).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xfrm003.gif?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xfrm004.gif?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xfrm005.gif?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xfrm006.gif?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xfrm007.gif?pub-status=live)
The variables x and P have the following meaning: x(n,i) is the joint probability that a type i arrival sees n customers in the system and this arrival is accepted; Pn is the limiting probability that there are n customers in the system. For the queue controlled by any randomized stationary policy, its associated quantities (x,P) satisfy
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xffm015.gif?pub-status=live)
, since Poisson arrivals see time averages (PASTA).
Constraints (2.4) and (2.5) correspond to the birth-and-death balance equations and constraints (2.3) correspond to K cost constraints. Constraints (2.5) and (2.6) imply that the set of feasible solutions of the LP (2.2)–(2.6) is bounded and, therefore, if a feasible solution exists, the LP has an optimal solution.
For a vector (x,P) satisfying (2.4)–(2.6), consider a randomized stationary policy φ such that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xfrm008.gif?pub-status=live)
The following theorem links problem (2.1) and LP (2.2)–(2.6). We recall that Km = min{K,m − 1}.
Theorem 2.1:
(i) A randomized stationary policy φ is feasible for problem (2.1) if and only if (2.7) holds for a feasible vector (x,P) of LP (2.2)–(2.6).
(ii) If (x,P) is an optimal solution of LP (2.2)–(2.6), then Pn > 0 for all n = 0,1,…, N.
(iii) A randomized stationary policy φ is optimal for problem (2.1) if and only if the equation
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xfrm009.gif?pub-status=live)
holds for an optimal solution (x,P) of LP (2.2)–(2.6). Additionally, if (x,P) is a basic optimal solution of LP (2.2)–(2.6), then the policy φ defined by (2.8) is Km-randomized stationary optimal.
The proof of Theorem 2.1 is presented in Section 3. In particular, Theorem 2.1(iii) implies that if problem (2.1) is feasible, then there exists an optimal Km-randomized stationary policy. Let (u,v), where u = (u1,…,uK,uK+1,…,uK+2Nm) and v = (v0,…,vN), be an optimal solution of the LP dual to (2.2)–(2.6). The dual vector uK = (u1,…,uK) corresponds to constraints (2.3), the dual vector (uK+1,…,uK+2Nm) corresponds to the inequality constraints (2.6), the dual vector (v0,…,vN−1) corresponds to constraints (2.4), and the dual variable vN corresponds to the constraint (2.5). We call u and v the Lagrange multipliers for LP (2.2)–(2.6) and note that uk ≥ 0, k = 1,…, K; see [3, Appendix B] for additional details.
The following theorem, which is the main result of this work, implies the existence of optimal Km-randomized trunk reservation policies stated in its Corollary 2.3.
Theorem 2.2: Any Km-randomized stationary optimal policy for problem (2.1) is a Km-randomized trunk reservation policy, which is consistent with the reward function ri′ = ri + uKCi, i = 1,…, m, where uK = (u1,u2,…,uK) ≥ 0 is a vector of Lagrange multipliers with respect to constraints (2.3) in LP (2.2)–(2.6).
Corollary 2.3: If problem (2.1) is feasible, then there exists an optimal Km-randomized trunk reservation policy consistent with the reward function r′ defined in Theorem 2.2.
To conclude this section, consider two important corollaries of Theorem 2.2. Corollary 2.4 deals with the situation when there are constraints on the blocking probabilities of certain customer types. Corollary 2.5 deals with the more general situation when there are blocking probability constraints on several customer types pooled together.
According to [7, p.471], for the costs vector Ci, i = 1,…, m, with the coordinates Ck,i, k = 1,…, K,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xfrm010.gif?pub-status=live)
the average cost Wk(z,π) is the blocking probability for type k customers.
Corollary 2.4: Consider a special case of problem (2.1) when there is a fixed subset J = {i1,…, iK} of customer types and the goal is to maximize the average rewards per unit time subject to K constraints on the blocking probabilities for each of these customer types. For this problem, introduce the costs Ck,i defined by (2.9) when i = ik and Ck,i = 0 when i ∉ J, k = 1,…, K, i = 1,…, m. If this problem is feasible, then any Km-randomized stationary optimal policy is Km-randomized trunk reservation policy consistent with the reward function r′ defined as
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xffm016.gif?pub-status=live)
where uK = (u1,u2,…,uK) ≥ 0 is the vector of Lagrange multipliers with respect to constraints (2.3) of LP (2.2)–(2.6). Therefore, there exists a Km-randomized trunk reservation optimal policy consistent with the reward r′ if the problem is feasible.
Consider a more general problem when there are K subsets of the customer types J1,…, JK and the goal is to maximize the average rewards per unit time subject to the constraints on the blocking probabilities for the customer types from the class Jk pooled together, k = 1,…, K. For K = 1, this problem was solved in [3, Cor. 2.5]. When each subset Jk, k = 1,…, K is a singleton, the solution is described in Corollary 2.4.
Let
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xffm017.gif?pub-status=live)
. Consider the cost functions Ck, k = 1,…, K, defined as
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xfrm011.gif?pub-status=live)
Corollary 2.5: Consider a special case of problem (2.1) when there are K fixed groups J1,…, JK of customer types and the goal is to maximize the average rewards per unit time subject to K constraints on the blocking probabilities for the customers from each of these customer groups pooled together within a group. If this problem is feasible, then any Km-randomized stationary optimal policy is a Km-randomized trunk reservation policy consistent with the rewards
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xffm018.gif?pub-status=live)
, where i = 1,…, m, and uK = (u1,u2,…,uK) ≥ 0 is the vector of Lagrange multipliers with respect to constraints (2.3) of LP (2.2)–(2.6) with the costs CK,i defined in (2.10). Therefore, there exists a Km-randomized trunk reservation optimal policy consistent with the reward r′ if the problem is feasible.
3. PROOFS
In view of (2.5) and (2.6), the feasible region of LP (2.2)–(2.6) is bounded. Therefore, this LP has an optimal solution if it is feasible. If LP (2.2)–(2.6) is feasible, consider the vector of Lagrange multipliers uK = (u1,…,uK) with respect to constraints (2.3) in LP (2.2)–(2.6). If LP (2.2)–(2.6) is feasible, we introduce the following LP:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xfrm012.gif?pub-status=live)
Lemma B.1 in [3] implies the following result.
Lemma 3.1: If LP (2.2)–(2.6) is feasible, then (i) any optimal solution of LP (2.2)–(2.6) is an optimal solution of LP (3.1) and (ii) the optimal values of objective functions for these two LPs are equal.
This lemma plays an important role in the proof of the main result of this article: Theorem 2.2. We refer to this technique as the Lagrangian relaxation in this article. In general, the Lagrangian relaxation refers to using a weak duality theorem to obtain lower bounds for a nonlinear programming problem. We use the term “relaxation” here in the sense that, although the optimal set is enlarged after the transformation, the optimal value remains the same.
Consider the unconstrained problem
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xfrm013.gif?pub-status=live)
Similar to [3], we are interested in this problem when the rewards for served customers are equal to ri′, i = 1,…, m, where the function r′ was defined in Theorem 2.2. Since the second term in the objection function of LP (3.1) is constant, in view of Theorem 2.1 an optimal solution (x,P) of LP (3.1) defines via (2.8) an optimal randomized stationary policy φ for problem (3.2) with the reward functions r′.
To prove Theorem 2.2, we first formulate certain generic properties of optimal solutions of problem (3.2) with the rewards r and then apply them to the specific case of the rewards r′. Note that we do not assume distinct rewards among different customer types. This extension might not seem significant at first glance, but it is important. In fact, even if we assume that all of the rewards ri are distinct, after the Lagrangian relaxation it is possible that ri′ = rj′ for some i,j = 1,…, m in a new unconstrained problem.
Lemma 3.2 [3, Cor. 2.1]:
(i) If (x,P) is an optimal solution of LP (2.2), (2.4)–(2.6), then Pn > 0 for all n = 0,1,…, N.
(ii) A randomized stationary policy φ is optimal for problem (3.2) if and only if (2.8) holds for an optimal solution (x,P) of LP (2.2), (2.4)–(2.6). In addition, if (x,P) is a basic optimal solution of LP (2.2), (2.4)–(2.6), then the policy φ defined in (2.8) is (nonrandomized) stationary optimal.
The following lemma is used in the proof of Theorems 2.1 and 2.2.
Lemma 3.3 [3, Lemma 3.3]: Consider any randomized stationary optimal policy φ for the unconstrained problem (3.2).
(i)For any i and j, such that ri > rj,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xfrm014.gif?pub-status=live)
(ii) For each n = 0,…, N − 1, if there exist two customer types j1 and j2 such that 0 < φ(n,j) < 1, j = j1,j2, then rj1 = rj2. In particular, if all of the rewards r1,…, rm are different, then for each n = 0,…, N − 1, all of the probabilities φ(n,j), j = 1,…, m, except at most one, are equal to either zero or one.
(iii) There exists at least one customer type, say type [ell ], such that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xfrm015.gif?pub-status=live)
In particular, if rj = max{ri|i = 1,…, m}, then (3.4) holds with [ell ] = j.
(iv)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xfrm016.gif?pub-status=live)
and for each j = 1,…, m, all of the probabilities φ(n,j), n = 0,…, N − 1, except at most one, are equal to either zero or one.
Corollary 3.4: Any randomized stationary optimal policy φ for the unconstrained problem (3.2) is an (m − 1)-randomized trunk reservation policy consistent with the rewards ri. Additionally, if all of the rewards ri are distinct, such a policy is s-randomized, where s = min{m − 1,N}.
Corollary 3.5: Any stationary optimal policy φ for the unconstrained problem (3.2) is a trunk reservation policy consistent with the rewards ri.
Proof of Theorem 2.1: The proof of (i) and (ii) and the first statement of (iii) are identical to the proof of Theorem 2.1 in [3]. We will prove the second statement of (iii). First, consider the case when K < m. We represent LP (2.2)–(2.6) in a standard LP form, where nonnegative variables Sk, k = 1,…, K, are introduced to replace (2.3) with
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xffm019.gif?pub-status=live)
and nonnegative variables y(n,i), n = 0,…, N − 1, i = 1,…, m, are introduced to replace (2.6) with x(n,i) + y(n,i) = Pn. There are K + N + 1 + N × m constraints and K + 1 + N + 2(N × m) variables for this new LP. Therefore, any basic optimal solution of this new LP has at most K + N + 1 + N × m basic variables. Since Pn, n = 0,…, N, are positive, there are at most K + N × m basic variables among x(n,i) and y(n,i). Because x(n,i) + y(n,i) = Pn > 0, x(n,i) and y(n,i) cannot be equal to zero simultaneously. Therefore, for each pair (n,i), either x(n,i) = 0 or y(n,i) = 0, except at most K pairs where both x(n,i) and y(n,i) are not equal to zero. Since φ(n,i) = x(n,i)/Pn, we have that for all pairs (n,i), except at most K, φ(n,i) equals either zero or one. Therefore, the policy φ is K-randomized stationary optimal. For K ≥ m, we note that the matrix C = (Ck,i) is K × m for constraints in (2.3) and, thus, its rank is at most m. After removing redundant constraints in (2.3), if there are l constraints left in (2.3) and l < m, we are back to the previous case and the proof is complete. If l = m, the only extra piece we need to add is to show that the policy φ is (m − 1)-randomized. Lemmas 3.1 and 3.2 imply that φ is optimal for an unconstrained problem. The rest follows from Lemma 3.3(iii). █
We remark that, in the case of K ≥ m, it is also intuitive that the randomized trunk reservation policy actually has at most m − 1 randomized entries. Indeed, for the most profitable customer type, who has the highest reward r′, the system will always accept it when it is not full. Otherwise the system idles and waits to serve the next potentially less profitable customer. Thus, such a policy is suboptimal. This is also consistent with the definition of a [Kscr ]-randomized trunk reservation, in which at least one threshold equals N − 1.
Proof of Theorem 2.2: Consider any Km-randomized stationary optimal policy φ for problem (2.1). Lemma 3.2(ii), Theorem 2.1(iii), and Lemma 3.1 imply that φ is optimal for an unconstrained problem with the rewards
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xffm021.gif?pub-status=live)
. Lemma 3.3 implies that φ is a randomized trunk reservation policy consistent with
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170224081508607-0341:S026996480707012X:S026996480707012Xffm022.gif?pub-status=live)
. █
Acknowledgment
This research was partially supported by NSF grants DMI-0300121 and DMI-0600538.