
OPTIMAL ROUTING IN OUTPUT-QUEUED FLEXIBLE SERVER SYSTEMS

Published online by Cambridge University Press:  23 March 2005

Alexander L. Stolyar
Affiliation:
Bell Labs, Lucent Technologies, Murray Hill, New Jersey 07974, stolyar@research.bell-labs.com

Abstract

We consider a queuing system with multitype customers and nonhomogeneous flexible servers, in the heavy traffic asymptotic regime and under a complete resource pooling (CRP) condition. For the input-queued (IQ) version of such a system (with customers being queued at the system “entrance,” one queue for each type), it was shown in the work of Mandelbaum and Stolyar that a simple parsimonious Gcμ scheduling rule is optimal in that it asymptotically minimizes the system customer workload and some strictly convex queuing costs. In this article, we consider a different—output-queued (OQ)—version of the model, where each arriving customer must be assigned to one of the servers immediately upon arrival. (This constraint can be interpreted as immediate routing of each customer to one of the “output queues,” one queue for each server.) Consequently, the space of controls allowed for an OQ system is a subset of that for the corresponding IQ system.

We introduce the MinDrift routing rule for OQ systems (which is as simple and parsimonious as Gcμ) and show that this rule, in conjunction with arbitrary work-conserving disciplines at the servers, has asymptotic optimality properties analogous to those the Gcμ rule has for IQ systems. A key element of the analysis is the notion of system server workload, which, in particular, majorizes the customer workload. We show that (1) the MinDrift rule asymptotically minimizes the server workload process among all OQ-system disciplines and (2) this minimal process matches the minimal possible customer workload process in the corresponding IQ system. As a corollary, MinDrift asymptotically minimizes customer workload among all disciplines in either the OQ or IQ system.

Type
Research Article
Copyright
© 2005 Cambridge University Press

1. INTRODUCTION

1.1. The Problem

We consider a queuing system with multiple customer flows (types) i = 1,…,I and nonhomogeneous flexible servers j = 1,…,J. This means that the mean service time μij−1 of a type i customer by server j depends on both the customer type and the server. We study the “heavy traffic” asymptotic regime, when the system is close to being “critically loaded,” and assume that a certain complete resource pooling (CRP) condition holds. Associated with the CRP condition is the notion of system workload, which in this article is called system customer workload.

The “input-queued” (IQ) version of the model (see [3,8,9,12,20]) is such that arriving customers are placed in “input” queues, one queue for each type i, where they await service without being preassigned to any particular server until they are actually “taken for service” by one of them. It is shown in [12] that an IQ system can be asymptotically optimally controlled in heavy traffic (and under the CRP condition) by a very simple and parsimonious generalized cμ (Gcμ) scheduling rule, which, in particular, minimizes customer workload among virtually all service disciplines.

In this article, we consider a different—“output-queued” (OQ)—version of the model, where each arriving customer must be assigned (or routed) to one of the servers immediately upon arrival. (This can be viewed as immediate routing of each arriving customer to one of the “output” queues, one for each server.) Such models arise in various applications, including wireless networks, manufacturing systems, and call centers. A wireless application example is a system in which data packets (“customers”) need to be delivered to multiple mobile users/destinations (which determine “customer types”) via a set of transmitters (“servers”); transmission (“service”) rates depend on the (different) channel qualities between different transmitters and users.

Due to the above immediate routing (IR) constraint, the space of controls allowed for an OQ system is a (strict) subset of that for the corresponding IQ system. (Gcμ is not a valid discipline for OQ systems.) A natural question is: “Is it possible to control an OQ system in heavy traffic as efficiently as the corresponding IQ system could be controlled? For example, are there OQ-system controls that are as parsimonious as Gcμ but still able to minimize the system customer workload?” The results of this article demonstrate that the answer to the latter question is “yes”—we introduce the MinDrift routing rule, which, in particular, minimizes the system customer workload in heavy traffic. We will describe our results shortly, after a brief literature review.

1.2. Previous Work

The IQ version of our system (with nonhomogeneous flexible servers) and the definition of the heavy traffic asymptotic regime for it (in terms of a certain linear program) were introduced in [8] (for a special two-server system) and in [9,20] (for the general setting). For the system in heavy traffic, these articles also define the CRP condition and the associated notion of customer workload, defined as X(t) = Σi νi*Qi(t), t ≥ 0, where Qi(t) is the type i queue length at time t and νi* > 0 is a fixed constant called the workload contribution of a type i customer. References [8,9] propose discrete-review scheduling policies, which, in heavy traffic and under the CRP condition, asymptotically minimize customer workload and linear holding costs. In [3,20], continuous-review threshold policies are proposed that are asymptotically optimal in the same sense, also under the CRP condition. (The asymptotic optimality proofs are given for a special two-server system.) The common feature of discrete-review and continuous-review threshold policies is that they require a priori knowledge of the flows' mean arrival rates λi.

In [12], it is proved that a very simple Gcμ scheduling rule (which, in particular, does not require knowledge of the arrival rates λi) asymptotically minimizes customer workload and strictly convex holding costs in a general IQ system under the CRP condition. Moreover, in the limit, the (appropriately rescaled) queue-length vector process (Q1(t),…,QI(t)) exhibits state space collapse (SSC): it “lives” on a one-dimensional manifold. (The results of [12] are closely related to earlier results in [14] for a discrete-time generalized switch model. Also, they generalize the earlier Gcμ optimality results for a single-server system [17].) Following [14], [12] provides an equivalent (geometric) characterization of the CRP condition, as follows. Let the system service rate region be, roughly, the set of all vectors representing feasible long-term average service rates that the system is capable of jointly providing to the different types. Then the vector ν* = (ν1*,…,νI*) of workload contributions is the unique (up to scaling) outer normal vector to the boundary of this region at the point λ = (λ1,…,λI).

Most of the previous work on OQ systems is concentrated on load balancing schemes for systems with homogeneous servers. Much less work has been done on a heavy traffic regime in systems with nonhomogeneous flexible servers. Probably the first was [10], where a two-server system is considered, resource pooling in heavy traffic is discussed, and threshold-based policies are proposed. (See also [11] for an earlier discussion of resource pooling in systems with routing.) In a recent article [16] a two-server system (different from that in [10]) with exponential service times is considered, and the asymptotic optimality of a threshold routing policy is proved, under linear holding costs. We refer the reader to [16] for a more extensive review of the previous work on OQ systems.

1.3. Our Results

In this article, we consider a general OQ system in a heavy traffic regime and under the CRP condition. First, we give a further equivalent characterization of the CRP condition, which is natural and convenient for the analysis of OQ systems; namely, we consider the server utilization region, which is the set of potential server utilization vectors that can be imposed by the input flows with mean rates λi or greater. The CRP condition then implies, in particular, the uniqueness (up to scaling) of the vector normal to this region at the boundary point (1,…,1). We call the components αj* > 0 of the vector α* = (α1*,…,αJ*), which is the opposite of the above-mentioned normal vector, the server workload contributions of the different servers, and we call the quantity uX(t) = Σj αj*Uj(t), where Uj(t) is the unfinished work of server j at time t, the system server workload. We establish relations between customer and server workload contributions, which show that the asymptotic relation X(t) ≤ uX(t) holds between customer workload and server workload.

We assume that a strictly convex increasing function Cj(·) is defined for each server j, which is interpreted as the cost rate incurred by the unfinished work on server j.

We introduce two versions of the MinDrift routing rule. MinDrift(U) assigns an arriving type i customer to a server

j ∈ arg mins∈J Cs′(Us(t))/μis.     (1)

(This version may not be practical in many cases, because it assumes exact knowledge of the unfinished work values Uj(t).) The MinDrift(Q) rule is the same as MinDrift(U), except Uj(t) in (1) is replaced by the Q-estimated unfinished work qUj(t) = Σm μmj−1Qmj(t) of server j, where Qij(t) denotes the number of type i customers in the server j queue. (MinDrift(Q) is a more practical version.)

Our main result (see Theorems 1 and 2) is that, in the OQ system in the heavy traffic asymptotic limit and under the CRP condition, the MinDrift rule (either version), in conjunction with virtually arbitrary work-conserving scheduling disciplines at the servers, minimizes (among all service disciplines) the server workload and the instantaneous and cumulative costs corresponding to the cost rate Σj Cj(Uj(t)). Moreover, in the limit, the (rescaled) unfinished work vector process (U1(t),…,UJ(t)) exhibits SSC such that the vector (C1′(U1(t)),…,CJ′(UJ(t))) is always proportional to α*. (This behavior is analogous to that exhibited by IQ systems in [12,14], but with the server unfinished works replacing “input”-queue lengths and server workload contributions replacing customer workload contributions.) In addition, the minimal server workload process (attained under MinDrift) matches the minimal possible customer workload process in the corresponding IQ system. As a corollary, MinDrift(Q) (asymptotically) minimizes customer workload among all disciplines in either the OQ or IQ system (see Theorem 3). In this sense, the MinDrift rule controls an OQ system as efficiently as the corresponding IQ system (allowing a wider class of disciplines) can possibly be controlled.

Essentially as another corollary of the main results, we obtain a necessary condition (Theorem 4) for any OQ-system service discipline to have an (asymptotic) workload minimization property. Using this condition, we demonstrate (in Section 13) that even some very natural service disciplines in OQ systems, known to guarantee stability of the queues (if such is feasible at all), do not minimize system workload in heavy traffic.

Another contribution of the article is that, in addition to the CRP condition, we in fact identify and characterize a weaker First-Order CRP (FO-CRP) condition. The purpose of doing this is twofold. First, FO-CRP is sufficient to establish convergence properties of fluid sample paths, which arise in the fluid limit asymptotic regime and (in addition to being an important step in proving the main heavy traffic results) are of independent interest. Second, this clarifies the role of the additional assumption, which strengthens FO-CRP to CRP, in proving the heavy traffic results.

Finally, we would like to point out that despite the fact that the technical development in this article is in many ways analogous to that in [12,14], some parts of it are quite different. In particular, the representation of the server workload process in Sections 9 and 10 (roughly, as a sum of the “driving” and “regulation” processes) is substantially different from the representation of customer workload processes in [12,14].

1.4. Outline of This Work

In Section 2, we set basic notations and conventions. The OQ-system model is formally introduced in Section 3. In Section 4, we define the (two versions of) MinDrift routing rule and discuss its basic intuition and some examples. The definition and characterization of FO-CRP and CRP conditions in terms of an IQ system are presented in Section 5. Section 6 gives an equivalent characterization of FO-CRP and CRP conditions in terms of an OQ system and establishes relations between customer and server workload contributions. The heavy traffic asymptotic regime is defined in Section 7, which also contains the definitions of and the relations between customer workload and server workload. Section 8 contains formulations of our main results (Theorems 1–4), described earlier. The analysis of fluid sample paths is the subject of Section 9. Section 10 contains the proof of Theorem 1, regarding the asymptotic optimality of the MinDrift(U) rule. The proof of Theorem 2 (regarding the MinDrift(Q) rule) can essentially be reduced to that of Theorem 1—a detailed outline of this reduction is given in Section 11. Theorem 4, a necessary condition for asymptotic workload minimization, is proved in Section 12. We conclude in Section 13 with a discussion of the relation between stability and heavy traffic optimality properties of service disciplines.

2. BASIC NOTATION AND CONVENTIONS

We use the standard notation R and R+ for the sets of real and nonnegative real numbers, respectively; the not quite standard R++ is used for the set of strictly positive real numbers. The corresponding N-times product spaces are denoted RN, R+N, and R++N. The space RN is viewed as a standard vector space, with elements x ∈ RN being row vectors x = (x1,…,xN). We write simply 0 for the zero vector in RN and 1 ≐ (1,1,…,1) for the vector with all unit coordinates. (The dimensions of the vectors 0 and 1 are either specified explicitly or are clear from the context.)

The scalar product (dot product) of x, y ∈ RN is x·y ≐ Σn xnyn, and the norm of x is ∥x∥ ≐ (x·x)1/2. Vector inequalities are to be understood componentwise. As an example, for γ, x ∈ RN, γ < x means γi < xi, i = 1,…,N. Also,

and if γ ∈ R++N, we slightly abuse notation by writing

We denote the minimum and maximum of two real numbers ξ1 and ξ2 by ξ1 ∧ ξ2 and ξ1 ∨ ξ2, respectively.

Let D([0,∞),R) be the standard Skorohod space of right-continuous left-limit (RCLL) functions, defined on [0,∞) and taking real values. (See, for example, [7] for the definition of this space and its associated topology and σ-algebra.)

The symbol →d denotes convergence in distribution of random processes (or other random elements) (i.e., weak convergence of their distributions). Typically, we consider convergence of processes in D([0,∞),R), or its N-times product space DN([0,∞),R) equipped with the product topology and σ-algebra.

The symbol → (or the abbreviation u.o.c. after a convergence statement) stands for convergence that is uniform on compact sets, for elements of D([0,∞),R) or its N-times product DN([0,∞),R). For functions with a bounded domain A ⊂ R, u.o.c. convergence means uniform convergence.

We reserve the symbol ⇒ for weak convergence of elements in the space D([0,∞),R̄); the latter is the space of RCLL functions taking values in the set R̄ of real numbers extended to include the two “infinite numbers” +∞ and −∞ (with the natural topology on R̄). If h, g ∈ D([0,∞),R̄), then h ⇒ g means h(t) → g(t) for every t > 0 at which g is continuous. (Convergence at t = 0 is not required.) We will not need any characterization of the topology on D([0,∞),R̄) beyond the definition of convergence given earlier.

3. THE MODEL

We consider a queuing system with a finite number I of customer types and a finite number J of flexible servers. For notational convenience, we use the symbol I also for the set of types {1,…,I}. Similarly, J also denotes the set of servers {1,…,J}.

The arrival process for each type iI is a renewal process with the time (from the initial time 0) until the first arrival being ui(0), and the rest of the interarrival times being an independent and identically distributed (i.i.d.) sequence ui(n),n = 1,2,…. Let λi = 1/E [ui(1)] > 0 denote the arrival rate for type i and αi2 = Var[ui(1)]. The service times of type i customers by server jJ form an i.i.d. sequence vij(n),n = 1,2,…. Let μij = 1/E [vij(1)] < ∞ and βi2 = Var[vij(1)]. The convention μij = 0 is used when server j cannot serve type i. All arrival and service processes are assumed mutually independent.

A version of such a flexible (parallel) server model, which received most attention in the previous work (see [3,8,9,12,20] and references therein) is the input-queued model. In the input-queued model, customers of each type i that await service are waiting in a separate “input” queue i of infinite capacity. This, in particular, means that customers do not have to be assigned to the servers while waiting in the input queue; such server assignment is (irreversibly) done when the customer is “pulled” for service by one of the servers.

In this article, we concentrate on a different—output-queued (OQ)—model, satisfying the following (additional) immediate routing (IR) condition:

Each new customer arriving in the system must be assigned to one of the servers j immediately upon arrival, and after that, the customer can only be served by the server to which it is assigned.

A natural way to interpret the IR condition (and this interpretation is in fact the main motivation for the OQ model) is that, upon arrival, each new customer must be routed to one of the servers or, more precisely, into the “output” queue associated with (or “located at”) that server.

Remark 1: It should be clear that the IR condition defines the only difference between an OQ system and the corresponding IQ system. Therefore, in general, the class of controls (or service disciplines) for an OQ system is a strict subset of that for the corresponding IQ system. For example, the Gcμ discipline for IQ systems, studied in [12], does not satisfy the IR condition and, consequently, is not a valid discipline for OQ systems.

A service discipline in an output-queued system consists of two parts: a routing (server assignment) algorithm and a scheduling rule employed by each server (and, generally speaking, depending on the server) to determine which customer to serve from its queue (i.e., among the customers assigned to it).

We will consider the class of service disciplines satisfying (in addition to IR) the following condition on the routing algorithm:

(d0) The realizations of a customer's service requirements are not known at the time when routing decision (server assignment) for this customer is made. (The distributions of the service requirements at different servers are known.)

Sometimes, but not always, we will further restrict the class of service disciplines by imposing the following conditions on the server scheduling rules:

(d1) The scheduling rule of each server is nonpreemptive within each customer type; namely, a server cannot take for service a new customer of type i if it already has another type i customer “in service” (with both elapsed and residual service times being nonzero). Consequently, at any given time, a server cannot have in service more than one customer of any given type.

(d2) A server does not “know” the realization of a customer service time before the customer service starts.

Note that conditions (d1) and (d2) do allow a server idling (even if it has customers in service) or preemption of service of one customer by another customer but of a different type. They also allow server-sharing by several customers but, again, each of a different type.

Remark 2: Note that the class of IQ-system service disciplines, satisfying conditions (d0)–(d2) (but not IR), is well defined, and it, first, belongs to the class of disciplines studied in [12] and, second, contains the Gcμ discipline (see [12]). Moreover, this class is obviously a superset of the above-defined class of OQ-system disciplines satisfying (d0)–(d2) and IR.

4. THE MinDrift ROUTING RULE

4.1. Notation

Let Uj(t) denote the (unfinished) work of server j at time t, namely the total amount of unfinished processing time of all customers of all types present in the server j queue at time t. We denote by Qij(t) the number of type i customers in queue j at time t, including those customers whose service has already started but is not yet completed. The quantity qUj(t) ≐ Σm μmj−1Qmj(t) we will call the Q-estimated (unfinished) work of server j. Finally, by Qi(t) we will denote the total number of type i customers in the system at time t. In the OQ system, we always have Qi(t) = Σj Qij(t), but we note that Qi(t) is well defined for both the OQ and IQ systems.

4.2. MinDrift Rule Definition

Suppose that for each server j, a cost function Cj(ζ), ζ ≥ 0, is given. Assume that the cost functions have the following properties:

Cj(·) is continuous strictly increasing convex, with Cj(0) = 0.

Moreover, the first derivative Cj′(·) is continuous strictly increasing, with Cj′(0) = 0.

Finally, the second derivative Cj′′(·) is strictly positive continuous in the open interval (0,∞), with Cj′′(0) = limζ↓0 Cj′′(ζ) ≥ 0, where Cj′′(0) is either finite or is +∞.

The MinDrift rule routes (assigns) customers to the servers as follows. When a new customer of type i arrives in the system, it is routed to a server j such that

j ∈ arg mins∈J Cs′(Us(t))/μis.     (2)

Ties are broken arbitrarily; for example, in favor of the smallest index j. Also, by convention, a type i customer can never be routed to a server j if μij = 0. (Throughout this article, we also adopt a related convention that any expression involving division by μij holds under the additional assumption that μij > 0, even if we do not state this explicitly.)

The rule defined by (2) is the basic version of the MinDrift rule; we will refer to it as the MinDrift(U) rule.

A version of the MinDrift rule, with Uj in (2) replaced by qUj, will be called the MinDrift(Q) rule; namely, the MinDrift(Q) rule routes an arriving type i customer to a server j such that

j ∈ arg mins∈J Cs′(qUs(t))/μis.     (3)
4.3. Nature of the MinDrift Rule

The nature of the MinDrift rule is simple—it “myopically” (or “greedily”) tries to minimize the average drift of the aggregate cost function Σj Cj(Uj(t)). Indeed, Cj′(Uj(t))/μij (see (2)) approximates the expected increment of the aggregate cost function caused by routing one type i customer (arriving at time t) to server j; therefore, by (2), MinDrift(U) routes new arrivals so that the (approximate) expected increment of Σj Cj(Uj(t)) is minimized. In other words, MinDrift(U) routing tries to minimize the average rate of increase of Σj Cj(Uj(t)) due to the placement of new work (or load) on the servers. Note that in the OQ system, the “best” a service discipline can do to maximize the rate at which Σj Cj(Uj(t)) is decremented due to processing of the unfinished work is to never idle servers when they have work to do. Thus, the MinDrift(U) routing rule (in conjunction with arbitrary work-conserving scheduling rules at the servers) strives to minimize the average drift of the aggregate cost.
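To spell out the approximation behind this drift heuristic: routing the arriving type i customer to server j increases Uj(t) by the (random) service time vij, so a first-order expansion gives

E[Cj(Uj(t) + vij) − Cj(Uj(t))] ≈ Cj′(Uj(t)) E[vij] = Cj′(Uj(t))/μij,

which is exactly the quantity minimized in (2); the approximation is accurate when individual service times are small relative to the unfinished work Uj(t), as is the case in the heavy traffic regime studied here.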

The MinDrift(Q) rule is based on the same principle as MinDrift(U), except that instead of using the exact values Uj of unfinished work (which may not be available in many applications), it uses their (estimated) average values qUj (which may be more readily available). As we will demonstrate, in the heavy traffic asymptotic regime we consider, the two versions MinDrift(U) and MinDrift(Q) of the rule are in a certain sense “indistinguishable,” under nonrestrictive additional conditions.
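As an illustration, here is a minimal sketch (in Python, with purely illustrative names and interface) of how the routing decision (2)/(3) could be computed; passing the exact unfinished work values gives MinDrift(U), and passing the Q-estimated values gives MinDrift(Q):

def mindrift_route(i, mu, cost_derivs, workload):
    """Pick a server for an arriving type-i customer per the MinDrift rule.

    i           -- customer type index
    mu          -- mu[i][j]: mean service rate of type i at server j (0 if infeasible)
    cost_derivs -- cost_derivs[j](u): derivative C_j'(u) of server j's cost function
    workload    -- workload[j]: U_j(t) for MinDrift(U), or qU_j(t) for MinDrift(Q)
    """
    best_j, best_val = None, float("inf")
    for j, rate in enumerate(mu[i]):
        if rate == 0:                                # server j cannot serve type i
            continue
        val = cost_derivs[j](workload[j]) / rate     # approximate expected cost increment
        if val < best_val:                           # ties broken in favor of the smallest index j
            best_j, best_val = j, val
    return best_j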

The Gcμ scheduling rule for IQ systems, studied in [12], is the rule that myopically tries to minimize the drift of the aggregate cost function Σi Ci(Qi(t)) of the queue lengths Qi, with Ci(·) being cost functions having the same properties as the functions Cj(·) defined earlier. Consequently, Gcμ is such that a server j always tries to serve a queue

i ∈ arg maxm∈I Cm′(Qm(t))μmj,

thus maximizing the average rate at which Σi Ci(Qi(t)) is decreased due to departures of served customers. The Gcμ rule does not—and cannot—exercise any control over the rate of increase of Σi Ci(Qi(t)) due to new arrivals. Therefore, although both Gcμ (in an IQ system) and MinDrift (in an OQ system) strive to minimize drifts of certain cost functions, they differ in that they control different system state variables: Gcμ controls the rates at which the queue lengths Qi are depleted due to service, and MinDrift controls the rates at which the unfinished works Uj are increased due to new arrivals.

4.4. Examples

Consider a special case when the cost functions have the form Cj(ζ) = γjζη+1, with some fixed η > 0 and γj > 0. Then MinDrift(U) becomes the rule routing an arriving type i customer to a server

j ∈ arg mins∈J γs(Us(t))η/μis,     (4)

and similarly for MinDrift(Q) (with Us replaced by qUs).

Consider a special case of the system such that, for any given pair (i, j), we have either μij = 0 or μij = μj > 0. In other words, the system is such that each server j has a (j-dependent) subset of types i that it can serve, but the average service rates of all types within this subset are the same and equal to μj (and the server cannot serve at all any types i outside the subset). For this system, the MinDrift(Q) version of (4), with η = 1, becomes

j ∈ arg mins γs(Σm Qms(t))/μs2,     (5)

where the minimum is over servers s that can serve type i.
Since the parameters γj > 0 can be set arbitrarily, we see from (5) that, for this special system, such “popular” routing rules as “Join a server j with the shortest queue,”

j ∈ arg mins Σm Qms(t),     (6)

and “Join a server j with the smallest expected unfinished work,”

j ∈ arg mins (Σm Qms(t))/μs,     (7)

are special cases of MinDrift(Q); (6) is obtained by setting γs = μs2 and (7) by setting γs = μs. (We remind the reader that, in both cases, routing customers to servers where they cannot be served at all is prohibited. Also, note that if we further assume that each server employs first-in-first-out (FIFO) scheduling, then rule (7) is equivalent to the “Join the shortest expected delay” routing rule.)

It should be clear that in a general system, where service rates μij depend on both i and j more generally than in the special system described earlier, the Join-shortest-queue and Join-smallest-expected-unfinished-work routing rules are not special cases of MinDrift. Consequently, the heavy traffic optimality properties (which we prove in this article for MinDrift) may not (and typically do not) hold for these rules.
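For concreteness, a small numerical illustration of this point (hypothetical numbers, reusing the mindrift_route sketch above): with heterogeneous rates, MinDrift may route to the server with the larger backlog because that server serves the type much faster.

# Two servers; type 0 is served at rate 1 by server 0 and at rate 5 by server 1.
mu = [[1.0, 5.0]]
# Quadratic costs C_j(u) = u^2, so C_j'(u) = 2u.
cost_derivs = [lambda u: 2 * u, lambda u: 2 * u]
# Server 1 currently holds more unfinished work than server 0.
workload = [2.0, 4.0]

# Joining the server with the smallest unfinished work would pick server 0;
# MinDrift compares 2*2/1 = 4.0 against 2*4/5 = 1.6 and therefore picks server 1.
print(mindrift_route(0, mu, cost_derivs, workload))   # -> 1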

5. COMPLETE RESOURCE POOLING CONDITION

In this section, we present the definition of the complete resource pooling (CRP) condition and related notions and results, which are “oriented toward” the analysis of the IQ model and basically follow those in [12]. However, the development in this section is more general than that in Section 5 of [12]. In particular, we consider the notion of First-Order-CRP (FO-CRP) (which is a weaker form of CRP) and prove some additional properties related to this notion. (The results of this section provide a “reference point” for the next section, which gives an equivalent characterization of FO-CRP and CRP conditions “in terms of” the OQ model.)

Consider an I × J matrix φ = {φij, i ∈ I, j ∈ J}, with all φij ≥ 0. Each element φij can be interpreted as the average rate at which server j's time is allocated to the service of type i customers, in the long run. We do not call the elements φij “fractions” of server j's time because, for reasons that will become clear in Section 6, it will be convenient for us not to assume a priori that Σi φij ≤ 1, or even that φij ≤ 1.

With a given φ we associate the vector μ(φ) = (μ1(φ),…,μI(φ)), whose coordinates are

μi(φ) ≐ Σj∈J μijφij,  i ∈ I;

this is the vector of mean long-run service rates provided to the types i ∈ I if each server j allocates its time to serving type i at the average rate φij.

Consider also a different vector-function of a matrix φ; namely, let the vector ρ(φ) = (ρ1(φ),…,ρJ(φ)) ∈ RJ be defined as

ρj(φ) ≐ Σi∈I φij,  j ∈ J.

Each component ρj(φ) is naturally interpreted as the total “utilization” of server j, given that the average rates at which its time is allocated to the service of the different types i are given by φij. (Again, we do not assume a priori that ρj(φ) ≤ 1.)

Definition 1: We define the system service rate region to be the set of all vectors μ(φ) corresponding to matrices φ ≥ 0 satisfying the condition

ρ(φ) ≤ 1.     (10)

Further, consider the set of all maximal elements μ of the service rate region such that μ ∈ R++I. (An element μ of the rate region is maximal if μ ≤ ζ, with ζ in the rate region, implies ζ = μ.)

Note that the service rate region is a bounded convex polyhedron in R+I. We assume that it is nondegenerate (i.e., has dimension I), which is equivalent to assuming that each customer type i can be served at a nonzero rate μij by at least one server j. The service rate region is in fact the closure of our system's stability region, which is the set of arrival rate vectors λ = (λ1,…,λI) such that λ < μ(φ) for some φ satisfying (10) (cf. [1,2,6,13,14,15]).

Definition 2: We say that the condition of FO-CRP holds for a vector λ if λ lies within the interior of one of the ((I − 1)-dimensional) outer faces of the service rate region and λ belongs to the set of maximal elements defined above. If, in addition, the matrix φ such that λ = μ(φ) and (10) hold is unique, then we say that the CRP condition holds.

Remark: The CRP condition given above is the same as in [12], and it is equivalent to that introduced earlier in [9,20] (see Assumption 3.4, Thm. 5.3 and Cor. 5.4 in [20] for a summary). The fact of equivalence will, in particular, follow from the results of this section.

When the FO-CRP condition holds, let us denote by ν = (ν1,…,νI) the (unique up to a scaling) “outer” normal vector to the service rate region at the point λ. Note that ν ∈ R++I. (Otherwise, if some νi ≤ 0, a small increase of the component λi would produce a vector λ′ ≥ λ, λ′ ≠ λ, still lying in the service rate region—a contradiction to the maximality of λ.) For concreteness, we use the normal vector ν*, which is the vector defined uniquely by the additional requirement that ∥ν*∥ = 1. The components νi* of the vector ν* are sometimes called the workload contributions of customers of the different types i (see [9,12,20]); in this article, we will call them customer workload contributions, to make a distinction from the server workload contributions introduced in Section 6.

The FO-CRP condition for λ implies, in particular, that

ν*·λ = max {ν*·μ(φ) : φ ≥ 0, ρ(φ) ≤ 1};     (11)

this in turn implies that any matrix φ such that (10) holds and λ = μ(φ) must in fact satisfy (10) with equality; namely, we have

μ(φ) = λ,  ρ(φ) = 1.     (12)

Given λ satisfying the FO-CRP condition, for each j ∈ J let us denote

Ij ≐ {i ∈ I : μijνi* = maxm∈I μmjνm*}.

Any pair (i, j) such that i ∈ Ij is called a basic activity; therefore, Ij indicates the set of basic activities for server j. It is easy to see from (11) that, for any φ satisfying (12), φij > 0 implies i ∈ Ij.

Lemma 1: If λ satisfies the FO-CRP condition, then the corresponding graph with nodes being the types i and the servers j and with arcs being the basic activities (i, j) is connected.

Proof: Suppose not. Consider any breakdown of the graph into two components disconnected from each other. For m = 1, 2, denote by I(m) and J(m) the sets of types and servers, respectively, within the m-th component. By our construction, for any m = 1, 2, j ∈ J(m) implies Ij ⊆ I(m).

Consider any φ satisfying (12). Recall that φij > 0 implies i ∈ Ij. Let us fix a small δ > 0 and consider the vector ν** obtained from ν* by the following modification: νi** = νi*(1 + δ) if i ∈ I(1), and νi** = νi* if i ∈ I(2). Since there is no arc connecting the two components, if δ is small enough, then φ solves the problem in the right-hand side (RHS) of (11) with ν* replaced by ν**, as well as the problem in the RHS of (11) itself. In other words, λ = μ(φ) maximizes both ν**·μ and ν*·μ over the service rate region. This means that ν** is a normal (different from ν*) to the boundary of the service rate region at point λ—a contradiction to the FO-CRP condition. █

Now, with any matrix φ let us associate the graph with nodes being the types i and the servers j, and with arcs corresponding to the pairs (i, j) with φij > 0; we will refer to it as the graph of φ.

Lemma 2: Suppose λ ∈ R++I is a maximal element of the service rate region. Then:

(i) If λ satisfies the FO-CRP condition, then there exists a matrix φ satisfying (12) and such that φij > 0 if and only if i ∈ Ij.

(ii) λ satisfies the FO-CRP condition if and only if there exists a matrix φ satisfying (12) whose graph is connected.

(iii) If λ satisfies the CRP condition, then the unique matrix φ satisfying (12) is such that its graph coincides with the graph of basic activities, is connected, and, moreover, is a tree.
Proof:

(i) Consider an arbitrary φ′ satisfying ρ(φ′) = 1 and such that φij′ > 0 if and only if i ∈ Ij. Note that ν*·μ(φ′) = ν*·λ, because the condition on φ′ guarantees that φ′ solves the problem in the RHS of (11). Let

λ′′ ≐ (λ − δμ(φ′))/(1 − δ),

where 0 < δ < 1 is fixed. We have ν*·λ′′ = ν*·λ, and we can always choose δ to be small enough so that λ′′ lies in the interior of the same face of the service rate region as λ does. Then there exists φ′′ such that λ′′ = μ(φ′′), ρ(φ′′) = 1, and φij′′ > 0 implies i ∈ Ij. It is easy to verify directly that φ = (1 − δ)φ′′ + δφ′ is a matrix with the properties we seek.

(ii) Necessity follows from (i). Let us prove sufficiency. Let φ be a matrix such that (12) holds and the graph of φ is connected. Since λ is a maximal element of the service rate region, there exists an outer normal vector ν* to the boundary of the service rate region at point λ, and it is such that ν* ≠ 0, ν* ∈ R+I. Consequently, φ solves the problem (in the RHS of) (11). From this, we conclude that ν* ∈ R++I; otherwise (if νi* = 0 for some i), φ could not be a solution to (11), since the graph of φ is connected. Finally, the normal ν* must be unique (up to scaling). Indeed, consider any other vector ν** ∈ R++I that is not a scaled version of ν*; that is, maxi νi**/νi* > mini νi**/νi*. Then the connectedness of the graph of φ easily implies that since φ solves (11), it cannot possibly solve (11) with ν* replaced by ν**. Therefore, ν** cannot be a normal to the service rate region at point λ.

(iii) The definition of CRP and statements (i) and (ii) of the lemma immediately imply the uniqueness of φ satisfying (12), the fact that the graph of φ coincides with the graph of basic activities, and the fact that this graph is connected. It remains to show that this graph must be a tree. Suppose not. Let us pick any cycle in this graph. It is easy to see that we can “perturb” the (strictly positive) elements φij along the arcs of the cycle so as to produce a matrix φ′ ≠ φ such that μ(φ′) = μ(φ) = λ and ρ(φ′) = 1, a contradiction to the uniqueness of φ. █

Remark: Just as in the case of the Gcμ scheduling rule, studied for an IQ model in [12], we emphasize here that the notion of a basic activity is not utilized in any way (neither explicit nor implicit) by the MinDrift routing algorithm. (The algorithm need not know which activities are basic.) It is only used as a tool for the analysis of the algorithm. Similarly, the algorithm need not know the values of workload contributions.

6. EQUIVALENT CHARACTERIZATION OF THE CRP CONDITION IN TERMS OF THE OQ MODEL

In this section, we give an equivalent (“dual”) characterization of the CRP condition and introduce notions and results that will be used in the analysis of our OQ-model.

First, let us give a somewhat different (although closely related) interpretation of the matrix φ and the functions μ(φ) and ρ(φ) defined in Section 5. Suppose a matrix φ is given, and assume that customers of type i arrive (are routed) to server j at the average rate μijφij. Then μi(φ), i ∈ I, is the total average rate at which type i customers arrive in the system, and ρj(φ) = Σi (μijφij)μij−1 = Σi φij is the average rate at which work (i.e., the required amount of processing time) arrives (is routed) to server j. (We use the convention that (μijφij)μij−1 = 0 if μij = 0.)

Definition 3: We define the server utilization region to be the set of all possible values of ρ(φ) with φ ≥ 0 satisfying the condition

μ(φ) ≥ λ.     (13)

Further, consider the set of all minimal elements ρ of the server utilization region such that ρ ∈ R++J. (An element ρ of a set K is minimal if ρ ≥ ζ ∈ K implies ζ = ρ.)

The server utilization region is a convex polyhedron in R+J, and it is nondegenerate (i.e., has dimension J) as long as the service rate region is nondegenerate. Note that the server utilization region is unbounded, but it is, of course, “bounded below,” say by 0, since it lies in the positive orthant.

Lemma 3:

(i) The FO-CRP condition for a fixed vector λ holds if and only if the following is true:

(a) Vector 1 ∈ RJ lies within the interior of one of the ((J − 1)-dimensional) faces of the server utilization region.

(b) Vector 1 belongs to the set of minimal elements of the server utilization region.

(ii) When the FO-CRP condition for λ (or, equivalently, (i)(a) and (i)(b)) holds, then the unique (up to a scaling by a positive constant) outer normal vector −α* to the server utilization region at the point 1 is such that α* = (α1*,…,αJ*) ∈ R++J, and α* is related to the vector ν* as follows:

αj* = maxi∈I μijνi*,  j ∈ J,     (14)

νi* = minj∈J αj*/μij,  i ∈ I.     (15)

In addition:

(c) i ∈ Ij (i.e., activity (i, j) is basic) if and only if j ∈ Ji, where

Ji ≐ {j ∈ J : αj* = μijνi*}.     (16)

(d) Any matrix φ satisfying ρ(φ) = 1 and (13), in fact, satisfies (12).

(e) We have

ν*·λ = α*·1 = Σj∈J αj*.     (17)

(iii) The CRP condition for a fixed vector λ holds if and only if (i)(a), (i)(b), and the following property hold:

(f) the matrix φ satisfying ρ(φ) = 1 and (13) is unique.

When CRP does hold, this matrix φ is in fact the unique solution of (12).

Proof:

(i) Let us prove the necessity of (a) and (b). Consider the vector α* defined by (14). (Note that α* ∈ R++J.) Then (15) holds. Indeed, for a fixed i, we have αj* = μijνi* if (i, j) is basic, and αj* > μijνi* otherwise. (Incidentally, this means that i ∈ Ij is equivalent to j ∈ Ji.)

Let us choose any matrix φ that solves (12) and such that φij > 0 if and only if i ∈ Ij. (Recall that the graph of basic activities is connected.) Then, since this φ is such that φij > 0 implies j ∈ Ji, it is easy to observe that φ solves the problem

min {α*·ρ(φ′) : φ′ ≥ 0, μ(φ′) ≥ λ}     (18)

or, equivalently, 1 solves the problem

min {α*·ρ : ρ in the server utilization region}.     (19)

(Compare problems (18) and (19) to problem (11).) This means, in particular, that 1 is a minimal element of the server utilization region, and α* is a normal to the region at point 1. Since the graph of basic activities is connected, it is easy to see from (18) that ρ = 1 could not possibly solve problem (19) with α* replaced by any other nonzero vector α** ∈ R+J, unless α** = cα* for some c > 0. This completes the proof of the necessity of (a) and (b).

The sufficiency of (a) and (b) follows simply from the symmetry between the definition of the FO-CRP condition (for λ) and conditions (a) and (b) (expressed in terms of the vector 1); namely, if for the vector −1 and the region obtained by negating the server utilization region we define a condition analogous to FO-CRP for λ and the service rate region, this will be exactly conditions (a) and (b). Thus, starting from conditions (a) and (b), we can obtain a condition analogous to (a) and (b), but with the vector 1 and the server utilization region replaced by λ and the service rate region, respectively; this latter condition is exactly our original FO-CRP.

(ii) As part of the proof of (i), we have already proved all of the properties stated in (ii), except (d) and (e). If (d) did not hold, then λ could not be a maximal element of the service rate region. Property (e) follows from the fact that any matrix φ, chosen as in the proof of (i), satisfies (12) and solves both problems (11) and (18).

(iii) This follows from (i), (ii), and the definition of CRP. █

When the FO-CRP condition holds, we will call the components αj* of the vector α* the server workload contributions of the different servers j.

7. HEAVY TRAFFIC REGIME

In this section, we introduce the notion of a sequence of queuing systems in heavy traffic. Suppose a vector λ satisfying the CRP condition is fixed. Associated with this λ are the unique matrix φ satisfying (12) and the corresponding normal vectors ν* ∈ RI and α* ∈ RJ. (We remark that all of the definitions and facts in this section are valid when the weaker FO-CRP condition holds, with arbitrary fixed φ satisfying (12). However, for our main results in Section 8, the stronger CRP condition is essential.)

The quantity X(t) ≐ Σi∈I νi*Qi(t) we will call the customer workload of the system. The customer workload process X(·) is of primary interest in the analysis of the IQ model [3,8,9,12,20].

For the OQ model, we define a different (although closely related, in the sense specified later) notion of server workload:

uX(t) ≐ Σj∈J αj*Uj(t).

In addition, we define the Q-estimated server workload as

qX(t) ≐ Σj∈J αj* qUj(t).

Informally speaking, for a service discipline satisfying constraints (d0)–(d2), qX(t) is a “good” (asymptotically exact) estimate of the server workload uX(t).

Since for any pair of i ∈ I and j ∈ J the inequality αj*/μij ≥ νi* holds, we observe that the Q-estimated server workload cannot be less than the customer workload:

X(t) ≤ qX(t).     (20)
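For the record, the chain behind (20), using only the definitions of X and qX and the inequality αj*/μij ≥ νi*, is

X(t) = Σi νi*Qi(t) = Σi νi* Σj Qij(t) ≤ Σj Σi (αj*/μij)Qij(t) = Σj αj* qUj(t) = qX(t).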

We also have the following inequality, which we record for future reference:

with

We now consider a sequence of queuing systems, indexed by the values r = rn, n = 1, 2,…, where rn > 0 for all n and rn ↑ ∞ as n → ∞. (Hereafter in this article, r → ∞ means that r goes to infinity along this sequence or some subsequence of it; the choice of the subsequence will be either explicit or clear from the context.) Each system with index r has, as earlier, I customer types and J servers. The primitives and the processes corresponding to the system with index r will be appended with a superscript r.

Assume that for each type i, the mean arrival rate λir = 1/E [uir(1)] is such that

r(λir − λi) → bi  as r → ∞,     (23)

where bi ∈ R is a fixed constant. Assume also convergence of the variances; that is,

Var[uir(1)] → αi2  as r → ∞.     (24)

In addition, we make the following technical assumption, needed, in particular, to apply Bramson's weak law estimates [4] (and establish (75) later): Uniformly over i and r,

E[(uir(1))2 I{uir(1) > x}] ≤ η(x),  x ≥ 0,     (25)

where η(·) is a fixed function and η(x) → 0 as x → ∞.

For the initial interarrival times, we assume that for each i,

uir(0)/r → 0  as r → ∞.

Let Fir(t), t ≥ 0, denote the number of type i customers that arrived in the system by time t, excluding “initial” customers present at time 0. Assumptions (23), (24), and (25) imply a functional central limit theorem (FCLT) for these arrival processes:

(1/r)(Fir(r2t) − λirr2t, t ≥ 0) →d σiB(·),     (26)

where σi2 = λi3αi2, B(·) is a standard (zero drift, unit variance) Brownian motion, and →d denotes convergence in distribution (for processes in the standard Skorohod space of RCLL functions).

The service time distributions do not change with the parameter r. (This, in particular, means that the condition analogous to (25) trivially holds for the service times vijr(1), uniformly in (i, j) and r.) Let us denote by

ΣVijr(l) ≐ Σn=1,…,l vijr(n),  l = 0, 1, 2,…,

the total amount of work (i.e., total service time) brought to server j by the first l (newly arriving) type i customers routed to it. We extend the domain of ΣVijr(·) to all real nonnegative t ≥ 0 by adopting the convention that ΣVijr(t) = ΣVijr(⌊t⌋). (We will use the same domain-extension convention throughout the article for other functions that are originally defined on the integers as well.) For ΣVijr, we have the following FCLT:

For each i ∈ I, let us fix a set of integer-valued, nondecreasing, nonnegative functions (sNij(l), l = 0, 1, 2,…), j ∈ J, satisfying the following conditions:

The value of sNij(l) is interpreted as the number of type i customers routed to server j, out of the first l type i customers arriving in the system. Then it is clear that for each flow i, the functions sNij(·) define a fixed (“static”) pattern of routing customers to the servers in Ji, such that, for any l, the fractions of customers routed to the different servers j ∈ Ji closely track the values μijφij/λi; recall that Σj μijφij/λi = 1. (The MinDrift rule does not require any knowledge of this static routing pattern; it is only used as a tool for the analysis!) Such functions can, for example, be constructed recursively as follows. We set sNij(0) = 0 for all j ∈ Ji. For each l = 1, 2,…, we set sNij(l) = sNij(l − 1) + 1 for one of the j ∈ Ji with the smallest value of sNij(l − 1) − (μijφij/λi)l, and sNij(l) = sNij(l − 1) for all other j. (The “ties” between the j's are broken arbitrarily, for example, in favor of the smallest one.)
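One way to write this recursive construction as a short routine (illustrative only; the function name and interface, a dictionary of target fractions μijφij/λi over the servers in Ji and a horizon L, are hypothetical):

def static_routing_pattern(fractions, L):
    """Build sN[j][l] for l = 0..L, where fractions[j] = mu_ij * phi_ij / lambda_i
    are the target routing fractions over the servers j in J_i (they sum to 1)."""
    servers = list(fractions.keys())
    sN = {j: [0] * (L + 1) for j in servers}
    for l in range(1, L + 1):
        # Route customer l to the server currently lagging its target share the most;
        # ties are broken in favor of the smallest server index.
        j_star = min(servers, key=lambda j: (sN[j][l - 1] - fractions[j] * l, j))
        for j in servers:
            sN[j][l] = sN[j][l - 1] + (1 if j == j_star else 0)
    return sN

# Example: two servers with target fractions 2/3 and 1/3.
pattern = static_routing_pattern({0: 2/3, 1: 1/3}, L=6)
# pattern[0] -> [0, 1, 1, 2, 3, 3, 4], pattern[1] -> [0, 0, 1, 1, 1, 2, 2]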

For i ∈ I, let us denote by

Air(t) ≐ Σj∈J αj* ΣVijr(sNij(Fir(t)))

the total amount of server workload brought to the system by the new arrivals of flow i by time t ≥ 0, assuming the arrivals were routed according to the (fixed) functions sNij(·). From (26)–(30) we obtain the following FCLT for the sequence of processes Air(·):

where

From (31) and (17) we obtain the following FCLT for the sequence of processes Σi Air(·):

where

8. MAIN RESULTS

For each value of the (scaling) parameter r, let us consider the following processes. Let Ur(·) and qUr(·) be the (vector) processes representing the (unfinished) work and the Q-estimated (unfinished) work, respectively, at the different servers; let uXr(·), qXr(·), and Xr(·) denote the scalar processes representing the server workload, the Q-estimated server workload, and the customer workload, respectively.

Assume that in a system with index r, each server j, at any time t, incurs a holding cost at the (instantaneous) rate of

Cj(Ujr(t)/r),

where the Cj(·) are fixed convex increasing functions, with the additional properties described in Section 4.

Note that in our asymptotic regime the cost function is “rescaled” as the parameter r changes. (In other words, in a system with index r, the holding cost rate corresponding to the unfinished work Ujr(t) is Cj(Ujr(t)/r) instead of Cj(Ujr(t)).) Notice, however, that in the special case (described in Section 4.4) when Cj(ζ) = γjζη+1, with some fixed η > 0 and γj > 0, the form of the corresponding MinDrift rule does not change with r. Indeed, in this case, replacing Cj′(Ujr(t)) in (2) with Cj′(Ujr(t)/r)/r simply does not change the routing rule.
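For the power costs, this invariance is a one-line check: with Cj(ζ) = γjζη+1,

Cj′(Ujr(t)/r)/r = (η + 1)γj(Ujr(t))η/rη+1,

and the factor r−(η+1) is common to all servers j, so the arg min in (2) is unchanged.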

For our main results, we need the notion of a fixed point. A vector u ∈ R+J will be called a fixed point if

Cj′(uj) = cαj*,  j ∈ J,     (35)

for some constant c ≥ 0. If we recall that each derivative Cj′(·) is continuous and strictly increasing with Cj′(0) = 0, one deduces the following:

A fixed point u corresponding to each c ≥ 0 exists and is unique. Moreover, u = 0 for c = 0, and u ∈ R++J (i.e., u has all components strictly positive) for any c > 0.

Thus, the set of fixed points forms a one-dimensional manifold, which can be parameterized, for example, by the corresponding server workload values α*·u. In addition, it is easy to verify the following property:

A fixed point u is the unique vector that minimizes Σj Cj(u′j) among all vectors u′ ∈ R+J with the same server workload (i.e., satisfying the condition α*·u′ = α*·u).

Indeed, if u = 0, the property is trivial. If u ∈ R++J, condition (35) implies that the (Lagrangian) function

L(u′) ≐ Σj Cj(u′j) − cα*·u′

has zero gradient (with respect to u′) at the point u′ = u. Since this Lagrangian is strictly convex on R+J, it is minimized by u. Then the desired property follows from the Kuhn–Tucker theorem.
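For example, for the power costs Cj(ζ) = γjζη+1 of Section 4.4, condition (35) reads (η + 1)γjujη = cαj*, so the fixed point corresponding to c ≥ 0 is given explicitly by

uj = (cαj*/((η + 1)γj))1/η,  j ∈ J,

and the manifold of fixed points is the curve through 0 traced out as c varies (equivalently, as the server workload α*·u varies).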

Let us define the diffusion scaling operator, which acts on a scalar function Ξ = (Ξ(t), t ≥ 0) as

(Ξ(r2t)/r, t ≥ 0)

and is applied to vector-functions componentwise.

Let us consider the diffusion-scaled versions of the processes Ur(·), qUr(·), uXr(·), qXr(·), and Xr(·), obtained by applying this scaling operator to each of them.

8.1. Optimality of the MinDrift(U) Rule

Assume that the initial (scaled) amounts of unfinished work are deterministic and converging:

Ur(0)/r → u(0)  as r → ∞,     (37)

where u(0) is a fixed point, as defined earlier. Consequently, the initial (scaled) server workloads converge as well: uXr(0)/r → α*·u(0).

Let us define the following one-dimensional reflected Brownian motion (RBM): the RBM on [0,∞), reflected at zero, with initial state α*·u(0), drift a, and diffusion coefficient σ, driven by a standard Brownian motion B(·); the drift a and diffusion coefficient σ are given in (34) and (32), respectively.

Theorem 1: Consider the sequence of queueing systems in heavy traffic, as introduced in Section 7. Assume initial condition (37). Let the reflected Brownian motion defined by (38) and (39) be fixed.

(i) Suppose that the service discipline is such that the routing rule is MinDrift(U) with cost functions Cjr(·) for each value of the parameter r, and each server employs an arbitrary work-conserving scheduling rule (namely, the server is not allowed to idle when there is unfinished work in its queue). Then, as r → ∞, the diffusion-scaled server workload process converges in distribution to this RBM, and the diffusion-scaled unfinished work vector process converges in distribution to the process whose value at each time t ≥ 0 is the fixed point that is (uniquely) determined by the requirement that its server workload equal the value of the RBM at time t.

(ii) The service discipline defined in (i) is asymptotically optimal within the class of service disciplines satisfying condition (d0) in that it minimizes the server workload and the unfinished work cost rate at all times. More precisely, let the scaled unfinished work and server workload processes corresponding to an arbitrary service discipline G be given (appropriately constructed on a common probability space with the sequence of processes defined in (i)). Then, with probability 1, for any time t ≥ 0, the lim inf (in r) of the scaled server workload under G is no smaller than the limiting server workload attained in (i), and the lim inf of the instantaneous cost rate Σj Cj(·) under G is no smaller than that attained in (i). As a corollary, with probability 1, for any T > 0, the lim inf of the cumulative cost over [0, T] under G is no smaller than the limiting cumulative cost attained in (i).

The proof of Theorem 1 is the subject of Sections 9 and 10.

8.2. Optimality of the MinDrift(Q) Rule

Assume that, for each r, the initial state of the system at time 0 is deterministic and conforms to conditions (d1) and (d2) on a service discipline (which are assumed in Theorem 2 below). In particular, for each pair of i and j, server j has in its queue at most one customer of type i whose service has already started, and the realizations of the service times of the customers whose service has not yet started are not known to the server. For the initial residual service times vijr(0) (if any) of the customers whose service has already started, we assume

vijr(0)/r → 0  as r → ∞.

Finally, assume that the initial (scaled) amounts of Q-estimated unfinished work are converging:

qUr(0)/r → u(0)  as r → ∞,     (44)

where u(0) is a fixed point, as defined earlier.

It follows from the above initial conditions that the initial (scaled) Q-estimated server workloads converge to α*·u(0) and, in addition, with probability 1, the initial (scaled) amounts of unfinished work converge to the same limits as their Q-estimated counterparts.

For this fixed u(0), we consider the one-dimensional reflected Brownian motion with initial state α*·u(0), defined in (38).

Theorem 2: Consider the sequence of queuing systems in heavy traffic, as introduced in Section 7, and with the initial conditions described in Section 8.2.

(i) Suppose that the service discipline is such that the routing rule is MinDrift(Q) with cost functions Cjr(·) for each value of the parameter r, and each server employs an arbitrary work-conserving scheduling rule satisfying conditions (d1) and (d2). Then, as r → ∞, the diffusion-scaled server workload and Q-estimated server workload processes converge in distribution to the RBM defined above, and the diffusion-scaled unfinished work and Q-estimated unfinished work vector processes converge in distribution to the process whose value, for each t ≥ 0, is the fixed point that is (uniquely) determined by the requirement that its server workload equal the value of the RBM at time t.

(ii) The service discipline defined in (i) is asymptotically optimal within the class of service disciplines satisfying conditions (d0)–(d2) in that it minimizes the server workload, the Q-estimated server workload, and the unfinished work cost rate at all times. More precisely, let the scaled unfinished work, Q-estimated unfinished work, server workload, and Q-estimated server workload processes corresponding to an arbitrary service discipline G satisfying (d0)–(d2) be given (appropriately constructed on a common probability space with the sequence of processes in (i)). Then, with probability 1, for any time t ≥ 0, the lim inf (in r) of the scaled server workload and of the scaled Q-estimated server workload under G are no smaller than the limiting server workload attained in (i), and the lim inf of the instantaneous cost rate Σj Cj(·) under G is no smaller than that attained in (i). As a corollary, with probability 1, for any T > 0, the corresponding lim inf of the cumulative costs over [0, T] under G is no smaller than the limiting cumulative cost attained in (i).

The proof of Theorem 2 is essentially a slightly modified (and extended) version of that of Theorem 1. It is outlined in Section 11.

8.3. Customer Workload Minimization Under the MinDrift(Q) Rule

Suppose that we are within the conditions of Theorem 2. For the initial customer workloads Xr(0), we “automatically” (by (20) and (44)) have lim supr Xr(0)/r ≤ α*·u(0). Suppose, in addition, that, in fact, Xr(0)/r converges to the same limit as qXr(0)/r:

Xr(0)/r → α*·u(0)  as r → ∞,     (50)

which is equivalent to the condition that

Qijr(0)/r → 0  as r → ∞

for every nonbasic activity (i, j).

Now, any service discipline in the OQ system satisfying conditions (d0)–(d2) is within the class of disciplines for the corresponding IQ system studied in [12] (see Remark 2 in Section 3 of this article). In particular, Theorem 1 in [12] establishes that a reflected Brownian motion (RBM) with exactly the same distribution as the RBM defined in this article by (38) is in fact a stochastic lower bound of any limit of the (diffusion-scaled) customer workload process, under any service discipline satisfying conditions (d0)–(d2) (but not necessarily IR), which includes service disciplines for both OQ and IQ systems.

However, from (20) we have the pathwise relation Xr(t) ≤ qXr(t) and, by Theorem 2(i), under the MinDrift(Q) rule the scaled Q-estimated server workload process converges to this RBM. Thus, the RBM is both a lower and an upper (stochastic) bound of any limit of the scaled customer workload process and, therefore, the scaled customer workload process itself converges in distribution to the RBM. We have proved the following result, which basically says that the MinDrift(Q) rule minimizes customer workload among virtually all service disciplines in either the OQ or IQ system.

Theorem 3: Suppose that the conditions of Theorem 2 and, in addition, condition (50) hold.

(i) For the service discipline described in Theorem 2(i), we, in addition, have that the diffusion-scaled customer workload process converges in distribution to the RBM.

(ii) The service discipline described in Theorem 2(i) asymptotically minimizes customer workload among all service disciplines satisfying conditions (d0)–(d2), in either the OQ or the IQ system; namely, the customer workload process under any such discipline G can be constructed on a common probability space with the RBM, so that, with probability 1, for any time t ≥ 0, the lim inf (in r) of the scaled customer workload under G is no smaller than the value of the RBM at time t.

8.4. A Necessary Condition for Workload Minimization Under Any Service Discipline: Vanishing Nonbasic Queues

Theorems 2 and 3 show that, roughly speaking, the MinDrift(Q) rule minimizes both the server and the customer workload in the heavy traffic limit. Theorem 4 demonstrates that (either customer or server) workload in an OQ system can be minimized by some discipline only if, under this discipline, the (diffusion-scaled) queue lengths corresponding to the nonbasic activities (i, j) vanish in the limit.

Theorem 4: Suppose that the conditions of Theorem 2 and, in addition, condition (50) hold. Suppose that under some service discipline in the OQ system satisfying conditions (d0)–(d2), either the customer workload minimality property (52) or the analogous minimality property for the server workload or for the unfinished work cost holds. Then, for any t ≥ 0, the diffusion-scaled queue lengths corresponding to the nonbasic activities vanish in the limit.

The proof is presented in Section 12.

9. FLUID SAMPLE PATHS UNDER THE MinDrift RULE

9.1. Fluid Sample Paths: Definition and Basic Properties

In this section, we study the sequence of processes introduced in the previous section under the fluid (or “law of large numbers”) scaling and under the MinDrift rule. More precisely, we need to consider only sample paths of the processes under this scaling and then their limits, which we formally define here and call fluid sample paths (FSPs). The key property of FSPs that must be established (in Theorem 5) is that as time increases to infinity, the queue length vector converges to a fixed point. This attraction property is used to prove (in Section 10) the state space collapse property (i.e., the property that the limit of the sequence of diffusion scaled processes is a process “living” on the manifold of fixed points).

Throughout Section 9 we assume the CRP condition. However, all definitions and results of this section hold under the weaker FO-CRP condition, with an arbitrary fixed φ satisfying (12) and having a connected graph; in other words, the FSP definition and key properties do not require a solution φ of (12) to be unique.

First, we introduce some additional (random) functions, associated with the process for each value of the scaling parameter r. (The functions Fir(t), Qir(t), and Xr(t), have already been defined earlier.) Denote by Gijr(t) the amount of time within [0,t] that server j was serving type i customers. For each pair (i,j) we define Nijr(n) as the number of type i arrivals actually routed to server j, out of the first n new type i arrivals, and denote

Here, for any pair (i,j), sNijr(·) ≡ sNij(·) for all r, where sNij(·) are nonrandom functions fixed earlier and satisfying conditions (28)–(30).

We define the (server workload) regulation process as follows:

Yr(t) ≐ Yidler(t) + Yrouter(t),  Yrouter(t) ≐ Σi∈I Yroute,ir(t),

where Yidler(t) ≐ Σj∈J αj*[t − Σi∈I Gijr(t)] and Yroute,ir(t) is the sum, over all type i arrivals in [0, t], of the quantities αj*/μij − νi* corresponding to the servers j to which those arrivals are routed. Function Yidler is the regulation component due to physical idleness of the servers, Yrouter is the regulation component due to the (possible) routing of customers to nonbasic servers, and the functions Yroute,ir represent the contributions to Yrouter due to the different flows i ∈ I.

The regulation component Yidler(t) is clearly nonnegative and nondecreasing. It is easy to verify that each regulation component Yroute,ir(t) is also nonnegative and nondecreasing. (Also, so is, therefore, Yrouter(t).) Moreover, Yroute,ir(t) is a piecewise constant function, which is constant between type i customer arrivals, and jumps by the value αj*/μij − νi* ≥ 0 when a type i arrival is routed to server j; note that the size of the jump is strictly positive if and only if server j is nonbasic for type i. Finally, note that Yr(t) does not increase over some time interval if and only if, during that interval, none of the servers idles and all new arrivals are routed to the corresponding basic servers.

We record the above facts (along with their obvious generalization) in the following lemma for future reference.

Lemma 4: For each value of the scaling parameter r, consider a pair of time points 0 ≤ t1r < t2r < ∞ and denote

Then B0r = 0 if and only if B1,ir = 0 for all i and B2,jr = 1 for all j. Also, limr→∞ B0r = 0 if and only if limr→∞ B1,ir = 0 for all i and limr→∞ B2,jr = 1 for all j.

Let us consider the process Zr = (Ur, uXr, Fr, ΣVr, sNr, Nr, Gr, Hr, Yr, Yidler, Yrouter), where each component denotes the corresponding collection of the functions defined above (e.g., Ur = (Ujr, j ∈ J) and Fr = (Fir, i ∈ I)). For each r, consider the fluid-scaled process

zr ≐ ΓrZr,

where the fluid scaling operator Γr is applied componentwise and acts on a scalar function Ξ = (Ξ(t), t ≥ 0) as follows:

(ΓrΞ)(t) ≐ Ξ(rt)/r,  t ≥ 0.
Definition 4: A fixed set of functions z = (u, ux, f, Σv, sn, n, g, h, y, yidle, yroute) will be called a fluid sample path (FSP) if there exist a sequence of values of r increasing to infinity and a sequence of sample paths {zr} of the corresponding processes such that, as r → ∞ along this sequence, zr → z uniformly on compact sets and, in addition, the primitive components of zr (the fluid-scaled arrival, service time, and static routing functions) converge u.o.c. to their mean-rate counterparts.

Remark: A sequence of values of r, the existence of which is required in Definition 4, may be completely unrelated to the sequence {rn} we introduced earlier in the definition of the heavy traffic regime.

The following lemma establishes some basic properties of FSPs. We omit the simple proof, which is a direct consequence of the definitions involved.

Lemma 5: For any FSP z, all of its component functions are Lipschitz continuous and, in addition,

Furthermore, both y(·) and ux(·) are nondecreasing (with y(0) = 0).

Since all component functions of an FSP are Lipschitz, they are absolutely continuous, and therefore for almost all points t ∈ R+ (with respect to the Lebesgue measure), the following property holds:

Each component function of z has (finite) first derivative at t, and each function nij(·) has (finite) first derivative at λi t.

We refer to such time points t as regular. We adopt a convention that t = 0 is not a regular point (i.e., in the definition of regular points, we require that proper derivatives exist).

The dynamics of u(t) satisfies the following differential equation and additional conditions at every regular point t:

where the components of the J-dimensional vectors ρin(t) and ρout(t) are defined as

and for ρout, we have

9.2. Uniform Attraction of Fluid Sample Paths

For u ∈ R+J, denote

Φ(u) ≐ 1 − * A(u)/*A(u) if u ≠ 0, and Φ(0) ≐ 0 by convention.

Consider the following functions associated with a fixed FSP. First, define

and, similarly, J*(t) (with *A replaced by * A). Next, introduce

and note that *uj(t) is well defined since each function Cj′(·) is strictly increasing and continuous. Let *x(t) ≐ α*·*u(t), where *u(t) = (*u1(t),…,*uJ(t)), and note that ux(t) ≤ *x(t) for all t ≥ 0.
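To fix ideas, the surrounding text (in particular, the equality of the derivatives d/dt[*A(u(t))] and d/dt[Cj′(uj(t))/αj*] for j ∈ J*(t), used in the proof of Lemma 7) is consistent with the following reading of the omitted displays; we record it as an assumption about the notation, not as a quotation:

\[
{}^{*}A(u) \;=\; \max_{j\in J}\frac{C_j'(u_j)}{\alpha_j^{*}},
\qquad
{}_{*}A(u) \;=\; \min_{j\in J}\frac{C_j'(u_j)}{\alpha_j^{*}},
\qquad
\Phi(u) \;=\; 1-\frac{{}_{*}A(u)}{{}^{*}A(u)},
\]

with J*(t) the set of servers attaining the maximum at u(t) (and its counterpart, defined with * A in place of *A, the set attaining the minimum), with *uj(t) defined by Cj′(*uj(t)) = αj* · *A(u(t)), and with *x(t) = α*·*u(t). Under this reading, *uj(t) ≥ uj(t) for every j, which yields the inequality ux(t) ≤ *x(t) noted above.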

Finally, note that, at any time t, the following five conditions for u(t) are all equivalent:

  1. u(t) is a fixed point.
  2. *A(u(t)) = * A(u(t)).
  3. Φ(u(t)) = 0.
  4. *x(t) = ux(t).
  5. *u(t) = u(t).

The following sequence of lemmas establishes further properties of fluid sample paths, which are less obvious than the basic properties of Lemma 5. The form of the MinDrift rule is used in the proofs in an essential way.

Lemma 6: Consider a fixed FSP z. Suppose t > 0 is a regular point and u(t) ≠ 0. Then the following properties hold at this t:

(i) uj(t) > 0 for all j ∈ J.

(ii) We have

(iii) Moreover, there exists a constant ε1 > 0, which depends only on the system parameters, such that if, in addition, u(t) is not a fixed point (i.e., *A(u(t)) > * A(u(t))), then

Proof: Let us first prove (iii). Thus, consider a regular time point t > 0 and suppose that *A(u(t)) > * A(u(t)). The following observation holds:

Indeed, according to the MinDrift rule and Lemma 3(ii)(c), for all sufficiently large r, the prelimit path zr is such that, in a small interval [t,t + ε], ε > 0, newly arriving customers of type i cannot be routed to a server j ∈ J*(t) \ Ji. This easily implies that the corresponding FSP component nij(·) cannot increase in a small interval to the right of λi t, and therefore nij′(λi t) = 0 since t is regular. Using a similar argument, it is also easy to prove the following property:

Let us denote by I*(t) the (nonempty) subset of types i such that Ji ∩ J*(t) ≠ ∅. Since the graph

is connected, there exists at least one i ∈ I*(t) such that Ji \ J*(t) ≠ ∅, in which case (by (65)) we have the strict inequality:

If i ∈ I*(t) and Ji \ J*(t) = ∅ (i.e., Ji ⊆ J*(t)), then

Indeed, using (64), the fact that αj* μij−1 is the same across all j ∈ Ji, the inequality ∑j∈J nij′(λi t) ≤ 1, and the relation ∑j∈Ji μij φij/λi = 1, we can write

As a corollary of (64), we also obtain the following property:

We will now show that

where ε > 0 depends only on the subset J*(t). Indeed,

and (using (68), (66), and (67)) we have

We have proved (69), with ε > 0 depending only on the subset J*(t) ⊂ J. Since there are only finitely many subsets of J, we have proved the first inequality in (63), with some fixed ε1 > 0.

The second inequality in (63) is proved analogously. We denote by I*(t) the (nonempty) subset of types i such that Ji ∩ J*(t) ≠ ∅. Then we use the following property (obtained by an argument analogous to that leading to (64) and (65)):

We omit details.

The proof of the nonstrict inequalities in property (ii) is a straightforward extension of the proof of (iii); namely, we need to consider the additional (degenerate) case when J*(t) = J*(t) = J. In this case, to prove, for example, the first inequality in (62), we observe that the nonstrict inequality (67) always applies, and (70) holds with the strict inequality replaced by a nonstrict one.

Finally, (i) is proved by contradiction. Suppose uj(t) = 0 for some j ∈ J. Obviously, the set of such j is exactly J*(t). Since u(t) ≠ 0, u(t) is not a fixed point. Therefore, the second inequality in (63) should hold. However, this is impossible, because we must have uj′(t) = 0 for all j ∈ J*(t). Indeed, the condition uj(t) = 0 and the existence of uj′(t) imply that uj′(t) = 0. (Otherwise, uj(·) would be negative just before or right after time t.) █

Lemma 7: Consider a fixed FSP. Suppose a time interval [t1,t2], with 0 ≤ t1 < t2, is such that

Then, over [t1,t2], the functions *A(u(t)), * A(u(t)), *x(t), and *uj(t) for all j ∈ J are Lipschitz continuous. Moreover, for almost all t ∈ [t1,t2],

and if, in addition, *A(u(t)) > * A(u(t)) (i.e., u(t) is not a fixed point), then

where ε1 > 0 is defined in Lemma 6.

Proof: First, the Lipschitz continuity of each function Cj′(uj(t)) in [t1,t2] follows from the Lipschitz continuity of uj(·) and the fact that, for the range of possible values of uj(t) in [t1,t2], Cj′′(·) is continuous and bounded away from both infinity and zero. (This is the only place where we use the assumption that the functions Cj(·) are twice continuously differentiable.)

This implies that for an arbitrary fixed subset

, the following functions are also Lipschitz continuous in [t1,t2]:

In particular, *A(u(t)) and * A(u(t)) are Lipschitz, which (along with the fact that the second derivatives Cj′′(·) are bounded away from zero) implies that all *uj(t) and *x(t) are Lipschitz. We see that almost all points t ∈ [t1,t2] are regular (as defined earlier) and, in addition, are such that all of the max and min functions in the last display, for all (nonempty) subsets

, have derivatives. Within the present proof, let us call such points t strictly regular.

Consider an arbitrary strictly regular point t ∈ [t1,t2]. The proof will be complete once we prove (71) and (72) for this point t. Since t is strictly regular, the derivatives d/dt[*A(u(t))] and d/dt[Cj′(uj(t))/αj*] for j ∈ J*(t) are all equal. (In particular, this implies that *uj′(t) = uj′(t) for all j ∈ J*(t).) We cannot have d/dt[*A(u(t))] > 0 because this would imply that uj′(t) > 0 for all j ∈ J*(t), which would contradict (62). This proves the first (and, with it, the last) inequality in (71). The second inequality in (71) is proved analogously.

We can now write

where the inequality follows from the fact that *uj′(t) ≤ 0 for all j ∈ J (which is implied by (71)), and the second equality holds because *uj′(t) = uj′(t) for j ∈ J*(t). In the case *A(u(t)) > * A(u(t)), by (63), the RHS of the above display is bounded above by −ε1, which proves (72). █
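A plausible form of the omitted display, matching the justification just given (and stated under the notational reading sketched after the definition of Φ), is

\[
\frac{d}{dt}\,{}^{*}x(t)
\;=\;\sum_{j\in J}\alpha_j^{*}\,{}^{*}u_j'(t)
\;\le\;\sum_{j\in J^{*}(t)}\alpha_j^{*}\,{}^{*}u_j'(t)
\;=\;\sum_{j\in J^{*}(t)}\alpha_j^{*}\,u_j'(t),
\]

the inequality holding because the terms dropped are nonpositive.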

Lemma 8: Consider a fixed FSP z. Suppose u(t1) ≠ 0 for some t1 ≥ 0. Then u(t) has all strictly positive components (i.e., u(t) ∈ R++J) for all t > t1. Moreover, in [t1,∞), * A(u(t)) is nondecreasing, and both *A(u(t)) and *x(t) are nonincreasing.

Proof: Indeed, we can always find a regular point ξ > t1 arbitrarily close to t1 so that u(ξ) ≠ 0. By Lemma 6, u(ξ) ∈ R++J. Then, using Lemma 7, it follows that * A(u(t)) is nondecreasing (and *A(u(t)) and *x(t) are nonincreasing) starting from time ξ, and therefore u(t) ∈ R++J for all t ≥ ξ. Since ξ can be chosen arbitrarily close to t1, the proof is complete. █

Lemma 9: Consider a fixed FSP z. If u(0) = 0, then u(t) = 0 for all t ≥ 0.

Proof: Suppose not. By continuity of *x(·), for an arbitrary ε > 0, there exists a time t1 > 0 at which *x(t) reaches level ε for the first time. Of course, u(t1) ≠ 0. By Lemma 8, *x(t) cannot increase starting at time t1, and therefore *x(t) ≤ ε for all t ≥ 0. Since ε > 0 can be chosen arbitrarily small, *x(t) ≡ 0, and therefore u(t) = 0 for all t ≥ 0. █

The following theorem easily follows from the lemmas presented earlier in this subsection.

Theorem 5: For any fluid sample path, Φ(u(t)) is a nonincreasing function, and the server workload ux(t) is a nondecreasing function. Moreover, there exist fixed constants T1 > 0 and K ≥ 1 such that, for any FSP, u(t) reaches a fixed point u within the finite time ux(0)T1 and then stays there, with α*·u ≤ ux(0)K.

Proof: The fact that ux(t) is nondecreasing has already been established earlier (Lemma 5).

Suppose u(0) ≠ 0. By Lemma 8, * A(u(t)) is nondecreasing and *A(u(t)) is nonincreasing in [0,∞), and therefore Φ(u(t)) is nonincreasing. Further, by Lemma 8, u(t) ∈ R++J for all t > 0. Then, by Lemma 7, for almost all t > 0, *x(t) > ux(t) implies

Since *x(0) ≤ ux(0)[∑j αj*]/[minj αj*], ux(t) ≤ *x(t), and ux(t) is nondecreasing, we immediately see that u(t) must reach a fixed point within a time proportional to ux(0).
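To make the last step concrete (a sketch only; the constants below are one admissible choice and not necessarily those appearing in the omitted displays): while u(t) is not a fixed point, we have *x(t) > ux(t), and (72) combined with the monotonicity of ux(·) gives

\[
\frac{d}{dt}\bigl({}^{*}x(t)-u_x(t)\bigr)\;\le\;-\varepsilon_1 .
\]

Hence a fixed point is reached no later than time (*x(0) − ux(0))/ε1 ≤ *x(0)/ε1 ≤ ux(0)[∑j αj*]/[ε1 minj αj*], so one may take T1 = [∑j αj*]/[ε1 minj αj*]; moreover, at that time α*·u = ux(t) ≤ *x(0) ≤ ux(0)K with K = [∑j αj*]/[minj αj*].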

Therefore, the statement of the theorem, with some fixed T1 > 0 and K ≥ 1, holds for the FSPs with u(0) ≠ 0. By Lemma 9, it trivially holds for u(0) = 0 as well. █

For future reference, we record the following property of prelimit paths.

Lemma 10: There exists a constant ε2 > 0 such that the following holds. For any prelimit (scaled) path ur = (ur(t), t ≥ 0), and 0 ≤ t1r < t2r < ∞, the property

implies that yr(t2r) − yr(t1r) = 0 or, equivalently, that in the (scaled) interval (t1r,t2r], all newly arriving customers are routed to their corresponding basic servers and none of the servers idles.

Proof: First, since ur(t) ≠ 0 in [t1r,t2r], none of the servers idles in this time interval. Second, a small value of Φ(ur(t)) implies that the vector (C1′(u1r(t)),…,CJ′(uJr(t))) is “almost proportional” to the vector α*. Thus, if Φ(ur(t)) is small, it follows from the form of the MinDrift(U) rule and Lemma 3(ii)(c) that, in [t1r,t2r], a new arrival of any type i ∈ I can only be routed to a server j ∈ Ji. We omit the ε-δ formalities. █

10. PROOF OF THEOREM 1

For each

, consider the following process, obtained by diffusion scaling:

where the diffusion scaling operator

is defined in (36).

To prove the properties stated in Theorem 1, it will suffice to show that for any subsequence

there exists another subsequence

such that these properties hold when r → ∞ along

. As in [14], to do this, we will choose subsequence

and construct all processes (for all

) on the same probability space in such a way that the desired properties hold with probability 1 (or are implied by certain probability 1 properties).

Let us fix an arbitrary subsequence

of indices {r}. According to Skorohod's representation theorem (see, for example, [7]), for each i, the sequences (on r) of the processes {Fir} and {ΣVijr}, j ∈ J, can be constructed on a probability space such that the convergence in (31) holds u.o.c. with probability 1 (w.p.1):

where Bi is a standard Brownian motion.

We can and do assume that our underlying probability space Ω = {ω} is a direct product of the above I probability spaces. (Without loss of generality, we assume that this probability space is complete.) On this probability space the convergence (33) holds u.o.c. w.p.1 as well:

where B is a standard Brownian motion.

Now, from condition (25) and Bramson's weak law estimates ([4, Prop. 4.3]), we know that for any T3 > 0, any ε > 0, and any i, for all large r, we have (see the proof of property (5.19) in Proposition 5.1 of [4])

(The max in (75) and (76), as well as in (77) and (78), is over integers l ∈ [0,T3 r].) Also, using Proposition 4.2 of [4], it is easy to show (similarly to the derivation of property (71) in [14]) that for any T3 > 0, any ε > 0, and any pair (i,j), for all large r, we have

Estimates (75) and (76) enable us to choose a subsequence

, such that as r → ∞ along

, with probability 1, for any T3 > 0 we have

and

Properties (77) and (78), in turn, imply the following property.

With probability 1, for any fixed T4 > 0 and d > 0, for any (i,j), we have the following:

Uniformly on any sequence of pairs

, such that 0 ≤ t1r < t2r ≤ r2T4 and t2r − t1r ≥ rd,

Uniformly on any sequence of pairs

, such that 0 ≤ l1r < l2r ≤ r2T4 and l2r − l1r ≥ rd,

For each j ∈ J, we have

and therefore the expression for the scaled server workload can be written as

where

is the term (82),

denotes the term (83),

denotes the term (84), and

. We know from (74) that

where

B(·) is the realization of a standard Brownian motion, and the parameters a and σ are those defined in (34). (The realization

is, of course, continuous.) As seen from (85), the key step in proving Theorem 1 will be the proof of the following convergence:

In the rest of this section, we restrict ourselves to a (measurable, probability 1) subset Ω2 ⊆ Ω of elementary outcomes ω such that all of the probability 1 properties specified above hold when r → ∞ along

.

Lemma 11: Consider a fixed ω ∈ Ω2. As r → ∞ along

, the functions

, and then

, are “asymptotically close” in the following sense: for any fixed T4 > 0 and any fixed δ1 > 0 and δ2 > 0, for all sufficiently large r, uniformly in t ∈ [0,T4],

and then

The proof of Lemma 11 is analogous to the proof of Lemma 9 in [12]. The key observation here (which follows from (79) and (80)) is that, for fixed T4 > 0 and (arbitrarily small) d > 0, if t ∈ [0,T4] and |Hijr(Fir(r2t))| ≥ rd, then for all sufficiently large r, the ratio

is close to unity. We do not present the details. We note that (analogous to the situation with Lemma 9 in [12]) Lemma 11 applies to any service discipline satisfying condition (d0), and the uniqueness of φ (in the CRP condition) is used in the proof of Lemma 11 in an essential way.

It follows from Lemma 11 that to prove (86), it suffices to prove

because

is bounded on finite intervals.

Since regulation

is a nondecreasing function (for any r), for any fixed ω ∈ Ω2, from any subsequence

(which may depend on ω!), it is always possible to find a further subsequence

such that

where

is some nondecreasing RCLL function. (We will prove that this limit

is indeed the regulation of the one-dimensional Brownian motion defined earlier.) In principle,

may take the value +∞. (In other words,

.) (Recall that the notation “⇒” stands for convergence at every point of continuity of the limit function, except maybe at the point 0.) We note that (90) implies that

and therefore

if and only if

.

The following lemma (and its proof) is analogous to Lemma 7 in [14] and Lemma 10 in [12]; it contains key observations used in the proof of Theorem 1. The key construction of the proof, which involves “slowing down” the diffusion-scaled process to consider a family of processes on the “fluid” time scale and then exploiting the uniform attraction property of fluid sample paths, is essentially the same as that in Section 5 of [4]. This construction is central to establishing state space collapse (SSC) in the heavy traffic asymptotic regime for multiclass queuing networks (see [4,19]).

Lemma 12: Suppose that the service discipline is such that the routing rule is MinDrift(U) and the scheduling rules at the servers are work-conserving. Suppose that ω ∈ Ω2 and a subsequence

are fixed such that, along this subsequence, (90) holds. Suppose a sequence

is fixed such that

Let δ > 0 be fixed and

Then the following hold:

(a)

is finite in [0,t′ + δ].

(b)

does not increase in

.

(c) The following bound holds

with K defined in Theorem 5.

(d) For any δ′ > 0,

where

is the (unique) fixed point such that

.

If, in addition,

for all r, and

, where

is a fixed point (necessarily, with

), then the following hold:

Proof: The proof essentially repeats that of Lemma 10 in [12]. For completeness, and since some adjustments are required, we present it here.

Let us consider the functions of interest on the fluid time scale; namely, consider the earlier defined functions

, and the similarly defined function wr and other related ones.

Let us choose a fixed T > 0 as follows. Let us fix ε3 ∈ (0,C − ε), denote

and fix arbitrary

where K and T1 are the constants defined in Theorem 5. As seen later in the proof, C3 will be the upper bound of

in the interval

or, equivalently, the upper bound of uxr(·) in the interval

. Thus, the choice of the constant T is such that an FSP with initial server workload not exceeding C3 will converge to a fixed point within time T.

For each integer l ∈ [0,2δr/T], consider

and the similarly defined wr,l, yr,l, and other related functions.

Let us fix an arbitrary ε4 ∈ (0, ε2/2), where ε2 is defined in Lemma 10. Then the following property holds.

Property 1: For all sufficiently large r, relation (92) below holds for all integer l ∈ [0,2δr/T], and relations (93)–(95) hold for all integer l ∈ [1,2δr/T]:

To prove Property 1, we first observe that (92) must hold for l = 0 for all large r, because otherwise we would be able to choose a subsequence of indices r along which the sequence of paths zr,0 converges to an FSP z with ux(0) = C and either ux(ξ) > CK or ux(ξ) < C for some ξ ∈ [0,T], which contradicts Theorem 5. Moreover, this observation shows that in fact for all large r,

and, given our choice of the constant T,

Next, suppose Property 1 does not hold. Then we can choose an (infinite) subsequence of r such that (along this subsequence) l′ = l′(r) is well defined as the smallest l ≥ 1 such that one of the conditions (92)–(95) does not hold. (Note that, by this construction and (97), property (93) always holds for l = l′ and ξ = 0.) We will show that this construction leads to a contradiction.

Indeed, for all large r, both (92) and (93) hold for l = l′ and ξ = 0. This follows from the combination of the following facts:

  1. Property (96).
  2. as long as ξ1, ξ2 ∈ [t′ − 3δ,t′ + 3δ] ∩ R+.
  3. uniformly in [t′ − 3δ,t′ + 3δ] ∩ R+.
  4. Property (95) for each 1 ≤ ll′ − 1.
  5. The functions
    are asymptotically close (in the sense of Lemma 11).

Since (92) and (93), with l = l′ and ξ = 0, hold for large r, we see that (93) and (94) hold for l = l′ and all large r. (Otherwise, we would be able to choose a subsequence of r along which zr,l converges to an FSP z, violating Theorem 5.) Similarly, the lower bound in (92) must hold (for large r) for l = l′ and ξ ∈ [0,T]. This, in conjunction with (94) and Lemma 10, means that (95) holds for l = l′ (for large r). Finally, this and the argument we already used to prove bound (92) for l = l′, ξ = 0, show that in fact (92) holds for l = l′ and all ξ ∈ [0,T] (for large r). We have proved that (92)–(95) hold for l = l′(r) for all large r. This contradicts the construction of the function l′ = l′(r), which proves Property 1.

Property 1 (namely (92)) implies that, for all large r,

Statements (a)–(c) of the lemma follow from this estimate.

To prove (d), we first notice that (a), (b), and Lemma 11 imply the following uniform convergence for the workload process:

Statement (d) then follows from Property 1, the fact that ε4 can be chosen arbitrarily small, and convergence (98).

To prove properties (c′) and (d′), we use exactly the same construction. It is easy to see that, under the additional assumptions, all conditions (92)–(95) in Property 1 hold for all integers l ∈ [0,2δr/T] (including l = 0). Given this, properties (c′) and (d′) are proved analogously to properties (c) and (d). We omit the details. █

The rest of the proof of Theorem 1 repeats that of Theorem 1 in [14], and that of Theorem 1 in [12], virtually verbatim. We reproduce it here, with the necessary minor adjustments, for completeness.

10.1. Proof of Theorem 1(i)

To prove this part it suffices to prove the following:

Property 2: As r → ∞ (along

), for any ω ∈ Ω2 (i.e., with probability 1), we have the following convergence:

where

is defined by (39), and

where for each

is the fixed point such that

.

Proof of Property 2: Let us fix ω ∈ Ω2. As explained earlier, for an arbitrary subsequence

there exists another subsequence

such that the convergence (90) holds along this subsequence. Then, the proof of Property 2 will be complete if we can prove the following statements (for the chosen ω, with r → ∞ along

. We recall that, at this point, the function

is just some limit function—the fact that it is equal to the function defined by (39) is what needs to be proved in order to establish (100).

Step 1. The limit function

is finite everywhere in [0,∞).

Step 2. The function

is continuous, and

.

Step 3. If

, then t is not a point of increase of

.

Step 4. The function

, defined above as a limit, satisfies (39).

Step 5. Convergence (100) holds.

In this proof, we will use the convention that

. So, the case

will be viewed as a discontinuity of y (and x) at 0. Also, we will use the notation

Proof of Step 1. Suppose the statement does not hold. Denote

. The inf is attained because

is RCLL.

We choose a small δ such that δ ∈ (0,t*) if t* > 0, and an arbitrary δ > 0 if t* = 0. Let us fix ε = ε(4δ,t*). Then we choose a small Δt ∈ (0,δ) and a large C such that

. We define

and choose a further subsequence of {r} such that

(We must have t′ ≤ t*, because the limit function

, and therefore

, is infinite for all t ≥ t*.) It is also easy to see (from (77)) that

The conditions of Lemma 12 are satisfied, and so y is bounded in [t′,t′ + δ], a contradiction, since t′ + δ > t*. Step 1 has been proved.

Proof of Step 2. Suppose that the statement does not hold. The contradiction is obtained in much the same way as in the proof of Step 1. Let t* be a discontinuity point (the case t* = 0 is included) (i.e.,

). Since

is continuous,

. There are two possible cases:

Case (a). In this case, we must have t* > 0. (Indeed, by the definition of

and our conventions,

. If

, then, by Lemma 12(c′),

, which means that

, and therefore

, has no jump at 0. If

, then

.) We can always fix a small δ > 0 and small Δt ∈ (0,δ), such that t′ = t* − Δt is a point of continuity of

(and

) and

. We have convergence

(since

is continuous at t′), and by Lemma 12,

cannot increase in the interval (t′,t′ + δ], which contains t*. So,

cannot have a jump at t*.

Case (b). In this case, let us fix a small C > 0 and then a sufficiently small δ > 0 so that

where ε = ε(4δ,t*) and K ≥ 1 is defined in Theorem 5 (and used in Lemma 12). Then if t* > 0, we fix a small Δt such that

If t* = 0, we fix an arbitrary Δt > 0. We define

and choose a further subsequence of {r} such that

The conditions of Lemma 12 are satisfied, and so

for all t ∈ [t′,t′ + δ], which contradicts the assumption of case (b), since t* belongs to the latter interval. Step 2 has been proved.

Proof of Step 3. Let t* ≥ 0 be such that

. If t* = 0, then the fact that

does not increase in a small interval [0,δ] follows from Lemma 12(b′). If t* > 0, then precisely the same construction as in the proof of Step 2(a) shows that

does not increase in a small interval [t′,t′ + δ] containing t* in its interior. Step 3 has been proved.

Proof of Step 4. Follows from the statements of Steps 2 and 3 and Proposition 1 (in the Appendix).

Proof of Step 5. It suffices to show the following:

For any t* ≥ 0 and any ε > 0, there exists δ > 0 such that

(The u.o.c. convergence will then follow from the Heine–Borel lemma.)

If

, then (101) must hold because both functions

(for large r) are bounded by an arbitrarily small constant in a sufficiently small neighborhood of t*. If

, then (101) follows from Lemma 12(d′). If

, then to obtain (101) we can repeat the construction of the proof of Step 2(a) and then apply Lemma 12(d). Step 5 has been proved.

Thus, the proof of Property 2, and with it the proof of statement (i) of the theorem, is complete. █

10.2. Proof of Theorem 1(ii)

We use the same construction of the probability space Ω, the subsequence

, and the probability 1 subset Ω2, as specified earlier. Consider an arbitrary discipline G. Sample paths for both the MinDrift and G disciplines are constructed on this common probability space. For ω ∈ Ω2, consider paths of

, corresponding to the discipline G. Since

is invariant with respect to the discipline,

, and therefore

u.o.c.

We claim that, along the subsequence

, for any t ≥ 0,

and therefore (40) holds. To prove this, we first recall that Lemma 11 holds for any discipline G satisfying condition (d0).

For any subsequence

, we can choose a further subsequence

such that

, where

is some nondecreasing nonnegative RCLL function. (The case that

takes value +∞ starting from some finite time t* is possible.) Therefore, for any t ≥ 0 where

is continuous, as r → ∞ along

,

Since

is nonnegative, we see that

is nonnegative at every point of continuity of

, and therefore it is nonnegative for all t ≥ 0 (by right-continuity). Then, by Proposition 1(ii) (in the Appendix),

for all t ≥ 0. Since

is continuous and nondecreasing and

and all

are nondecreasing, for any t ≥ 0 we obtain the uniform bound

By Lemma 11, we have an analogous bound for

as well:

which proves (102), with r → ∞ along subsequence

, and therefore along

as well (since the subsequence

can be arbitrary). The proof of (102) (and therefore (40)) is complete.

Since the function ∑j Cj(uj) is continuous in the vector u, and the fixed point

in (41) minimizes the value of ∑j Cj(uj) over vectors u with server workload

, property (41) also holds. Finally, the equality in (42) follows from the fact that

u.o.c., and the inequality follows from (41) and Fatou's lemma.

The proof of Theorem 1 is now complete.

11. PROOF OF THEOREM 2

The proof of Theorem 2 is a relatively straightforward extension of that of Theorem 1. The extension is based on the fact that, given the assumptions of Theorem 2(ii), the processes

are in fact “asymptotically close” (see (103)). As a result, the behavior of the system (in the diffusion limit) under MinDrift(Q) is the “same” as that under MinDrift(U). In this section, we provide a detailed sketch of such a proof extension. We believe the details can easily be filled in by the reader.

Construction of the probability space and subsequences

. For the proof of Theorem 2, we assume that, for each r, the service times of the “initial customers” of type i at server j, whose service has not yet started at the initial time 0, are given by an i.i.d. sequence vijr(n), n = 1,2,…. This sequence is separate from the sequence defining the service times of type i customers arriving at server j after time 0, but, of course, its elements have the same distribution. We denote by

the total amount of unfinished work “contained” in the first n (in the order in which they are taken for service) initial type i customers at server j. As with the other functions, we extend the domain of ΣVijr(·) to all nonnegative real t and denote its fluid-scaled version by Σvijr = Γr ΣVijr.

The underlying probability space is the same as in the proof of Theorem 1, except it is augmented by taking a direct product with the space on which the sequences (on r) of the processes {ΣVijr} are defined. The subsequence

is defined exactly the same way. The property analogous to (78) holds for the processes Σvijr, as well as Σvijr. Then, the subsequence

can be chosen in a way such that, additionally, the properties analogous to (78) and (80) hold for the processes Σvijr and ΣVijr, respectively.

“Asymptotic closeness” of

. Using (78) and (80), and their analogs for Σvijr and ΣVijr, it is easy to demonstrate the following property, which holds for any service discipline within the class specified in Theorem 2(ii):

As r → ∞ along

, with probability 1, for any T3 > 0 and any ε > 0, we have

This property is the key to showing that, in the heavy traffic limit, MinDrift(Q) induces the same system behavior as MinDrift(U).

Definition and properties of the FSPs under MinDrift(Q). The process Zr is augmented by the following components:

The fluid-scaled process zr and an FSP z are augmented by the corresponding components qr, Σvr, qur, qxr, and q, Σv, qu, qx, respectively. The definition of the FSP is the same, except it includes the additional conditions

and an analog of (57) for Σvijr. This augmented definition of an FSP easily yields the following additional FSP property (which can be added into Lemma 5):

which, of course, also implies qx(t) = ux(t), t ≥ 0. Using (104), it can be easily shown that all of the FSP properties established for MinDrift(U) hold for MinDrift(Q) as well.

Proof of Theorem 2. Given property (103) and the fact that FSPs under MinDrift(Q) satisfy all of the properties of FSPs under MinDrift(U) (plus (104)), the rest of the proof is the same as that of Theorem 1.

12. PROOF OF THEOREM 4

Before proceeding with the proof, note that, in addition to (51), we have another pathwise relation (see (21) and (22)):

Proof: The argument leading to Theorem 3 shows that, given the conditions of Theorem 4, either of the convergences (53) or (54) implies (52). So, it will suffice to prove that (52) implies (55).

Assume (52). Suppose, for some t1 ≥ 0, property (55), with t = t1, does not hold. This implies that, in addition to the pathwise inequality

(see (51)), we have, for some fixed constant c > 0,

Consider a subsequence of indices r along which the distributions of both

(weakly) converge to some distributions, which we denote η and qη, respectively. Distribution η is necessarily equal to the distribution of

, and qη (stochastically) dominates η and is not equal to η, which follows from (106).

Consider the processes

restarted at time t1. Let us fix arbitrary t2, t1 < t2 < ∞. Then we have

where

Since pathwise inequality (105) holds, we see that (107) holds for the process

as well:

Consider now an RBM

with the drift a and diffusion coefficient σ (same as for the RBM

), defined within interval [t1,t2], with the initial distribution qη at time t1. Since qη strictly dominates η,

Using Theorem 2(ii), it is easy to see that the RBM

is an asymptotic (stochastic) lower bound of the sequence of processes

(in the sense specified in Theorem 2(ii)). From this fact we see that for any δ > 0, we can choose a sufficiently small ε > 0, so that

If we fix δ ∈ (0, p1 − p2) and a corresponding ε > 0 as above, we obtain the following from estimates (110) and (108):

a contradiction, which completes the proof. █

13. STABILITY VERSUS HEAVY TRAFFIC WORKLOAD MINIMIZATION

Consider a special case of the MinDrift routing rule, with cost functions Cj(ζ) = (½)ζ2 (see Section 4.4). The corresponding MinDrift(Q) rule is as follows: route an arriving type i customer to a server j such that

Our heavy traffic results apply to this rule. (The form of the rule does not change with r, as explained in Section 8.)
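To make the rule concrete, the following is a minimal Python sketch of this routing decision (it is not pseudocode from the paper). It assumes that the omitted display above takes the natural quadratic-cost form, routing a type i arrival to a server j minimizing the queue-length-based workload estimate ∑k Qkj/μkj divided by μij; the eligibility test μij > 0 and the tie-breaking by smallest server index are illustrative assumptions as well.

def mindrift_q_route(i, Q, mu):
    """Quadratic-cost MinDrift(Q) routing sketch.

    Q[k][j]  : current queue length of type k at server j
    mu[k][j] : service rate of type k at server j (0 marks an infeasible activity)
    Returns the index of the server chosen for an arriving type i customer.
    """
    best_j, best_val = None, float("inf")
    for j in range(len(mu[i])):
        if mu[i][j] <= 0.0:
            continue  # server j cannot serve type i at all
        # queue-length-based estimate of server j's unfinished work
        u_hat = sum(Q[k][j] / mu[k][j] for k in range(len(Q)) if mu[k][j] > 0.0)
        # C_j'(.) is the identity for C_j(z) = z^2/2, so the routing index is u_hat / mu_ij
        val = u_hat / mu[i][j]
        if val < best_val:
            best_j, best_val = j, val
    return best_j

# Illustrative numbers only: 2 types, 3 servers.
Q = [[4, 0, 2], [1, 3, 0]]
mu = [[1.0, 0.5, 2.0], [0.0, 1.0, 1.0]]
print(mindrift_q_route(0, Q, mu))  # prints 2 for these numbers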

Using the approach of [1,2,6,14,15], it is not hard to show that under this routing rule (plus arbitrary scheduling rules satisfying (d1) and (d2)), both the queue length process ((Qij(t), i ∈ I, j ∈ J), t ≥ 0) and the unfinished work process ((Uj(t), j ∈ J), t ≥ 0) are stable, as long as the vector of mean rates λ is within the system stability region

, defined in Section 5. In fact, as explained in [14], the analysis of the FSPs (Section 9), required to establish the heavy traffic results, is essentially a “superset” of the analysis needed to prove stability. (We do not provide details of the stability proof, as it is not the focus of this article.)

Thus, the above rule is able to both keep queues stable (as long as

) and minimize system workload in the heavy traffic limit. The Gcμ scheduling rule for the IQ system possesses the same property (see [12]) and so does the MaxWeight scheduling rule for a different, but closely related, “generalized switch” model (see [14]). All of these results may suggest the intuition that a dynamic service discipline that keeps queues stable (as long as

), “typically” will also minimize system workload in heavy traffic. Such a “conjecture” cannot be formally correct, because it is not hard to devise some contrived disciplines, for which it does not hold. We note, however, that this conjecture does not hold even for very natural service disciplines, as the following example demonstrates.

Consider a service discipline for our OQ system, which strives to minimize the drift of the cost (Lyapunov) function

Then the discipline has the following form. (It is close to the class of network scheduling disciplines introduced in [15].)

Routing rule (“Join the shortest queue of your type”): Route an arriving type i customer to a server j such that

Scheduling rule (“Gcμ within each server”): Server j picks a customer of type i such that

This discipline ensures stability of the queues, when

. (Again, the approach and techniques of [1,2,6,14,15] can be applied.)
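The two rules above can be illustrated in the same way; the Python sketch below is again an assumption-laden illustration rather than the paper's pseudocode. It takes the omitted routing display to mean “route to a server j with μij > 0 minimizing Qij” and the omitted scheduling display to be the quadratic-cost Gcμ weight “serve a type i maximizing μij Qij”; the tie-breaking is arbitrary.

def jsq_route(i, Q, mu):
    """'Join the shortest queue of your type': smallest Q[i][j] among feasible servers."""
    feasible = [j for j in range(len(mu[i])) if mu[i][j] > 0.0]
    return min(feasible, key=lambda j: Q[i][j])

def gcmu_schedule(j, Q, mu):
    """'Gcmu within server j' (quadratic costs): serve the type maximizing mu[i][j] * Q[i][j]."""
    candidates = [i for i in range(len(Q)) if mu[i][j] > 0.0 and Q[i][j] > 0]
    if not candidates:
        return None  # server j idles only when all of its queues are empty
    return max(candidates, key=lambda i: mu[i][j] * Q[i][j])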

However, it is not hard to see that, under this discipline and under the conditions of Theorem 4, condition (55) cannot possibly hold, as long as μij > 0 for at least one nonbasic activity (i,j). We do not provide a formal proof here. The key part of such a proof is showing the following very intuitive fact, which is implied by the nature of the routing rule: FSPs under this discipline are such that if the initial workload is nonzero, then after some finite time, all nonbasic queue lengths are bounded away from zero. We also exploit the fact that, since the limiting workload process is lower bounded by an RBM, at any time t > 0 the limiting workload is nonzero with nonzero probability. Thus, by Theorem 4, none of the workload minimization properties (52)–(54) can hold.

Acknowledgment

The author is very grateful to Avi Mandelbaum for bringing output-queued flexible server systems to his attention and encouraging this work.

APPENDIX: The One-Dimensional Skorohod Problem

The following proposition describes standard properties of solutions to the one-dimensional Skorohod problem. (See, for example, [5] for the proof; the proof is also contained in the proof of Theorem 5.1 of [18].)

Proposition 1: Let w = (w(t), t ≥ 0) be a continuous function in D([0,∞),R) such that w(0) ≥ 0. Then the following hold:

(i) There exists a unique pair (x,y) of functions in D([0,∞),R) such that the following hold:

(a) x(t) = w(t) + y(t) ≥ 0, t ≥ 0.

(b) y is nondecreasing and nonnegative.

(c) y(0) = 0.

(d) For any t ≥ 0, if x(t) > 0, then t is not a point of increase of y; that is, there exists δ > 0 such that y(ξ) is constant in [t − δ,t + δ] ∩ R+. This unique pair is (x,y), where

(ii) For any pair (x,y) of functions in D([0,∞),R) satisfying (a) and (b), we have

REFERENCES

Andrews, M., Kumaran, K., Ramanan, K., Stolyar, A.L., Vijayakumar, R., & Whiting, P. (2004). Scheduling in a queueing system with asynchronously varying service rates. Probability in the Engineering and Informational Sciences 18: 191–217.
Armony, M. & Bambos, N. (1999). Queueing networks with interacting service resources. In Proceedings of the 37th Annual Allerton Conference, pp. 42–52.
Bell, S.L. & Williams, R.J. (2001). Dynamic scheduling of a system with two parallel servers in heavy traffic with complete resource pooling: Asymptotic optimality of a continuous review threshold policy. Annals of Applied Probability 11: 608–649.
Bramson, M. (1998). State space collapse with applications to heavy traffic limits for multiclass queueing networks. Queueing Systems 30: 89–148.
Chen, H. & Mandelbaum, A. (1991). Leontief systems, RBV's and RBM's. In M.H.A. Davis & R.J. Elliott (eds.), Applied stochastic analysis. Gordon and Breach Science, pp. 1–43.
Dai, J.G. & Prabhakar, B. (2000). The throughput of data switches with and without speedup. In Proceedings of INFOCOM 2000, pp. 556–564.
Ethier, S.N. & Kurtz, T.G. (1986). Markov processes: Characterization and convergence. New York: John Wiley & Sons.
Harrison, J.M. (1998). Heavy traffic analysis of a system with parallel servers: Asymptotic optimality of discrete review policies. Annals of Applied Probability 8: 822–848.
Harrison, J.M. & Lopez, M.J. (1999). Heavy traffic resource pooling in parallel-server systems. Queueing Systems 33: 339–368.
Kelly, F.P. & Laws, C.N. (1993). Dynamic routing in open queueing networks: Brownian models, cut constraints and resource pooling. Queueing Systems 13: 47–86.
Laws, C.N. (1992). Resource pooling in queueing networks with dynamic routing. Advances in Applied Probability 24: 699–726.
Mandelbaum, A. & Stolyar, A.L. (2004). Scheduling flexible servers with convex delay costs: Heavy-traffic optimality of the generalized cμ-rule. Operations Research 52(6): 836–855.
McKeown, N., Anantharam, V., & Walrand, J. (1996). Achieving 100% throughput in an input-queued switch. In Proceedings of INFOCOM'96, pp. 296–302.
Stolyar, A.L. (2004). MaxWeight scheduling in a generalized switch: State space collapse and workload minimization in heavy traffic. Annals of Applied Probability 14(1): 1–53.
Tassiulas, L. & Ephremides, A. (1992). Stability properties of constrained queueing systems and scheduling policies for maximum throughput in multihop radio networks. IEEE Transactions on Automatic Control 37: 1936–1948.
Teh, Y. & Ward, A. (2002). Critical thresholds for dynamic routing in queueing networks. Queueing Systems 42: 297–316.
Van Mieghem, J.A. (1995). Dynamic scheduling with convex delay costs: The generalized cμ rule. Annals of Applied Probability 5: 809–833.
Williams, R.J. (1998). An invariance principle for semimartingale reflecting Brownian motions in an orthant. Queueing Systems 30: 5–25.
Williams, R.J. (1998). Diffusion approximations for open multiclass queueing networks: Sufficient conditions involving state space collapse. Queueing Systems 30: 27–88.
Williams, R.J. (2000). On dynamic scheduling of a parallel server system with complete resource pooling. Fields Institute Communications 28: 49–71.