
OPTIMAL ROUTING IN OUTPUT-QUEUED FLEXIBLE SERVER SYSTEMS

Published online by Cambridge University Press:  23 March 2005

Alexander L. Stolyar
Affiliation:
Bell Labs, Lucent Technologies, Murray Hill, New Jersey 07974, stolyar@research.bell-labs.com

Abstract

We consider a queuing system with multitype customers and nonhomogeneous flexible servers, in the heavy traffic asymptotic regime and under a complete resource pooling (CRP) condition. For the input-queued (IQ) version of such a system (with customers being queued at the system “entrance,” one queue for each type), it was shown in the work of Mandelbaum and Stolyar that a simple parsimonious Gcμ scheduling rule is optimal in that it asymptotically minimizes the system customer workload and some strictly convex queuing costs. In this article, we consider a different—output-queued (OQ)—version of the model, where each arriving customer must be assigned to one of the servers immediately upon arrival. (This constraint can be interpreted as immediate routing of each customer to one of the “output queues,” one queue for each server.) Consequently, the space of controls allowed for an OQ system is a subset of that for the corresponding IQ system.

We introduce the MinDrift routing rule for OQ systems (which is as simple and parsimonious as Gcμ) and show that this rule, in conjunction with arbitrary work-conserving disciplines at the servers, has asymptotic optimality properties analogous to those the Gcμ rule has for IQ systems. A key element of the analysis is the notion of system server workload, which, in particular, majorizes the customer workload. We show that (1) the MinDrift rule asymptotically minimizes the server workload process among all OQ-system disciplines and (2) this minimal process matches the minimal possible customer workload process in the corresponding IQ system. As a corollary, MinDrift asymptotically minimizes customer workload among all disciplines in either the OQ or IQ system.

Type
Research Article
Copyright
© 2005 Cambridge University Press

1. INTRODUCTION

1.1. The Problem

We consider a queuing system with multiple customer flows (types) i = 1,…,I and nonhomogeneous flexible servers j = 1,…,J. This means that the mean service time μij−1 of a type i customer by server j depends on both the customer type and the server. We study the “heavy traffic” asymptotic regime, when the system is close to being “critically loaded,” and assume that a certain complete resource pooling (CRP) condition holds. Associated with the CRP condition is the notion of system workload, which in this article is called system customer workload.

The “input-queued” (IQ) version of the model (see [3,8,9,12,20]) is such that arriving customers are placed in “input” queues, one queue for each type i, where they await service without being preassigned to any particular server until they are actually “taken for service” by one of them. It is shown in [12] that an IQ system can be asymptotically optimally controlled in heavy traffic (and under the CRP condition) by a very simple and parsimonious generalized cμ (Gcμ) scheduling rule, which, in particular, minimizes customer workload among virtually all service disciplines.

In this article, we consider a different—“output-queued” (OQ)—version of the model, where each arriving customer must be assigned (or routed) to one of the servers immediately upon arrival. (This can be viewed as immediate routing of each arriving customer to one of the “output” queues, one for each server.) Such models arise in various applications, including wireless networks, manufacturing systems, and call centers. A wireless application example is a system in which data packets (“customers”) need to be delivered to multiple mobile users/destinations (which determine “customer types”) via a set of transmitters (“servers”); transmission (“service”) rates depend on the (different) channel qualities between different transmitters and users.

Due to the above immediate routing (IR) constraint, the space of controls allowed for an OQ system is a (strict) subset of that for the corresponding IQ system. (Gcμ is not a valid discipline for OQ systems.) A natural question is: “Is it possible to control an OQ system in heavy traffic as efficiently as the corresponding IQ system could be controlled? For example, are there OQ-system controls that are as parsimonious as Gcμ but still able to minimize the system customer workload?” The results of this article demonstrate that the answer to the latter question is “yes”—we introduce the MinDrift routing rule, which, in particular, minimizes the system customer workload in heavy traffic. We will describe our results shortly, after a brief literature review.

1.2. Previous Work

The IQ version of our system (with nonhomogeneous flexible servers) and the definition of the heavy traffic asymptotic regime for it (in terms of a certain linear program) were introduced in [8] (for a special two-server system) and in [9,20] (for the general setting). For the system in heavy traffic, these articles also define the CRP condition and the associated notion of customer workload, defined as X(t) = Σi νi*Qi(t), t ≥ 0, where Qi(t) is the type i queue length at time t and νi* > 0 is a fixed constant called the workload contribution of a type i customer. References [8,9] propose discrete-review scheduling policies, which, in heavy traffic and under the CRP condition, asymptotically minimize customer workload and linear holding costs. In [3,20], continuous-review threshold policies are proposed that are asymptotically optimal in the same sense, also under the CRP condition. (The asymptotic optimality proofs are given for a special two-server system.) The common feature of discrete-review and continuous-review threshold policies is that they require a priori knowledge of the flows' mean arrival rates λi.

In [12], it is proved that a very simple Gcμ scheduling rule (which, in particular, does not require knowledge of the arrival rates λi) asymptotically minimizes customer workload and strictly convex holding costs in a general IQ system under the CRP condition. Moreover, in the limit, the (appropriately rescaled) queue-length vector process (Q1(t),…,QI(t)) exhibits state space collapse (SSC): it “lives” on a one-dimensional manifold. (The results of [12] are closely related to earlier results in [14] for a discrete-time generalized switch model. Also, they generalize the earlier Gcμ optimality results for a single-server system [17].) Following [14], [12] provides an equivalent (geometric) characterization of the CRP condition, as follows. Let the system service rate region be, roughly, the set of all vectors representing feasible long-term average service rates that the system is capable of jointly providing to the different types. Then the vector ν* = (ν1*,…,νI*) of workload contributions is the unique (up to scaling) outer normal vector to the boundary of this region at the point λ = (λ1,…,λI).

Most of the previous work on OQ systems is concentrated on load balancing schemes for systems with homogeneous servers. Much less work has been done on a heavy traffic regime in systems with nonhomogeneous flexible servers. Probably the first was [10], where a two-server system is considered, resource pooling in heavy traffic is discussed, and threshold-based policies are proposed. (See also [11] for an earlier discussion of resource pooling in systems with routing.) In a recent article [16] a two-server system (different from that in [10]) with exponential service times is considered, and the asymptotic optimality of a threshold routing policy is proved, under linear holding costs. We refer the reader to [16] for a more extensive review of the previous work on OQ systems.

1.3. Our Results

In this article, we consider a general OQ system in a heavy traffic regime and under the CRP condition. First, we give a further equivalent characterization of the CRP condition, which is natural and convenient for the analysis of OQ systems; namely, we consider the server utilization region, which is the set of potential server utilization vectors that can be imposed by the input flows with mean rates λi or greater. The CRP condition then implies, in particular, the uniqueness (up to scaling) of the vector normal to this region at the boundary point (1,…,1). We call the components αj* > 0 of the vector α* = (α1*,…,αJ*), which is the opposite of the above-mentioned normal vector, the server workload contributions of the different servers, and we call the quantity uX(t) = Σj αj*Uj(t), where Uj(t) is the unfinished work of server j at time t, the system server workload. We establish relations between customer and server workload contributions, which show that the asymptotic relation X(t) ≤ uX(t) holds between customer workload and server workload.

We assume that a strictly convex increasing function Cj(·) is defined for each server j, which is interpreted as the cost rate incurred by the unfinished work on server j.

We introduce two versions of the MinDrift routing rule. MinDrift(U) assigns an arriving type i customer to a server

j ∈ arg mins∈J Cs′(Us(t))/μis.     (1)

(This version may not be practical in many cases, because it assumes exact knowledge of the unfinished work values Uj(t).) The MinDrift(Q) rule is the same as MinDrift(U), except Uj(t) in (1) is replaced by the Q-estimated unfinished work qUj(t) = Σm μmj−1Qmj(t) of server j, where Qij(t) denotes the number of type i customers in the server j queue. (MinDrift(Q) is a more practical version.)

Our main result (see Theorems 1 and 2) is that, in the OQ system in the heavy traffic asymptotic limit and under the CRP condition, the MinDrift rule (either version), in conjunction with virtually arbitrary work-conserving scheduling disciplines at the servers, minimizes (among all service disciplines) the server workload and the instantaneous and cumulative costs corresponding to the cost rate Σj Cj(Uj(t)). Moreover, in the limit, the (rescaled) unfinished work vector process (U1(t),…,UJ(t)) exhibits SSC such that the vector (C1′(U1(t)),…,CJ′(UJ(t))) is always proportional to α*. (This behavior is analogous to that exhibited by IQ systems in [12,14], but with the server unfinished works replacing “input”-queue lengths and server workload contributions replacing customer workload contributions.) In addition, the minimal server workload process (attained under MinDrift) matches the minimal possible customer workload process in the corresponding IQ system. As a corollary, MinDrift(Q) (asymptotically) minimizes customer workload among all disciplines in either the OQ or IQ system (see Theorem 3). In this sense, the MinDrift rule controls an OQ system as efficiently as the corresponding IQ system (allowing a wider class of disciplines) can possibly be controlled.

Essentially as another corollary of the main results, we obtain a necessary condition (Theorem 4) for any OQ-system service discipline to have an (asymptotic) workload minimization property. Using this condition, we demonstrate (in Section 13) that even some very natural service disciplines in OQ systems, known to guarantee stability of the queues (if such is feasible at all), do not minimize system workload in heavy traffic.

Another contribution of the article is that, in addition to the CRP condition, we in fact identify and characterize a weaker First-Order CRP (FO-CRP) condition. The purpose of doing this is twofold. First, FO-CRP is sufficient to establish convergence properties of fluid sample paths, which arise in the fluid limit asymptotic regime and (in addition to being an important step in proving the main heavy traffic results) are of independent interest. Second, this clarifies the role of the additional assumption, which strengthens FO-CRP to CRP, in proving the heavy traffic results.

Finally, we would like to point out that despite the fact that the technical development in this article is in many ways analogous to that in [12,14], some parts of it are quite different. In particular, the representation of the server workload process in Sections 9 and 10 (roughly, as a sum of the “driving” and “regulation” processes) is substantially different from the representation of customer workload processes in [12,14].

1.4. Outline of This Work

In Section 2, we set basic notations and conventions. The OQ-system model is formally introduced in Section 3. In Section 4, we define the (two versions of) MinDrift routing rule and discuss its basic intuition and some examples. The definition and characterization of FO-CRP and CRP conditions in terms of an IQ system are presented in Section 5. Section 6 gives an equivalent characterization of FO-CRP and CRP conditions in terms of an OQ system and establishes relations between customer and server workload contributions. The heavy traffic asymptotic regime is defined in Section 7, which also contains the definitions of and the relations between customer workload and server workload. Section 8 contains formulations of our main results (Theorems 1–4), described earlier. The analysis of fluid sample paths is the subject of Section 9. Section 10 contains the proof of Theorem 1, regarding the asymptotic optimality of the MinDrift(U) rule. The proof of Theorem 2 (regarding the MinDrift(Q) rule) can essentially be reduced to that of Theorem 1—a detailed outline of this reduction is given in Section 11. Theorem 4, a necessary condition for asymptotic workload minimization, is proved in Section 12. We conclude in Section 13 with a discussion of the relation between stability and heavy traffic optimality properties of service disciplines.

2. BASIC NOTATION AND CONVENTIONS

We use the standard notation R and R+ for the sets of real and nonnegative real numbers, respectively; the not quite standard R++ is used for the set of strictly positive real numbers. The corresponding N-times product spaces are denoted RN, R+N, and R++N. The space RN is viewed as a standard vector space, with elements x ∈ RN being row vectors x = (x1,…,xN). We write simply 0 for the zero vector in RN and 1 ≐ (1,1,…,1) for the vector with all unit coordinates. (The dimensions of the vectors 0 and 1 are either specified explicitly or are clear from the context.)

The scalar product (dot product) of x, y ∈ RN is x·y ≐ Σn xnyn, and the norm of x is ∥x∥ ≐ (x·x)1/2. Vector inequalities are to be understood componentwise. As an example, for γ, x ∈ RN, γ < x means γi < xi, i = 1,…,N. Also,

and if γ ∈ R++N, we slightly abuse notation by writing

We denote the minimum and maximum of two real numbers ξ1 and ξ2 by ξ1 ∧ ξ2 and ξ1 ∨ ξ2, respectively.

Let D([0,∞),R) be the standard Skorohod space of right-continuous left-limit (RCLL) functions, defined on [0,∞) and taking real values. (See, for example, [7] for the definition of this space and its associated topology and σ-algebra.)

The symbol →d denotes convergence in distribution of random processes (or other random elements) (i.e., weak convergence of their distributions). Typically, we consider convergence of processes in D([0,∞),R), or its N-times product space DN([0,∞),R) equipped with the product topology and σ-algebra.

The symbol → (or the abbreviation u.o.c. after a convergence statement) stands for convergence that is uniform on compact sets, for elements of D([0,∞),R) or its N-times product DN([0,∞),R). For functions with a bounded domain A ⊂ R, u.o.c. convergence means uniform convergence.

We reserve the symbol ⇒ for weak convergence of elements in the space D([0,∞),R̄); the latter is the space of RCLL functions taking values in the set R̄ of real numbers extended to include the two “infinite numbers” +∞ and −∞ (with the natural topology on R̄). If h, g ∈ D([0,∞),R̄), then h ⇒ g means h(t) → g(t) for every t > 0 at which g is continuous. (Convergence at t = 0 is not required.) We will not need any characterization of the topology on D([0,∞),R̄) beyond the definition of convergence given earlier.

3. THE MODEL

We consider a queuing system with a finite number I of customer types and a finite number J of flexible servers. For notational convenience, we use the symbol I also for the set of types {1,…,I}. Similarly, J also denotes the set of servers {1,…,J}.

The arrival process for each type iI is a renewal process with the time (from the initial time 0) until the first arrival being ui(0), and the rest of the interarrival times being an independent and identically distributed (i.i.d.) sequence ui(n),n = 1,2,…. Let λi = 1/E [ui(1)] > 0 denote the arrival rate for type i and αi2 = Var[ui(1)]. The service times of type i customers by server jJ form an i.i.d. sequence vij(n),n = 1,2,…. Let μij = 1/E [vij(1)] < ∞ and βi2 = Var[vij(1)]. The convention μij = 0 is used when server j cannot serve type i. All arrival and service processes are assumed mutually independent.

A version of such a flexible (parallel) server model, which received most attention in the previous work (see [3,8,9,12,20] and references therein) is the input-queued model. In the input-queued model, customers of each type i that await service are waiting in a separate “input” queue i of infinite capacity. This, in particular, means that customers do not have to be assigned to the servers while waiting in the input queue; such server assignment is (irreversibly) done when the customer is “pulled” for service by one of the servers.

In this article, we concentrate on a different—output-queued (OQ)—model, satisfying the following (additional) immediate routing (IR) condition:

Each new customer arriving in the system must be assigned to one of the servers j immediately upon arrival, and after that, the customer can only be served by the server to which it is assigned.

A natural way to interpret the IR condition (and this interpretation is in fact the main motivation for the OQ model) is that, upon arrival, each new customer must be routed to one of the servers or, more precisely, into the “output” queue associated with (or “located at”) that server.

Remark 1: It should be clear that the IR condition defines the only difference between an OQ system and the corresponding IQ system. Therefore, in general, the class of controls (or service disciplines) for an OQ system is a strict subset of that for the corresponding IQ system. For example, the Gcμ discipline for IQ systems, studied in [12], does not satisfy the IR condition and, consequently, is not a valid discipline for OQ systems.

A service discipline in an output-queued system consists of two parts: a routing (server assignment) algorithm and a scheduling rule employed by each server (and, generally speaking, depending on the server) to determine which customer to serve from its queue (i.e., among the customers assigned to it).

We will consider the class of service disciplines satisfying (in addition to IR) the following condition on the routing algorithm:

(d0) The realizations of a customer's service requirements are not known at the time when routing decision (server assignment) for this customer is made. (The distributions of the service requirements at different servers are known.)

Sometimes, but not always, we will further restrict the class of service disciplines by imposing the following conditions on the server scheduling rules:

(d1) The scheduling rule of each server is nonpreemptive within each customer type; namely, a server cannot take for service a new customer of type i if it already has another type i customer “in service” (with both elapsed and residual service times being nonzero). Consequently, at any given time, a server cannot have in service more than one customer of any given type.

(d2) A server does not “know” the realization of a customer service time before the customer service starts.

Note that conditions (d1) and (d2) do allow a server idling (even if it has customers in service) or preemption of service of one customer by another customer but of a different type. They also allow server-sharing by several customers but, again, each of a different type.

Remark 2: Note that the class of IQ-system service disciplines, satisfying conditions (d0)–(d2) (but not IR), is well defined, and it, first, belongs to the class of disciplines studied in [12] and, second, contains the Gcμ discipline (see [12]). Moreover, this class is obviously a superset of the above-defined class of OQ-system disciplines satisfying (d0)–(d2) and IR.

4. THE MinDrift ROUTING RULE

4.1. Notation

Let Uj(t) denote the (unfinished) work of server j at time t, namely the total amount of unfinished processing time of all customers of all types present in the server j queue at time t. We denote by Qij(t) the number of type i customers in queue j at time t, including those customers whose service has already started but is not yet completed. The quantity qUj(t) ≐ Σm μmj−1Qmj(t) we will call the Q-estimated (unfinished) work of server j. Finally, by Qi(t) we will denote the total number of type i customers in the system at time t. In the OQ system, we always have Qi(t) = Σj Qij(t), but we note that Qi(t) is well defined for both the OQ and IQ systems.

4.2. MinDrift Rule Definition

Suppose that for each server j, a cost function Cj(ζ), ζ ≥ 0, is given. Assume that the cost functions have the following properties:

Cj(·) is continuous strictly increasing convex, with Cj(0) = 0.

Moreover, the first derivative Cj′(·) is continuous strictly increasing, with Cj′(0) = 0.

Finally, the second derivative Cj′′(·) is strictly positive continuous in the open interval (0,∞), with Cj′′(0) = limζ↓0 Cj′′(ζ) ≥ 0, where Cj′′(0) is either finite or is +∞.

The MinDrift rule routes (assigns) customers to the servers as follows. When a new customer of type i arrives in the system, it is routed to a server j such that

j ∈ arg mins∈J Cs′(Us(t))/μis.     (2)

Ties are broken arbitrarily; for example, in favor of the smallest index j. Also, by convention, a type i customer can never be routed to a server j if μij = 0. (Throughout this article, we also adopt a related convention that any expression involving division by μij holds under the additional assumption that μij > 0, even if we do not state this explicitly.)

The rule defined by (2) is the basic version of the MinDrift rule; we will refer to it as the MinDrift(U) rule.

A version of the MinDrift rule, with Uj in (2) replaced by qUj, will be called the MinDrift(Q) rule; namely, the MinDrift(Q) rule routes an arriving type i customer to a server j such that

j ∈ arg mins∈J Cs′(qUs(t))/μis.     (3)
4.3. Nature of the MinDrift Rule

The nature of the MinDrift rule is simple—it “myopically” (or “greedily”) tries to minimize the average drift of the aggregate cost function Σj Cj(Uj(t)). Indeed, Cj′(Uj(t))/μij (see (2)) approximates the expected increment of the aggregate cost function caused by routing one type i customer (arriving at time t) to server j; therefore, by (2), MinDrift(U) routes new arrivals so that the (approximate) expected increment of Σj Cj(Uj(t)) is minimized. In other words, MinDrift(U) routing tries to minimize the average rate of increase of Σj Cj(Uj(t)) due to the placement of new work (or load) on the servers. Note that in the OQ system, the “best” a service discipline can do to maximize the rate at which Σj Cj(Uj(t)) is decremented due to processing of the unfinished work is to never idle servers when they have work to do. Thus, the MinDrift(U) routing rule (in conjunction with arbitrary work-conserving scheduling rules at the servers) strives to minimize the average drift of the aggregate cost.
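To spell out the approximation behind this drift heuristic: routing the arriving type i customer to server j increases Uj(t) by the (random) service time vij, so a first-order expansion gives

E[Cj(Uj(t) + vij) − Cj(Uj(t))] ≈ Cj′(Uj(t)) E[vij] = Cj′(Uj(t))/μij,

which is exactly the quantity minimized in (2); the approximation is accurate when individual service times are small relative to the unfinished work Uj(t), as is the case in the heavy traffic regime studied here.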

The MinDrift(Q) rule is based on the same principle as MinDrift(U), except that instead of using the exact values Uj of unfinished work (which may not be available in many applications), it uses their (estimated) average values qUj (which may be more readily available). As we will demonstrate, in the heavy traffic asymptotic regime we consider, the two versions MinDrift(U) and MinDrift(Q) of the rule are in a certain sense “indistinguishable,” under nonrestrictive additional conditions.
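As an illustration, here is a minimal sketch (in Python, with purely illustrative names and interface) of how the routing decision (2)/(3) could be computed; passing the exact unfinished work values gives MinDrift(U), and passing the Q-estimated values gives MinDrift(Q):

def mindrift_route(i, mu, cost_derivs, workload):
    """Pick a server for an arriving type-i customer per the MinDrift rule.

    i           -- customer type index
    mu          -- mu[i][j]: mean service rate of type i at server j (0 if infeasible)
    cost_derivs -- cost_derivs[j](u): derivative C_j'(u) of server j's cost function
    workload    -- workload[j]: U_j(t) for MinDrift(U), or qU_j(t) for MinDrift(Q)
    """
    best_j, best_val = None, float("inf")
    for j, rate in enumerate(mu[i]):
        if rate == 0:                                # server j cannot serve type i
            continue
        val = cost_derivs[j](workload[j]) / rate     # approximate expected cost increment
        if val < best_val:                           # ties broken in favor of the smallest index j
            best_j, best_val = j, val
    return best_j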

The Gcμ scheduling rule for IQ systems, studied in [12], is the rule that myopically tries to minimize the drift of the aggregate cost function Σi Ci(Qi(t)) of the queue lengths Qi, with Ci(·) being cost functions having the same properties as the functions Cj(·) defined earlier. Consequently, Gcμ is such that a server j always tries to serve a queue

i ∈ arg maxm∈I Cm′(Qm(t))μmj,

thus maximizing the average rate at which Σi Ci(Qi(t)) is decreased due to departures of served customers. The Gcμ rule does not—and cannot—exercise any control over the rate of increase of Σi Ci(Qi(t)) due to new arrivals. Therefore, although both Gcμ (in an IQ system) and MinDrift (in an OQ system) strive to minimize drifts of certain cost functions, they differ in that they control different system state variables: Gcμ controls the rates at which the queue lengths Qi are depleted due to service, and MinDrift controls the rates at which the unfinished works Uj are increased due to new arrivals.

4.4. Examples

Consider a special case when the cost functions have the form Cj(ζ) = γjζη+1, with some fixed η > 0 and γj > 0. Then MinDrift(U) becomes the rule routing an arriving type i customer to a server

j ∈ arg mins∈J γs(Us(t))η/μis,     (4)

and similarly for MinDrift(Q) (with Us replaced by qUs).

Consider a special case of the system such that, for any given pair (i, j), we have either μij = 0 or μij = μj > 0. In other words, the system is such that each server j has a (j-dependent) subset of types i that it can serve, but the average service rates of all types within this subset are the same and equal to μj (and the server cannot serve at all any types i outside the subset). For this system, the MinDrift(Q) version of (4), with η = 1, becomes

j ∈ arg mins γs(Σm Qms(t))/μs2,     (5)

where the minimum is over servers s that can serve type i.
Since the parameters γj > 0 can be set arbitrarily, we see from (5) that, for this special system, such “popular” routing rules as “Join a server j with the shortest queue,”

j ∈ arg mins Σm Qms(t),     (6)

and “Join a server j with the smallest expected unfinished work,”

j ∈ arg mins (Σm Qms(t))/μs,     (7)

are special cases of MinDrift(Q); (6) is obtained by setting γs = μs2 and (7) by setting γs = μs. (We remind the reader that, in both cases, routing customers to servers where they cannot be served at all is prohibited. Also, note that if we further assume that each server employs first-in-first-out (FIFO) scheduling, then rule (7) is equivalent to the “Join the shortest expected delay” routing rule.)

It should be clear that in a general system, where service rates μij depend on both i and j more generally than in the special system described earlier, the Join-shortest-queue and Join-smallest-expected-unfinished-work routing rules are not special cases of MinDrift. Consequently, the heavy traffic optimality properties (which we prove in this article for MinDrift) may not (and typically do not) hold for these rules.
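For concreteness, a small numerical illustration of this point (hypothetical numbers, reusing the mindrift_route sketch above): with heterogeneous rates, MinDrift may route to the server with the larger backlog because that server serves the type much faster.

# Two servers; type 0 is served at rate 1 by server 0 and at rate 5 by server 1.
mu = [[1.0, 5.0]]
# Quadratic costs C_j(u) = u^2, so C_j'(u) = 2u.
cost_derivs = [lambda u: 2 * u, lambda u: 2 * u]
# Server 1 currently holds more unfinished work than server 0.
workload = [2.0, 4.0]

# Joining the server with the smallest unfinished work would pick server 0;
# MinDrift compares 2*2/1 = 4.0 against 2*4/5 = 1.6 and therefore picks server 1.
print(mindrift_route(0, mu, cost_derivs, workload))   # -> 1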

5. COMPLETE RESOURCE POOLING CONDITION

In this section, we present the definition of the complete resource pooling (CRP) condition and related notions and results, which are “oriented toward” the analysis of the IQ model and basically follow those in [12]. However, the development in this section is more general than that in Section 5 of [12]. In particular, we consider the notion of First-Order-CRP (FO-CRP) (which is a weaker form of CRP) and prove some additional properties related to this notion. (The results of this section provide a “reference point” for the next section, which gives an equivalent characterization of FO-CRP and CRP conditions “in terms of” the OQ model.)

Consider an I × J matrix φ = {φij, i ∈ I, j ∈ J}, with all φij ≥ 0. Each element φij can be interpreted as the average rate at which server j's time is allocated to the service of type i customers, in the long run. We do not call the elements φij “fractions” of server j's time because, for reasons that will become clear in Section 6, it will be convenient for us not to assume a priori that Σi φij ≤ 1, or even that φij ≤ 1.

With a given φ we associate the vector μ(φ) = (μ1(φ),…,μI(φ)), whose coordinates are

μi(φ) ≐ Σj∈J μijφij,  i ∈ I;

this is the vector of mean long-run service rates provided to the types i ∈ I if each server j allocates its time to serving type i at the average rate φij.

Consider also a different vector-function of a matrix φ; namely, let the vector ρ(φ) = (ρ1(φ),…,ρJ(φ)) ∈ RJ be defined as

ρj(φ) ≐ Σi∈I φij,  j ∈ J.

Each component ρj(φ) is naturally interpreted as the total “utilization” of server j, given that the average rates at which its time is allocated to the service of the different types i are given by φij. (Again, we do not assume a priori that ρj(φ) ≤ 1.)

Definition 1: We define the system service rate region to be the set of all vectors μ(φ) corresponding to matrices φ ≥ 0 satisfying the condition

ρ(φ) ≤ 1.     (10)

Further, consider the set of all maximal elements μ of the service rate region such that μ ∈ R++I. (An element μ of the rate region is maximal if μ ≤ ζ, with ζ in the rate region, implies ζ = μ.)

Note that the service rate region is a bounded convex polyhedron in R+I. We assume that it is nondegenerate (i.e., has dimension I), which is equivalent to assuming that each customer type i can be served at a nonzero rate μij by at least one server j. The service rate region is in fact the closure of our system's stability region, which is the set of arrival rate vectors λ = (λ1,…,λI) such that λ < μ(φ) for some φ satisfying (10) (cf. [1,2,6,13,14,15]).

Definition 2: We say that the condition of FO-CRP holds for a vector λ if λ lies within the interior of one of the ((I − 1)-dimensional) outer faces of the service rate region and λ belongs to the set of maximal elements defined above. If, in addition, the matrix φ such that λ = μ(φ) and (10) hold is unique, then we say that the CRP condition holds.

Remark: The CRP condition given above is the same as in [12], and it is equivalent to that introduced earlier in [9,20] (see Assumption 3.4, Thm. 5.3 and Cor. 5.4 in [20] for a summary). The fact of equivalence will, in particular, follow from the results of this section.

When the FO-CRP condition holds, let us denote by ν = (ν1,…,νI) the (unique up to a scaling) “outer” normal vector to the service rate region at the point λ. Note that ν ∈ R++I. (Otherwise, if some νi ≤ 0, a small increase of the component λi would produce a vector λ′ ≥ λ, λ′ ≠ λ, still lying in the service rate region—a contradiction to the maximality of λ.) For concreteness, we use the normal vector ν*, which is the vector defined uniquely by the additional requirement that ∥ν*∥ = 1. The components νi* of the vector ν* are sometimes called the workload contributions of customers of the different types i (see [9,12,20]); in this article, we will call them customer workload contributions, to make a distinction from the server workload contributions introduced in Section 6.

The FO-CRP condition for λ implies, in particular, that

ν*·λ = max {ν*·μ(φ) : φ ≥ 0, ρ(φ) ≤ 1};     (11)

this in turn implies that any matrix φ such that (10) holds and λ = μ(φ) must in fact satisfy (10) with equality; namely, we have

μ(φ) = λ,  ρ(φ) = 1.     (12)

Given λ satisfying the FO-CRP condition, for each j ∈ J let us denote

Ij ≐ {i ∈ I : μijνi* = maxm∈I μmjνm*}.

Any pair (i, j) such that i ∈ Ij is called a basic activity; therefore, Ij indicates the set of basic activities for server j. It is easy to see from (11) that, for any φ satisfying (12), φij > 0 implies i ∈ Ij.

Lemma 1: If λ satisfies the FO-CRP condition, then the corresponding graph with nodes being the types i and the servers j and with arcs being the basic activities (i, j) is connected.

Proof: Suppose not. Consider any breakdown of the graph into two components disconnected from each other. For m = 1, 2, denote by I(m) and J(m) the sets of types and servers, respectively, within the m-th component. By our construction, for any m = 1, 2, j ∈ J(m) implies Ij ⊆ I(m).

Consider any φ satisfying (12). Recall that φij > 0 implies i ∈ Ij. Let us fix a small δ > 0 and consider the vector ν** obtained from ν* by the following modification: νi** = νi*(1 + δ) if i ∈ I(1), and νi** = νi* if i ∈ I(2). Since there is no arc connecting the two components, if δ is small enough, then φ solves the problem in the right-hand side (RHS) of (11) with ν* replaced by ν**, as well as the problem in the RHS of (11) itself. In other words, λ = μ(φ) maximizes both ν**·μ and ν*·μ over the service rate region. This means that ν** is a normal (different from ν*) to the boundary of the service rate region at point λ—a contradiction to the FO-CRP condition. █

Now, with any matrix φ let us associate the graph with nodes being the types i and the servers j, and with arcs corresponding to the pairs (i, j) with φij > 0; we will refer to it as the graph of φ.

Lemma 2: Suppose λ ∈ R++I is a maximal element of the service rate region. Then:

(i) If λ satisfies the FO-CRP condition, then there exists a matrix φ satisfying (12) and such that φij > 0 if and only if i ∈ Ij.

(ii) λ satisfies the FO-CRP condition if and only if there exists a matrix φ satisfying (12) whose graph is connected.

(iii) If λ satisfies the CRP condition, then the unique matrix φ satisfying (12) is such that its graph coincides with the graph of basic activities, is connected, and, moreover, is a tree.
Proof:

(i) Consider an arbitrary φ′ satisfying ρ(φ′) = 1 and such that φij′ > 0 if and only if i ∈ Ij. Note that ν*·μ(φ′) = ν*·λ, because the condition on φ′ guarantees that φ′ solves the problem in the RHS of (11). Let

λ′′ ≐ (λ − δμ(φ′))/(1 − δ),

where 0 < δ < 1 is fixed. We have ν*·λ′′ = ν*·λ, and we can always choose δ to be small enough so that λ′′ lies in the interior of the same face of the service rate region as λ does. Then there exists φ′′ such that λ′′ = μ(φ′′), ρ(φ′′) = 1, and φij′′ > 0 implies i ∈ Ij. It is easy to verify directly that φ = (1 − δ)φ′′ + δφ′ is a matrix with the properties we seek.

(ii) Necessity follows from (i). Let us prove sufficiency. Let φ be a matrix such that (12) holds and the graph of φ is connected. Since λ is a maximal element of the service rate region, there exists an outer normal vector ν* to the boundary of the service rate region at point λ, and it is such that ν* ≠ 0, ν* ∈ R+I. Consequently, φ solves the problem (in the RHS of) (11). From this, we conclude that ν* ∈ R++I; otherwise (if νi* = 0 for some i), φ could not be a solution to (11), since the graph of φ is connected. Finally, the normal ν* must be unique (up to scaling). Indeed, consider any other vector ν** ∈ R++I that is not a scaled version of ν*; that is, maxi νi**/νi* > mini νi**/νi*. Then the connectedness of the graph of φ easily implies that since φ solves (11), it cannot possibly solve (11) with ν* replaced by ν**. Therefore, ν** cannot be a normal to the service rate region at point λ.

(iii) The definition of CRP and statements (i) and (ii) of the lemma immediately imply the uniqueness of φ satisfying (12), the fact that the graph of φ coincides with the graph of basic activities, and the fact that this graph is connected. It remains to show that this graph must be a tree. Suppose not. Let us pick any cycle in this graph. It is easy to see that we can “perturb” the (strictly positive) elements φij along the arcs of the cycle so as to produce a matrix φ′ ≠ φ such that μ(φ′) = μ(φ) = λ and ρ(φ′) = 1, a contradiction to the uniqueness of φ. █

Remark: Just as in the case of the Gcμ scheduling rule, studied for an IQ model in [12], we emphasize here that the notion of a basic activity is not utilized in any way (neither explicit nor implicit) by the MinDrift routing algorithm. (The algorithm need not know which activities are basic.) It is only used as a tool for the analysis of the algorithm. Similarly, the algorithm need not know the values of workload contributions.

6. EQUIVALENT CHARACTERIZATION OF THE CRP CONDITION IN TERMS OF THE OQ MODEL

In this section, we give an equivalent (“dual”) characterization of the CRP condition and introduce notions and results that will be used in the analysis of our OQ-model.

First, let us give a somewhat different (although closely related) interpretation of the matrix φ and the functions μ(φ) and ρ(φ) defined in Section 5. Suppose a matrix φ is given, and assume that customers of type i arrive (are routed) to server j at the average rate μijφij. Then μi(φ), i ∈ I, is the total average rate at which type i customers arrive in the system, and ρj(φ) = Σi (μijφij)μij−1 = Σi φij is the average rate at which work (i.e., the required amount of processing time) arrives (is routed) to server j. (We use the convention that (μijφij)μij−1 = 0 if μij = 0.)

Definition 3: We define the server utilization region to be the set of all possible values of ρ(φ) with φ ≥ 0 satisfying the condition

μ(φ) ≥ λ.     (13)

Further, consider the set of all minimal elements ρ of the server utilization region such that ρ ∈ R++J. (An element ρ of a set K is minimal if ρ ≥ ζ ∈ K implies ζ = ρ.)

The server utilization region is a convex polyhedron in R+J, and it is nondegenerate (i.e., has dimension J) as long as the service rate region is nondegenerate. Note that the server utilization region is unbounded, but it is, of course, “bounded below,” say by 0, since it lies in the positive orthant.

Lemma 3:

(i) The FO-CRP condition for a fixed vector λ holds if and only if the following is true:

(a) Vector 1 ∈ RJ lies within the interior of one of the ((J − 1)-dimensional) faces of the server utilization region.

(b) Vector 1 belongs to the set of minimal elements of the server utilization region.

(ii) When the FO-CRP condition for λ (or, equivalently, (i)(a) and (i)(b)) holds, then the unique (up to a scaling by a positive constant) outer normal vector −α* to the server utilization region at the point 1 is such that α* = (α1*,…,αJ*) ∈ R++J, and α* is related to the vector ν* as follows:

αj* = maxi∈I μijνi*,  j ∈ J,     (14)

νi* = minj∈J αj*/μij,  i ∈ I.     (15)

In addition:

(c) i ∈ Ij (i.e., activity (i, j) is basic) if and only if j ∈ Ji, where

Ji ≐ {j ∈ J : αj* = μijνi*}.     (16)

(d) Any matrix φ satisfying ρ(φ) = 1 and (13), in fact, satisfies (12).

(e) We have

ν*·λ = α*·1 = Σj∈J αj*.     (17)

(iii) The CRP condition for a fixed vector λ holds if and only if (i)(a), (i)(b), and the following property hold:

(f) the matrix φ satisfying ρ(φ) = 1 and (13) is unique.

When CRP does hold, this matrix φ is in fact the unique solution of (12).

Proof:

(i) Let us prove the necessity of (a) and (b). Consider the vector α* defined by (14). (Note that α* ∈ R++J.) Then (15) holds. Indeed, for a fixed i, we have αj* = μijνi* if (i, j) is basic, and αj* > μijνi* otherwise. (Incidentally, this means that i ∈ Ij is equivalent to j ∈ Ji.)

Let us choose any matrix φ that solves (12) and such that φij > 0 if and only if i ∈ Ij. (Recall that the graph of basic activities is connected.) Then, since this φ is such that φij > 0 implies j ∈ Ji, it is easy to observe that φ solves the problem

min {α*·ρ(φ′) : φ′ ≥ 0, μ(φ′) ≥ λ}     (18)

or, equivalently, 1 solves the problem

min {α*·ρ : ρ in the server utilization region}.     (19)

(Compare problems (18) and (19) to problem (11).) This means, in particular, that 1 is a minimal element of the server utilization region, and α* is a normal to the region at point 1. Since the graph of basic activities is connected, it is easy to see from (18) that ρ = 1 could not possibly solve problem (19) with α* replaced by any other nonzero vector α** ∈ R+J, unless α** = cα* for some c > 0. This completes the proof of the necessity of (a) and (b).

The sufficiency of (a) and (b) follows simply from the symmetry between the definition of the FO-CRP condition (for λ) and conditions (a) and (b) (expressed in terms of the vector 1); namely, if for the vector −1 and the region obtained by negating the server utilization region we define a condition analogous to FO-CRP for λ and the service rate region, this will be exactly conditions (a) and (b). Thus, starting from conditions (a) and (b), we can obtain a condition analogous to (a) and (b), but with the vector 1 and the server utilization region replaced by λ and the service rate region, respectively; this latter condition is exactly our original FO-CRP.

(ii) As part of the proof of (i), we have already proved all of the properties stated in (ii), except (d) and (e). If (d) did not hold, then λ could not be a maximal element of the service rate region. Property (e) follows from the fact that any matrix φ, chosen as in the proof of (i), satisfies (12) and solves both problems (11) and (18).

(iii) This follows from (i), (ii), and the definition of CRP. █

When the FO-CRP condition holds, we will call the components αj* of the vector α* the server workload contributions of the different servers j.

7. HEAVY TRAFFIC REGIME

In this section, we introduce the notion of a sequence of queuing systems in heavy traffic. Suppose a vector λ satisfying the CRP condition is fixed. Associated with this λ are the unique matrix φ satisfying (12) and the corresponding normal vectors ν* ∈ RI and α* ∈ RJ. (We remark that all of the definitions and facts in this section are valid when the weaker FO-CRP condition holds, with arbitrary fixed φ satisfying (12). However, for our main results in Section 8, the stronger CRP condition is essential.)

The quantity X(t) ≐ Σi∈I νi*Qi(t) we will call the customer workload of the system. The customer workload process X(·) is of primary interest in the analysis of the IQ model [3,8,9,12,20].

For the OQ model, we define a different (although closely related, in the sense specified later) notion of server workload:

uX(t) ≐ Σj∈J αj*Uj(t).

In addition, we define the Q-estimated server workload as

qX(t) ≐ Σj∈J αj* qUj(t).

Informally speaking, for a service discipline satisfying constraints (d0)–(d2), qX(t) is a “good” (asymptotically exact) estimate of the server workload uX(t).

Since for any pair of i ∈ I and j ∈ J the inequality αj*/μij ≥ νi* holds, we observe that the Q-estimated server workload cannot be less than the customer workload:

X(t) ≤ qX(t).     (20)
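For the record, the chain behind (20), using only the definitions of X and qX and the inequality αj*/μij ≥ νi*, is

X(t) = Σi νi*Qi(t) = Σi νi* Σj Qij(t) ≤ Σj Σi (αj*/μij)Qij(t) = Σj αj* qUj(t) = qX(t).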

We also have the following inequality, which we record for future reference:

with

We now consider a sequence of queuing systems, indexed by the values r = rn, n = 1, 2,…, where rn > 0 for all n and rn ↑ ∞ as n → ∞. (Hereafter in this article, r → ∞ means that r goes to infinity along this sequence or some subsequence of it; the choice of the subsequence will be either explicit or clear from the context.) Each system with index r has, as earlier, I customer types and J servers. The primitives and the processes corresponding to the system with index r will be appended with a superscript r.

Assume that for each type i, the mean arrival rate λir = 1/E [uir(1)] is such that

r(λir − λi) → bi  as r → ∞,     (23)

where bi ∈ R is a fixed constant. Assume also convergence of the variances; that is,

Var[uir(1)] → αi2  as r → ∞.     (24)

In addition, we make the following technical assumption, needed, in particular, to apply Bramson's weak law estimates [4] (and establish (75) later): Uniformly over i and r,

E[(uir(1))2 I{uir(1) > x}] ≤ η(x),  x ≥ 0,     (25)

where η(·) is a fixed function and η(x) → 0 as x → ∞.

For the initial interarrival times, we assume that for each i,

uir(0)/r → 0  as r → ∞.

Let Fir(t), t ≥ 0, denote the number of type i customers that arrived in the system by time t, excluding “initial” customers present at time 0. Assumptions (23), (24), and (25) imply a functional central limit theorem (FCLT) for these arrival processes:

(1/r)(Fir(r2t) − λirr2t, t ≥ 0) →d σiB(·),     (26)

where σi2 = λi3αi2, B(·) is a standard (zero drift, unit variance) Brownian motion, and →d denotes convergence in distribution (for processes in the standard Skorohod space of RCLL functions).

The service time distributions do not change with the parameter r. (This, in particular, means that the condition analogous to (25) trivially holds for the service times vijr(1), uniformly in (i, j) and r.) Let us denote by

ΣVijr(l) ≐ Σn=1,…,l vijr(n),  l = 0, 1, 2,…,

the total amount of work (i.e., total service time) brought to server j by the first l (newly arriving) type i customers routed to it. We extend the domain of ΣVijr(·) to all real nonnegative t ≥ 0 by adopting the convention that ΣVijr(t) = ΣVijr(⌊t⌋). (We will use the same domain-extension convention throughout the article for other functions that are originally defined on the integers as well.) For ΣVijr, we have the following FCLT:

For each i ∈ I, let us fix a set of integer-valued, nondecreasing, nonnegative functions (sNij(l), l = 0, 1, 2,…), j ∈ J, satisfying the following conditions:

The value of sNij(l) is interpreted as the number of type i customers routed to server j, out of the first l type i customers arriving in the system. Then it is clear that for each flow i, the functions sNij(·) define a fixed (“static”) pattern of routing customers to the servers in Ji, such that, for any l, the fractions of customers routed to the different servers j ∈ Ji closely track the values μijφij/λi; recall that Σj μijφij/λi = 1. (The MinDrift rule does not require any knowledge of this static routing pattern; it is only used as a tool for the analysis!) Such functions can, for example, be constructed recursively as follows. We set sNij(0) = 0 for all j ∈ Ji. For each l = 1, 2,…, we set sNij(l) = sNij(l − 1) + 1 for one of the j ∈ Ji with the smallest value of sNij(l − 1) − (μijφij/λi)l, and sNij(l) = sNij(l − 1) for all other j. (The “ties” between the j's are broken arbitrarily, for example, in favor of the smallest one.)
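One way to write this recursive construction as a short routine (illustrative only; the function name and interface, a dictionary of target fractions μijφij/λi over the servers in Ji and a horizon L, are hypothetical):

def static_routing_pattern(fractions, L):
    """Build sN[j][l] for l = 0..L, where fractions[j] = mu_ij * phi_ij / lambda_i
    are the target routing fractions over the servers j in J_i (they sum to 1)."""
    servers = list(fractions.keys())
    sN = {j: [0] * (L + 1) for j in servers}
    for l in range(1, L + 1):
        # Route customer l to the server currently lagging its target share the most;
        # ties are broken in favor of the smallest server index.
        j_star = min(servers, key=lambda j: (sN[j][l - 1] - fractions[j] * l, j))
        for j in servers:
            sN[j][l] = sN[j][l - 1] + (1 if j == j_star else 0)
    return sN

# Example: two servers with target fractions 2/3 and 1/3.
pattern = static_routing_pattern({0: 2/3, 1: 1/3}, L=6)
# pattern[0] -> [0, 1, 1, 2, 3, 3, 4], pattern[1] -> [0, 0, 1, 1, 1, 2, 2]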

For i ∈ I, let us denote by

Air(t) ≐ Σj∈J αj* ΣVijr(sNij(Fir(t)))

the total amount of server workload brought to the system by the new arrivals of flow i by time t ≥ 0, assuming the arrivals were routed according to the (fixed) functions sNij(·). From (26)–(30) we obtain the following FCLT for the sequence of processes Air(·):

where

From (31) and (17) we obtain the following FCLT for the sequence of processes Σi Air(·):

where

8. MAIN RESULTS

For each value of the (scaling) parameter r, let us consider the following processes. Let Ur(·) and qUr(·) be the (vector) processes representing the (unfinished) work and the Q-estimated (unfinished) work, respectively, at the different servers; let uXr(·), qXr(·), and Xr(·) denote the scalar processes representing the server workload, the Q-estimated server workload, and the customer workload, respectively.

Assume that in a system with index r, each server j, at any time t, incurs a holding cost at the (instantaneous) rate of

Cj(Ujr(t)/r),

where the Cj(·) are fixed convex increasing functions, with the additional properties described in Section 4.

Note that in our asymptotic regime the cost function is “rescaled” as the parameter r changes. (In other words, in a system with index r, the holding cost rate corresponding to the unfinished work Ujr(t) is Cj(Ujr(t)/r) instead of Cj(Ujr(t)).) Notice, however, that in the special case (described in Section 4.4) when Cj(ζ) = γjζη+1, with some fixed η > 0 and γj > 0, the form of the corresponding MinDrift rule does not change with r. Indeed, in this case, replacing Cj′(Ujr(t)) in (2) with Cj′(Ujr(t)/r)/r simply does not change the routing rule.
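For the power costs, this invariance is a one-line check: with Cj(ζ) = γjζη+1,

Cj′(Ujr(t)/r)/r = (η + 1)γj(Ujr(t))η/rη+1,

and the factor r−(η+1) is common to all servers j, so the arg min in (2) is unchanged.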

For our main results, we need the notion of a fixed point. A vector u ∈ R+J will be called a fixed point if

Cj′(uj) = cαj*,  j ∈ J,     (35)

for some constant c ≥ 0. If we recall that each derivative Cj′(·) is continuous and strictly increasing with Cj′(0) = 0, one deduces the following:

A fixed point u corresponding to each c ≥ 0 exists and is unique. Moreover, u = 0 for c = 0, and u ∈ R++J (i.e., u has all components strictly positive) for any c > 0.

Thus, the set of fixed points forms a one-dimensional manifold, which can be parameterized, for example, by the corresponding server workload values α*·u. In addition, it is easy to verify the following property:

A fixed point u is the unique vector that minimizes Σj Cj(u′j) among all vectors u′ ∈ R+J with the same server workload (i.e., satisfying the condition α*·u′ = α*·u).

Indeed, if u = 0, the property is trivial. If u ∈ R++J, condition (35) implies that the (Lagrangian) function

L(u′) ≐ Σj Cj(u′j) − cα*·u′

has zero gradient (with respect to u′) at the point u′ = u. Since this Lagrangian is strictly convex on R+J, it is minimized by u. Then the desired property follows from the Kuhn–Tucker theorem.
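For example, for the power costs Cj(ζ) = γjζη+1 of Section 4.4, condition (35) reads (η + 1)γjujη = cαj*, so the fixed point corresponding to c ≥ 0 is given explicitly by

uj = (cαj*/((η + 1)γj))1/η,  j ∈ J,

and the manifold of fixed points is the curve through 0 traced out as c varies (equivalently, as the server workload α*·u varies).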

Let us define the diffusion scaling operator, which acts on a scalar function Ξ = (Ξ(t), t ≥ 0) as

(Ξ(r2t)/r, t ≥ 0)

and is applied to vector-functions componentwise.

Let us consider the diffusion-scaled versions of the processes Ur(·), qUr(·), uXr(·), qXr(·), and Xr(·), obtained by applying this scaling operator to each of them.

8.1. Optimality of the MinDrift(U) Rule

Assume that the initial (scaled) amounts of unfinished work are deterministic and converging:

Ur(0)/r → u(0)  as r → ∞,     (37)

where u(0) is a fixed point, as defined earlier. Consequently, the initial (scaled) server workloads converge as well: uXr(0)/r → α*·u(0).

Let us define the following one-dimensional reflected Brownian motion (RBM): the RBM on [0,∞), reflected at zero, with initial state α*·u(0), drift a, and diffusion coefficient σ, driven by a standard Brownian motion B(·); the drift a and diffusion coefficient σ are given in (34) and (32), respectively.

Theorem 1: Consider the sequence of queueing systems in heavy traffic, as introduced in Section 7. Assume initial condition (37). Let the reflected Brownian motion defined by (38) and (39) be fixed.

(i) Suppose that the service discipline is such that the routing rule is MinDrift(U) with cost functions Cjr(·) for each value of the parameter r, and each server employs an arbitrary work-conserving scheduling rule (namely, the server is not allowed to idle when there is unfinished work in its queue). Then, as r → ∞, the diffusion-scaled server workload process converges in distribution to this RBM, and the diffusion-scaled unfinished work vector process converges in distribution to the process whose value at each time t ≥ 0 is the fixed point that is (uniquely) determined by the requirement that its server workload equal the value of the RBM at time t.

(ii) The service discipline defined in (i) is asymptotically optimal within the class of service disciplines satisfying condition (d0) in that it minimizes the server workload and the unfinished work cost rate at all times. More precisely, let the scaled unfinished work and server workload processes corresponding to an arbitrary service discipline G be given (appropriately constructed on a common probability space with the sequence of processes defined in (i)). Then, with probability 1, for any time t ≥ 0, the lim inf (in r) of the scaled server workload under G is no smaller than the limiting server workload attained in (i), and the lim inf of the instantaneous cost rate Σj Cj(·) under G is no smaller than that attained in (i). As a corollary, with probability 1, for any T > 0, the lim inf of the cumulative cost over [0, T] under G is no smaller than the limiting cumulative cost attained in (i).

The proof of Theorem 1 is the subject of Sections 9 and 10.

8.2. Optimality of the MinDrift(Q) Rule

Assume that, for each r, the initial state of the system at time 0 is deterministic and conforms to conditions (d1) and (d2) on a service discipline (which are assumed in Theorem 2 below). In particular, for each pair of i and j, server j has in its queue at most one customer of type i whose service has already started, and the realizations of the service times of the customers whose service has not yet started are not known to the server. For the initial residual service times vijr(0) (if any) of the customers whose service has already started, we assume

vijr(0)/r → 0  as r → ∞.

Finally, assume that the initial (scaled) amounts of Q-estimated unfinished work are converging:

qUr(0)/r → u(0)  as r → ∞,     (44)

where u(0) is a fixed point, as defined earlier.

It follows from the above initial conditions that the initial (scaled) Q-estimated server workloads converge to α*·u(0) and, in addition, with probability 1, the initial (scaled) amounts of unfinished work converge to the same limits as their Q-estimated counterparts.

For this fixed u(0), we consider the one-dimensional reflected Brownian motion with initial state α*·u(0), defined in (38).

Theorem 2: Consider the sequence of queuing systems in heavy traffic, as introduced in Section 7, and with the initial conditions described in Section 8.2.

(i) Suppose that the service discipline is such that the routing rule is MinDrift(Q) with cost functions Cjr(·) for each value of the parameter r, and each server employs an arbitrary work-conserving scheduling rule satisfying conditions (d1) and (d2). Then, as r → ∞, the diffusion-scaled server workload and Q-estimated server workload processes converge in distribution to the RBM defined above, and the diffusion-scaled unfinished work and Q-estimated unfinished work vector processes converge in distribution to the process whose value, for each t ≥ 0, is the fixed point that is (uniquely) determined by the requirement that its server workload equal the value of the RBM at time t.

(ii) The service discipline defined in (i) is asymptotically optimal within the class of service disciplines satisfying conditions (d0)–(d2) in that it minimizes the server workload, the Q-estimated server workload, and the unfinished work cost rate at all times. More precisely, let the scaled unfinished work, Q-estimated unfinished work, server workload, and Q-estimated server workload processes corresponding to an arbitrary service discipline G satisfying (d0)–(d2) be given (appropriately constructed on a common probability space with the sequence of processes in (i)). Then, with probability 1, for any time t ≥ 0, the lim inf (in r) of the scaled server workload and of the scaled Q-estimated server workload under G are no smaller than the limiting server workload attained in (i), and the lim inf of the instantaneous cost rate Σj Cj(·) under G is no smaller than that attained in (i). As a corollary, with probability 1, for any T > 0, the corresponding lim inf of the cumulative costs over [0, T] under G is no smaller than the limiting cumulative cost attained in (i).

The proof of Theorem 2 is essentially a slightly modified (and extended) version of that of Theorem 1. It is outlined in Section 11.

8.3. Customer Workload Minimization Under the MinDrift(Q) Rule

Suppose that we are within the conditions of Theorem 2. For the initial customer workloads Xr(0), we “automatically” (by (20) and (44)) have lim supr Xr(0)/r ≤ α*·u(0). Suppose, in addition, that, in fact, Xr(0)/r converges to the same limit as qXr(0)/r:

Xr(0)/r → α*·u(0)  as r → ∞,     (50)

which is equivalent to the condition that

Qijr(0)/r → 0  as r → ∞

for every nonbasic activity (i, j).

Now, any service discipline in the OQ system satisfying conditions (d0)–(d2) is within the class of disciplines for the corresponding IQ system studied in [12] (see Remark 2 in Section 3 of this article). In particular, Theorem 1 in [12] establishes that a reflected Brownian motion (RBM) with exactly the same distribution as the RBM defined in this article by (38) is in fact a stochastic lower bound of any limit of the (diffusion-scaled) customer workload process, under any service discipline satisfying conditions (d0)–(d2) (but not necessarily IR), which includes service disciplines for both OQ and IQ systems.

However, from (20) we have the pathwise relation Xr(t) ≤ qXr(t) and, by Theorem 2(i), under the MinDrift(Q) rule the scaled Q-estimated server workload process converges to this RBM. Thus, the RBM is both a lower and an upper (stochastic) bound of any limit of the scaled customer workload process and, therefore, the scaled customer workload process itself converges in distribution to the RBM. We have proved the following result, which basically says that the MinDrift(Q) rule minimizes customer workload among virtually all service disciplines in either the OQ or IQ system.

Theorem 3: Suppose that the conditions of Theorem 2 and, in addition, condition (50) hold.

(i) For the service discipline described in Theorem 2(i), we, in addition, have that the diffusion-scaled customer workload process converges in distribution to the RBM.

(ii) The service discipline described in Theorem 2(i) asymptotically minimizes customer workload among all service disciplines satisfying conditions (d0)–(d2), in either the OQ or the IQ system; namely, the customer workload process under any such discipline G can be constructed on a common probability space with the RBM, so that, with probability 1, for any time t ≥ 0, the lim inf (in r) of the scaled customer workload under G is no smaller than the value of the RBM at time t.

8.4. A Necessary Condition for Workload Minimization Under Any Service Discipline: Vanishing Nonbasic Queues

Theorems 2 and 3 show that, roughly speaking, the MinDrift(Q) rule minimizes both the server and the customer workload in the heavy traffic limit. Theorem 4 demonstrates that (either customer or server) workload in an OQ system can be minimized by some discipline only if, under this discipline, the (diffusion-scaled) queue lengths corresponding to the nonbasic activities (i, j) vanish in the limit.

Theorem 4: Suppose that the conditions of Theorem 2 and, in addition, condition (50) hold. Suppose that under some service discipline in the OQ system satisfying conditions (d0)–(d2), either the customer workload minimality property (52) or the analogous minimality property for the server workload or for the unfinished work cost holds. Then, for any t ≥ 0, the diffusion-scaled queue lengths corresponding to the nonbasic activities vanish in the limit.

The proof is presented in Section 12.

9. FLUID SAMPLE PATHS UNDER THE MinDrift RULE

9.1. Fluid Sample Paths: Definition and Basic Properties

In this section, we study the sequence of processes introduced in the previous section under the fluid (or “law of large numbers”) scaling and under the MinDrift rule. More precisely, we need to consider only sample paths of the processes under this scaling and then their limits, which we formally define here and call fluid sample paths (FSPs). The key property of FSPs that must be established (in Theorem 5) is that as time increases to infinity, the queue length vector converges to a fixed point. This attraction property is used to prove (in Section 10) the state space collapse property (i.e., the property that the limit of the sequence of diffusion scaled processes is a process “living” on the manifold of fixed points).

Throughout Section 9 we assume the CRP condition. However, all definitions and results of this section hold under the weaker FO-CRP condition, with an arbitrary fixed φ satisfying (12) and having a connected graph; in other words, the FSP definition and key properties do not require a solution φ of (12) to be unique.

First, we introduce some additional (random) functions, associated with the process for each value of the scaling parameter r. (The functions Fir(t), Qir(t), and Xr(t), have already been defined earlier.) Denote by Gijr(t) the amount of time within [0,t] that server j was serving type i customers. For each pair (i,j) we define Nijr(n) as the number of type i arrivals actually routed to server j, out of the first n new type i arrivals, and denote

Here, for any pair (i,j), sNijr(·) ≡ sNij(·) for all r, where sNij(·) are nonrandom functions fixed earlier and satisfying conditions (28)–(30).

We define the (server workload) regulation process as follows:

Yr(t) ≐ Yidler(t) + Yrouter(t),  Yrouter(t) ≐ Σi∈I Yroute,ir(t),

where Yidler(t) ≐ Σj∈J αj*[t − Σi∈I Gijr(t)] and Yroute,ir(t) is the sum, over all type i arrivals in [0, t], of the quantities αj*/μij − νi* corresponding to the servers j to which those arrivals are routed. Function Yidler is the regulation component due to physical idleness of the servers, Yrouter is the regulation component due to the (possible) routing of customers to nonbasic servers, and the functions Yroute,ir represent the contributions to Yrouter due to the different flows i ∈ I.

The regulation component Yidler(t) is clearly nonnegative and nondecreasing. It is easy to verify that each regulation component Yroute,ir(t) is also nonnegative and nondecreasing. (Also, so is, therefore, Yrouter(t).) Moreover, Yroute,ir(t) is a piecewise constant function, which is constant between type i customer arrivals, and jumps by the value αj*/μij − νi* ≥ 0 when a type i arrival is routed to server j; note that the size of the jump is strictly positive if and only if server j is nonbasic for type i. Finally, note that Yr(t) does not increase over some time interval if and only if, during that interval, none of the servers idles and all new arrivals are routed to the corresponding basic servers.

We record the above facts (along with their obvious generalization) in the following lemma for future reference.

Lemma 4: For each value of the scaling parameter r, consider a pair of time points 0 ≤ t1r < t2r < ∞ and denote

Then B0r = 0 if and only if B1,ir = 0 for all i and B2,jr = 1 for all j. Also, limr→∞ B0r = 0 if and only if limr→∞ B1,ir = 0 for all i and limr→∞ B2,jr = 1 for all j.

Let us consider the process Zr = (Ur, uXr, Fr, ΣVr, sNr, Nr, Gr, Hr, Yr, Yidler, Yrouter), where each component denotes the corresponding collection of the functions defined above (e.g., Ur = (Ujr, j ∈ J) and Fr = (Fir, i ∈ I)). For each r, consider the fluid-scaled process

zr ≐ ΓrZr,

where the fluid scaling operator Γr is applied componentwise and acts on a scalar function Ξ = (Ξ(t), t ≥ 0) as follows:

(ΓrΞ)(t) ≐ Ξ(rt)/r,  t ≥ 0.
Definition 4: A fixed set of functions z = (u, ux, f, Σv, sn, n, g, h, y, yidle, yroute) will be called a fluid sample path (FSP) if there exist a sequence of values of r increasing to infinity and a sequence of sample paths {zr} of the corresponding processes such that, as r → ∞ along this sequence, zr → z uniformly on compact sets and, in addition, the primitive components of zr (the fluid-scaled arrival, service time, and static routing functions) converge u.o.c. to their mean-rate counterparts.

Remark: A sequence of values of r, the existence of which is required in Definition 4, may be completely unrelated to the sequence {rn} we introduced earlier in the definition of the heavy traffic regime.

The following lemma establishes some basic properties of FSPs. We omit the simple proof, which is a direct consequence of the definitions involved.

Lemma 5: For any FSP z, all of its component functions are Lipschitz continuous and, in addition,

Furthermore, both y(·) and ux(·) are nondecreasing (with y(0) = 0).

Since all component functions of an FSP are Lipschitz, they are absolutely continuous, and therefore for almost all points t ∈ R+ (with respect to the Lebesgue measure), the following property holds:

Each component function of z has (finite) first derivative at t, and each function nij(·) has (finite) first derivative at λi t.

We refer to such time points t as regular. We adopt a convention that t = 0 is not a regular point (i.e., in the definition of regular points, we require that proper derivatives exist).

The dynamics of u(t) satisfies the following differential equation and additional conditions at every regular point t:

where the components of the J-dimensional vectors ρin(t) and ρout(t) are defined as

and for ρout, we have

9.2. Uniform Attraction of Fluid Sample Paths

For u ∈ R+J, denote

Φ(u) ≐ 1 − * A(u)/*A(u) if u ≠ 0, and Φ(0) ≐ 0 by convention.

Consider the following functions associated with a fixed FSP. First, define

and, similarly, J*(t) (with *A replaced by * A). Next, introduce

and note that *uj(t) is well defined since each function Cj′(·) is strictly increasing and continuous. Let *x(t) ≐ α*·*u(t), where *u(t) = (*u1(t),…,*uJ(t)), and note that ux(t) ≤ *x(t) for all t ≥ 0.
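To fix ideas, the surrounding text (in particular, the equality of the derivatives d/dt[*A(u(t))] and d/dt[Cj′(uj(t))/αj*] for j ∈ J*(t), used in the proof of Lemma 7) is consistent with the following reading of the omitted displays; we record it as an assumption about the notation, not as a quotation:

\[
{}^{*}A(u) \;=\; \max_{j\in J}\frac{C_j'(u_j)}{\alpha_j^{*}},
\qquad
{}_{*}A(u) \;=\; \min_{j\in J}\frac{C_j'(u_j)}{\alpha_j^{*}},
\qquad
\Phi(u) \;=\; 1-\frac{{}_{*}A(u)}{{}^{*}A(u)},
\]

with J*(t) the set of servers attaining the maximum at u(t) (and its counterpart, defined with * A in place of *A, the set attaining the minimum), with *uj(t) defined by Cj′(*uj(t)) = αj* · *A(u(t)), and with *x(t) = α*·*u(t). Under this reading, *uj(t) ≥ uj(t) for every j, which yields the inequality ux(t) ≤ *x(t) noted above.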

Finally, note that, at any time t, the following five conditions for u(t) are all equivalent:

  1. u(t) is a fixed point.
  2. *A(u(t)) = * A(u(t)).
  3. Φ(u(t)) = 0.
  4. *x(t) = ux(t).
  5. *u(t) = u(t).

The following sequence of lemmas establishes further properties of fluid sample paths, which are less obvious than the basic properties of Lemma 5. The form of the MinDrift rule is used in the proofs in an essential way.

Lemma 6: Consider a fixed FSP z. Suppose t > 0 is a regular point and u(t) ≠ 0. Then the following properties hold at this t:

(i) uj(t) > 0 for all j ∈ J.

(ii) We have

(iii) Moreover, there exists a constant ε1 > 0, which depends only on the system parameters, such that if, in addition, u(t) is not a fixed point (i.e., *A(u(t)) > * A(u(t))), then

Proof: Let us first prove (iii). Thus, consider a regular time point t > 0 and suppose that *A(u(t)) > * A(u(t)). The following observation holds:

Indeed, according to the MinDrift rule and Lemma 3(ii)(c), for all sufficiently large r, the prelimit path zr is such that, in a small interval [t,t + ε], ε > 0, newly arriving customers of type i cannot be routed to a server j ∈ J*(t) \ Ji. This easily implies that the corresponding FSP component nij(·) cannot increase in a small interval to the right of λi t, and therefore nij′(λi t) = 0 since t is regular. Using a similar argument, it is also easy to prove the following property:

Let us denote by I*(t) the (nonempty) subset of types i such that Ji ∩ J*(t) ≠ ∅. Since the graph

is connected, there exists at least one i ∈ I*(t) such that Ji \ J*(t) ≠ ∅, in which case (by (65)) we have the strict inequality:

If i ∈ I*(t) and Ji \ J*(t) = ∅ (i.e., Ji ⊆ J*(t)), then

Indeed, using (64), the fact that αj* μij−1 is the same across all j ∈ Ji, the inequality ∑j∈J nij′(λi t) ≤ 1, and the relation ∑j∈Ji μij φij/λi = 1, we can write

As a corollary of (64), we also obtain the following property:

We will now show that

where ε > 0 depends only on the subset J*(t). Indeed,

and (using (68), (66), and (67)) we have

We have proved (69), with ε > 0 depending only on the subset J*(t) ⊂ J. Since there are only finitely many subsets of J, we have proved the first inequality in (63), with some fixed ε1 > 0.

The second inequality in (63) is proved analogously. We denote by I*(t) the (nonempty) subset of types i such that Ji ∩ J*(t) ≠ ∅. Then we use the following property (obtained by an argument analogous to that leading to (64) and (65)):

We omit details.

The proof of the nonstrict inequalities in property (ii) is a straightforward extension of the proof of (iii); namely, we need to consider the additional (degenerate) case when J*(t) = J*(t) = J. In this case, to prove, for example, the first inequality in (62), we observe that the nonstrict inequality (67) always applies, and (70) holds with the strict inequality replaced by a nonstrict one.

Finally, (i) is proved by contradiction. Suppose uj(t) = 0 for some j ∈ J. Obviously, the set of such j is exactly J*(t). Since u(t) ≠ 0, u(t) is not a fixed point. Therefore, the second inequality in (63) should hold. However, this is impossible, because we must have uj′(t) = 0 for all j ∈ J*(t). Indeed, the condition uj(t) = 0 and the existence of uj′(t) imply that uj′(t) = 0. (Otherwise, uj(·) would be negative just before or right after time t.) █

Lemma 7: Consider a fixed FSP. Suppose a time interval [t1,t2], with 0 ≤ t1 < t2, is such that

Then, over [t1,t2], the functions *A(u(t)), * A(u(t)), *x(t), and *uj(t) for all j ∈ J are Lipschitz continuous. Moreover, for almost all t ∈ [t1,t2],

and if, in addition, *A(u(t)) > * A(u(t)) (i.e., u(t) is not a fixed point), then

where ε1 > 0 is defined in Lemma 6.

Proof: First, the Lipschitz continuity of each function Cj′(uj(t)) in [t1,t2] follows from the Lipschitz continuity of uj(·) and the fact that, for the range of possible values of uj(t) in [t1,t2], Cj′′(·) is continuous and bounded away from both infinity and zero. (This is the only place where we use the assumption that the functions Cj(·) are twice continuously differentiable.)

This implies that for an arbitrary fixed subset

, the following functions are also Lipschitz continuous in [t1,t2]:

In particular, *A(u(t)) and * A(u(t)) are Lipschitz, which (along with the fact that the second derivatives Cj′′(·) are bounded away from zero) implies that all *uj(t) and *x(t) are Lipschitz. We see that almost all points t ∈ [t1,t2] are regular (as defined earlier) and, in addition, are such that all of the max and min functions in the last display, for all (nonempty) subsets

, have derivatives. Within the present proof, let us call such points t strictly regular.

Consider an arbitrary strictly regular point t ∈ [t1,t2]. The proof will be complete once we prove (71) and (72) for this point t. Since t is strictly regular, the derivatives d/dt[*A(u(t))] and d/dt[Cj′(uj(t))/αj*] for j ∈ J*(t) are all equal. (In particular, this implies that *uj′(t) = uj′(t) for all j ∈ J*(t).) We cannot have d/dt[*A(u(t))] > 0 because this would imply that uj′(t) > 0 for all j ∈ J*(t), which would contradict (62). This proves the first (and, with it, the last) inequality in (71). The second inequality in (71) is proved analogously.

We can now write

where the inequality follows from the fact that *uj′(t) ≤ 0 for all j ∈ J (which is implied by (71)), and the second equality holds because *uj′(t) = uj′(t) for j ∈ J*(t). In the case *A(u(t)) > * A(u(t)), by (63), the RHS of the above display is bounded above by −ε1, which proves (72). █
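A plausible form of the omitted display, matching the justification just given (and stated under the notational reading sketched after the definition of Φ), is

\[
\frac{d}{dt}\,{}^{*}x(t)
\;=\;\sum_{j\in J}\alpha_j^{*}\,{}^{*}u_j'(t)
\;\le\;\sum_{j\in J^{*}(t)}\alpha_j^{*}\,{}^{*}u_j'(t)
\;=\;\sum_{j\in J^{*}(t)}\alpha_j^{*}\,u_j'(t),
\]

the inequality holding because the terms dropped are nonpositive.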

Lemma 8: Consider a fixed FSP z. Suppose u(t1) ≠ 0 for some t1 ≥ 0. Then u(t) has all strictly positive components (i.e., u(t) ∈ R++J) for all t > t1. Moreover, in [t1,∞), * A(u(t)) is nondecreasing, and both *A(u(t)) and *x(t) are nonincreasing.

Proof: Indeed, we can always find a regular point ξ > t1 arbitrarily close to t1 so that u(ξ) ≠ 0. By Lemma 6, u(ξ) ∈ R++J. Then, using Lemma 7, it follows that * A(u(t)) is nondecreasing (and *A(u(t)) and *x(t) are nonincreasing) starting from time ξ, and therefore u(t) ∈ R++J for all t ≥ ξ. Since ξ can be chosen arbitrarily close to t1, the proof is complete. █

Lemma 9: Consider a fixed FSP z. If u(0) = 0, then u(t) = 0 for all t ≥ 0.

Proof: Suppose not. By continuity of *x(·), for an arbitrary ε > 0, there exists a time t1 > 0 at which *x(t) reaches level ε for the first time. Of course, u(t1) ≠ 0. By Lemma 8, *x(t) cannot increase starting at time t1, and therefore *x(t) ≤ ε for all t ≥ 0. Since ε > 0 can be chosen arbitrarily small, *x(t) ≡ 0, and therefore u(t) = 0 for all t ≥ 0. █

The following theorem easily follows from the lemmas presented earlier in this subsection.

Theorem 5: For any fluid sample path, Φ(u(t)) is a nonincreasing function, and the server workload ux(t) is a nondecreasing function. Moreover, there exist fixed constants T1 > 0 and K ≥ 1 such that, for any FSP, u(t) reaches a fixed point u within the finite time ux(0)T1 and then stays there, with α*·u ≤ ux(0)K.

Proof: The fact that ux(t) is nondecreasing has already been established earlier (Lemma 5).

Suppose u(0) ≠ 0. By Lemma 8, * A(u(t)) is nondecreasing and *A(u(t)) is nonincreasing in [0,∞), and therefore Φ(u(t)) is nonincreasing. Further, by Lemma 8, u(t) ∈ R++J for all t > 0. Then, by Lemma 7, for almost all t > 0, *x(t) > ux(t) implies

Since *x(0) ≤ ux(0)[∑j αj*]/[minj αj*], ux(t) ≤ *x(t), and ux(t) is nondecreasing, we immediately see that u(t) must reach a fixed point within a time proportional to ux(0).
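To make the last step concrete (a sketch only; the constants below are one admissible choice and not necessarily those appearing in the omitted displays): while u(t) is not a fixed point, we have *x(t) > ux(t), and (72) combined with the monotonicity of ux(·) gives

\[
\frac{d}{dt}\bigl({}^{*}x(t)-u_x(t)\bigr)\;\le\;-\varepsilon_1 .
\]

Hence a fixed point is reached no later than time (*x(0) − ux(0))/ε1 ≤ *x(0)/ε1 ≤ ux(0)[∑j αj*]/[ε1 minj αj*], so one may take T1 = [∑j αj*]/[ε1 minj αj*]; moreover, at that time α*·u = ux(t) ≤ *x(0) ≤ ux(0)K with K = [∑j αj*]/[minj αj*].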

Therefore, the statement of the theorem, with some fixed T1 > 0 and K ≥ 1, holds for the FSPs with u(0) ≠ 0. By Lemma 9, it trivially holds for u(0) = 0 as well. █

For future reference, we record the following property of prelimit paths.

Lemma 10: There exists a constant ε2 > 0 such that the following holds. For any prelimit (scaled) path ur = (ur(t), t ≥ 0), and 0 ≤ t1r < t2r < ∞, the property

implies that yr(t2r) − yr(t1r) = 0 or, equivalently, that in the (scaled) interval (t1r,t2r], all newly arriving customers are routed to their corresponding basic servers and none of the servers idles.

Proof: First, since ur(t) ≠ 0 in [t1r,t2r], none of the servers idles in this time interval. Second, a small value of Φ(ur(t)) implies that the vector (C1′(u1r(t)),…,CJ′(uJr(t))) is “almost proportional” to the vector α*. Thus, if Φ(ur(t)) is small, it follows from the form of the MinDrift(U) rule and Lemma 3(ii)(c) that, in [t1r,t2r], a new arrival of any type i ∈ I can only be routed to a server j ∈ Ji. We omit the ε-δ formalities. █

10. PROOF OF THEOREM 1

For each

, consider the following process, obtained by diffusion scaling:

where the diffusion scaling operator

is defined in (36).

To prove the properties stated in Theorem 1, it will suffice to show that for any subsequence

there exists another subsequence

such that these properties hold when r → ∞ along

. As in [14], to do this, we will choose subsequence

and construct all processes (for all

) on the same probability space in such a way that the desired properties hold with probability 1 (or are implied by certain probability 1 properties).

Let us fix an arbitrary subsequence

of indices {r}. According to Skorohod's representation theorem (see, for example, [7]), for each i, the sequences (on r) of the processes {Fir} and {ΣVijr}, j ∈ J, can be constructed on a probability space such that the convergence in (31) holds u.o.c. with probability 1 (w.p.1):

where Bi is a standard Brownian motion.

We can and do assume that our underlying probability space Ω = {ω} is a direct product of the above I probability spaces. (Without loss of generality, we assume that this probability space is complete.) On this probability space the convergence (33) holds u.o.c. w.p.1 as well:

where B is a standard Brownian motion.

Now, from condition (25) and Bramson's weak law estimates ([4, Prop. 4.3]), we know that for any T3 > 0, any ε > 0, and any i, for all large r, we have (see the proof of property (5.19) in Proposition 5.1 of [4])

(The max in (75) and (76), as well as in (77) and (78), is over integers l ∈ [0,T3 r].) Also, using Proposition 4.2 of [4], it is easy to show (similarly to the derivation of property (71) in [14]) that for any T3 > 0, any ε > 0, and any pair (i,j), for all large r, we have

Estimates (75) and (76) enable us to choose a subsequence

, such that as r → ∞ along

, with probability 1, for any T3 > 0 we have

and

Properties (77) and (78), in turn, imply the following property.

With probability 1, for any fixed T4 > 0 and d > 0, for any (i,j), we have the following:

Uniformly on any sequence of pairs

, such that 0 ≤ t1r < t2r ≤ r2T4 and t2r − t1r ≥ rd,

Uniformly on any sequence of pairs

, such that 0 ≤ l1r < l2r ≤ r2T4 and l2r − l1r ≥ rd,

For each j ∈ J, we have

and therefore the expression for the scaled server workload can be written as

where

is the term (82),

denotes the term (83),

denotes the term (84), and

. We know from (74) that

where

B(·) is the realization of a standard Brownian motion, and the parameters a and σ are those defined in (34). (The realization

is, of course, continuous.) As seen from (85), the key step in proving Theorem 1 will be the proof of the following convergence:

In the rest of this section, we restrict ourselves to a (measurable, probability 1) subset Ω2 ⊆ Ω of elementary outcomes ω such that all of the probability 1 properties specified above hold when r → ∞ along

.

Lemma 11: Consider a fixed ω ∈ Ω2. As r → ∞ along

, the functions

, and then

, are “asymptotically close” in the following sense: for any fixed T4 > 0 and any fixed δ1 > 0 and δ2 > 0, for all sufficiently large r, uniformly in t ∈ [0,T4],

and then

The proof of Lemma 11 is analogous to the proof of Lemma 9 in [12]. The key observation here (which follows from (79) and (80)) is that, for fixed T4 > 0 and (arbitrarily small) d > 0, if t ∈ [0,T4] and |Hijr(Fir(r2t))| ≥ rd, then for all sufficiently large r, the ratio

is close to unity. We do not present the details. We note that (analogous to the situation with Lemma 9 in [12]) Lemma 11 applies to any service discipline satisfying condition (d0), and the uniqueness of φ (in the CRP condition) is used in the proof of Lemma 11 in an essential way.

It follows from Lemma 11 that to prove (86), it suffices to prove

because

is bounded on finite intervals.

Since regulation

is a nondecreasing function (for any r), for any fixed ω ∈ Ω2, from any subsequence

(which may depend on ω!), it is always possible to find a further subsequence

such that

where

is some nondecreasing RCLL function. (We will prove that this limit

is indeed the regulation of the one-dimensional Brownian motion defined earlier.) In principle,

may take the value +∞. (In other words,

.) (Recall that the notation “⇒” stands for convergence at every point of continuity of the limit function, except maybe at the point 0.) We note that (90) implies that

and therefore

if and only if

.

The following lemma (and its proof) is analogous to Lemma 7 in [14] and Lemma 10 in [12]; it contains key observations used in the proof of Theorem 1. The key construction of the proof, which involves “slowing down” the diffusion-scaled process to consider a family of processes on the “fluid” time scale and then exploiting the uniform attraction property of fluid sample paths, is essentially the same as that in Section 5 of [4]. This construction is central to establishing state space collapse (SSC) in the heavy traffic asymptotic regime for multiclass queuing networks (see [4,19]).

Lemma 12: Suppose that the service discipline is such that the routing rule is MinDrift(U) and the scheduling rules at the servers are work-conserving. Suppose that ω ∈ Ω2 and a subsequence

are fixed such that, along this subsequence, (90) holds. Suppose a sequence

is fixed such that

Let δ > 0 be fixed and

Then the following hold:

(a)

is finite in [0,t′ + δ].

(b)

does not increase in

.

(c) The following bound holds

with K defined in Theorem 5.

(d) For any δ′ > 0,

where

is the (unique) fixed point such that

.

If, in addition,

for all r, and

, where

is a fixed point (necessarily, with

), then the following hold:

Proof: The proof essentially repeats that of Lemma 10 in [12]. For completeness, and since some adjustments are required, we present it here.

Let us consider the functions of interest on the fluid time scale; namely, consider the earlier defined functions

, and the similarly defined function wr and other related ones.

Let us choose a fixed T > 0 as follows. Let us fix ε3 ∈ (0,C − ε), denote

and fix arbitrary

where K and T1 are the constants defined in Theorem 5. As seen later in the proof, C3 will be the upper bound of

in the interval

or, equivalently, the upper bound of uxr(·) in the interval

. Thus, the choice of the constant T is such that an FSP with initial server workload not exceeding C3 will converge to a fixed point within time T.

For each integer l ∈ [0,2δr/T], consider

and the similarly defined wr,l, yr,l, and other related functions.

Let us fix an arbitrary ε4 ∈ (0, ε2/2), where ε2 is defined in Lemma 10. Then the following property holds.

Property 1: For all sufficiently large r, relation (92) below holds for all integer l ∈ [0,2δr/T], and relations (93)–(95) hold for all integer l ∈ [1,2δr/T]:

To prove Property 1, we first observe that (92) must hold for l = 0 for all large r, because otherwise we would be able to choose a subsequence of indices r along which the sequence of paths zr,0 converges to an FSP z with ux(0) = C and either ux(ξ) > CK or ux(ξ) < C for some ξ ∈ [0,T], which contradicts Theorem 5. Moreover, this observation shows that in fact for all large r,

and, given our choice of the constant T,

Next, suppose Property 1 does not hold. Then we can choose an (infinite) subsequence of r such that (along this subsequence) l′ = l′(r) is well defined as the smallest l ≥ 1 such that one of the conditions (92)–(95) does not hold. (Note that, by this construction and (97), property (93) always holds for l = l′ and ξ = 0.) We will show that this construction leads to a contradiction.

Indeed, for all large r, both (92) and (93) hold for l = l′ and ξ = 0. This follows from the combination of the following facts:

  1. Property (96).
  2. as long as ξ1, ξ2 ∈ [t′ − 3δ,t′ + 3δ] ∩ R+.
  3. uniformly in [t′ − 3δ,t′ + 3δ] ∩ R+.
  4. Property (95) for each 1 ≤ ll′ − 1.
  5. The functions
    are asymptotically close (in the sense of Lemma 11).

Since (92) and (93), with l = l′ and ξ = 0, hold for large r, we see that (93) and (94) hold for l = l′ and all large r. (Otherwise, we would be able to choose a subsequence of r along which zr,l converges to an FSP z, violating Theorem 5.) Similarly, the lower bound in (92) must hold (for large r) for l = l′ and ξ ∈ [0,T]. This, in conjunction with (94) and Lemma 10, means that (95) holds for l = l′ (for large r). Finally, this and the argument we already used to prove bound (92) for l = l′, ξ = 0, show that in fact (92) holds for l = l′ and all ξ ∈ [0,T] (for large r). We have proved that (92)–(95) hold for l = l′(r) for all large r. This contradicts the construction of the function l′ = l′(r), which proves Property 1.

Property 1 (namely (92)) implies that, for all large r,

Statements (a)–(c) of the lemma follow from this estimate.

To prove (d), we first notice that (a), (b), and Lemma 11 imply the following uniform convergence for the workload process:

Statement (d) then follows from Property 1, the fact that ε4 can be chosen arbitrarily small, and convergence (98).

To prove properties (c′) and (d′), we use exactly the same construction. It is easy to see that, under the additional assumptions, all conditions (92)–(95) in Property 1 hold for all integers l ∈ [0,2δr/T] (including l = 0). Given this, properties (c′) and (d′) are proved analogously to properties (c) and (d). We omit the details. █

The rest of the proof of Theorem 1 repeats that of Theorem 1 in [14], and that of Theorem 1 in [12], virtually verbatim. We reproduce it here, with the necessary minor adjustments, for completeness.

10.1. Proof of Theorem 1(i)

To prove this part it suffices to prove the following:

Property 2: As r → ∞ (along

), for any ω ∈ Ω2 (i.e., with probability 1), we have the following convergence:

where

is defined by (39), and

where for each

is the fixed point such that

.

Proof of Property 2: Let us fix ω ∈ Ω2. As explained earlier, for an arbitrary subsequence

there exists another subsequence

such that the convergence (90) holds along this subsequence. Then, the proof of Property 2 will be complete if we can prove the following statements (for the chosen ω, with r → ∞ along

. We recall that, at this point, the function

is just some limit function—the fact that it is equal to the function defined by (39) is what needs to be proved in order to establish (100).

Step 1. The limit function

is finite everywhere in [0,∞).

Step 2. The function

is continuous, and

.

Step 3. If

, then t is not a point of increase of

.

Step 4. The function

, defined above as a limit, satisfies (39).

Step 5. Convergence (100) holds.

In this proof, we will use the convention that

. So, the case

will be viewed as a discontinuity of y (and x) at 0. Also, we will use the notation

Proof of Step 1. Suppose the statement does not hold. Denote

. The inf is attained because

is RCLL.

We choose a small δ such that δ ∈ (0,t*) if t* > 0, and an arbitrary δ > 0 if t* = 0. Let us fix ε = ε(4δ,t*). Then we choose a small Δt ∈ (0,δ) and a large C such that

. We define

and choose a further subsequence of {r} such that

(We must have t′ ≤ t*, because the limit function

, and therefore

, is infinite for all t ≥ t*.) It is also easy to see (from (77)) that

The conditions of Lemma 12 are satisfied, and so y is bounded in [t′,t′ + δ], a contradiction, since t′ + δ > t*. Step 1 has been proved.

Proof of Step 2. Suppose that the statement does not hold. The contradiction is obtained in much the same way as in the proof of Step 1. Let t* be a discontinuity point (the case t* = 0 is included) (i.e.,

). Since

is continuous,

. There are two possible cases:

Case (a). In this case, we must have t* > 0. (Indeed, by the definition of

and our conventions,

. If

, then, by Lemma 12(c′),

, which means that

, and therefore

, has no jump at 0. If

, then

.) We can always fix a small δ > 0 and small Δt ∈ (0,δ), such that t′ = t* − Δt is a point of continuity of

(and

) and

. We have convergence

(since

is continuous at t′), and by Lemma 12,

cannot increase in the interval (t′,t′ + δ], which contains t*. So,

cannot have a jump at t*.

Case (b). In this case, let us fix a small C > 0 and then a sufficiently small δ > 0 so that

where ε = ε(4δ,t*) and K ≥ 1 is defined in Theorem 5 (and used in Lemma 12). Then if t* > 0, we fix a small Δt such that

If t* = 0, we fix an arbitrary Δt > 0. We define

and choose a further subsequence of {r} such that

The conditions of Lemma 12 are satisfied, and so

for all t ∈ [t′,t′ + δ], which contradicts the assumption of case (b), since t* belongs to the latter interval. Step 2 has been proved.

Proof of Step 3. Let t* ≥ 0 be such that

. If t* = 0, then the fact that

does not increase in a small interval [0,δ] follows from Lemma 12(b′). If t* > 0, then precisely the same construction as in the proof of Step 2(a) shows that

does not increase in a small interval [t′,t′ + δ] containing t* in its interior. Step 3 has been proved.

Proof of Step 4. Follows from the statements of Steps 2 and 3 and Proposition 1 (in the Appendix).

Proof of Step 5. It suffices to show the following:

For any t* ≥ 0 and any ε > 0, there exists δ > 0 such that

(The u.o.c. convergence will then follow from the Heine–Borel lemma.)

If

, then (101) must hold because both functions

(for large r) are bounded by an arbitrarily small constant in a sufficiently small neighborhood of t*. If

, then (101) follows from Lemma 12(d′). If

, then to obtain (101) we can repeat the construction of the proof of Step 2(a) and then apply Lemma 12(d). Step 5 has been proved.

Thus, the proof of Property 2, and with it the proof of statement (i) of the theorem, is complete. █

10.2. Proof of Theorem 1(ii)

We use the same construction of the probability space Ω, the subsequence

, and the probability 1 subset Ω2, as specified earlier. Consider an arbitrary discipline G. Sample paths for both the MinDrift and G disciplines are constructed on this common probability space. For ω ∈ Ω2, consider paths of

, corresponding to the discipline G. Since

is invariant with respect to the discipline,

, and therefore

u.o.c.

We claim that, along the subsequence

, for any t ≥ 0,

and therefore (40) holds. To prove this, we first recall that Lemma 11 holds for any discipline G satisfying condition (d0).

For any subsequence

, we can choose a further subsequence

such that

, where

is some nondecreasing nonnegative RCLL function. (The case that

takes value +∞ starting from some finite time t* is possible.) Therefore, for any t ≥ 0 where

is continuous, as r → ∞ along

,

Since

is nonnegative, we see that

is nonnegative at every point of continuity of

, and therefore it is nonnegative for all t ≥ 0 (by right-continuity). Then, by Proposition 1(ii) (in the Appendix),

for all t ≥ 0. Since

is continuous and nondecreasing and

and all

are nondecreasing, for any t ≥ 0 we obtain the uniform bound

By Lemma 11, we have an analogous bound for

as well:

which proves (102), with r → ∞ along subsequence

, and therefore along

as well (since the subsequence

can be arbitrary). The proof of (102) (and therefore (40)) is complete.

Since the function ∑j Cj(uj) is continuous in the vector u, and the fixed point

in (41) minimizes the value of ∑j Cj(uj) over vectors u with server workload

, property (41) also holds. Finally, the equality in (42) follows from the fact that

u.o.c., and the inequality follows from (41) and Fatou's lemma.

The proof of Theorem 1 is now complete.

11. PROOF OF THEOREM 2

The proof of Theorem 2 is a relatively straightforward extension of that of Theorem 1. The extension is based on the fact that, given the assumptions of Theorem 2(ii), the processes

are in fact “asymptotically close” (see (103)). As a result, the behavior of the system (in the diffusion limit) under MinDrift(Q) is the “same” as that under MinDrift(U). In this section, we provide a detailed sketch of such a proof extension. We believe the details can easily be filled in by the reader.

Construction of the probability space and subsequences

. For the proof of Theorem 2, we assume that, for each r, the service times of the “initial customers” of type i at server j, whose service has not yet started at the initial time 0, are given by an i.i.d. sequence vijr(n), n = 1,2,…. This sequence is separate from the sequence defining the service times of type i customers arriving at server j after time 0, but, of course, its elements have the same distribution. We denote by

the total amount of unfinished work “contained” in the first n (in the order in which they are taken for service) initial type i customers at server j. As with the other functions, we extend the domain of ΣVijr(·) to all nonnegative real t and denote its fluid-scaled version by Σvijr = Γr ΣVijr.

The underlying probability space is the same as in the proof of Theorem 1, except it is augmented by taking a direct product with the space on which the sequences (on r) of the processes {ΣVijr} are defined. The subsequence

is defined exactly the same way. The property analogous to (78) holds for the processes Σvijr, as well as Σvijr. Then, the subsequence

can be chosen in a way such that, additionally, the properties analogous to (78) and (80) hold for the processes Σvijr and ΣVijr, respectively.

“Asymptotic closeness” of

. Using (78) and (80), and their analogs for Σvijr and ΣVijr, it is easy to demonstrate the following property, which holds for any service discipline within the class specified in Theorem 2(ii):

As r → ∞ along

, with probability 1, for any T3 > 0 and any ε > 0, we have

This property is the key to showing that, in the heavy traffic limit, MinDrift(Q) induces the same system behavior as MinDrift(U).

Definition and properties of the FSPs under MinDrift(Q). The process Zr is augmented by the following components:

The fluid-scaled process zr and an FSP z are augmented by the corresponding components qr, Σvr, qur, qxr, and q, Σv, qu, qx, respectively. The definition of the FSP is the same, except it includes the additional conditions

and an analog of (57) for Σvijr. This augmented definition of an FSP easily yields the following additional FSP property (which can be added into Lemma 5):

which, of course, also implies qx(t) = ux(t), t ≥ 0. Using (104), it can be easily shown that all of the FSP properties established for MinDrift(U) hold for MinDrift(Q) as well.

Proof of Theorem 2. Given property (103) and the fact that FSPs under MinDrift(Q) satisfy all of the properties of FSPs under MinDrift(U) (plus (104)), the rest of the proof is the same as that of Theorem 1.

12. PROOF OF THEOREM 4

Before proceeding with the proof, note that, in addition to (51), we have another pathwise relation (see (21) and (22)):

Proof: The argument leading to Theorem 3 shows that, given the conditions of Theorem 4, either of the convergences (53) or (54) implies (52). So, it will suffice to prove that (52) implies (55).

Assume (52). Suppose, for some t1 ≥ 0, property (55), with t = t1, does not hold. This implies that, in addition to the pathwise inequality

(see (51)), we have, for some fixed constant c > 0,

Consider a subsequence of indices r along which the distributions of both

(weakly) converge to some distributions, which we denote η and qη, respectively. Distribution η is necessarily equal to the distribution of

, and qη (stochastically) dominates η and is not equal to η, which follows from (106).

Consider the processes

restarted at time t1. Let us fix arbitrary t2, t1 < t2 < ∞. Then we have

where

Since pathwise inequality (105) holds, we see that (107) holds for the process

as well:

Consider now an RBM

with the drift a and diffusion coefficient σ (same as for the RBM

), defined within interval [t1,t2], with the initial distribution qη at time t1. Since qη strictly dominates η,

Using Theorem 2(ii), it is easy to see that the RBM

is an asymptotic (stochastic) lower bound of the sequence of processes

(in the sense specified in Theorem 2(ii)). From this fact we see that for any δ > 0, we can choose a sufficiently small ε > 0, so that

If we fix δ ∈ (0, p1 − p2) and a corresponding ε > 0 as above, we obtain the following from estimates (110) and (108):

a contradiction, which completes the proof. █

13. STABILITY VERSUS HEAVY TRAFFIC WORKLOAD MINIMIZATION

Consider a special case of the MinDrift routing rule, with cost functions Cj(ζ) = (½)ζ2 (see Section 4.4). The corresponding MinDrift(Q) rule is as follows: route an arriving type i customer to a server j such that

Our heavy traffic results apply to this rule. (The form of the rule does not change with r, as explained in Section 8.)
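To make the rule concrete, the following is a minimal Python sketch of this routing decision (it is not pseudocode from the paper). It assumes that the omitted display above takes the natural quadratic-cost form, routing a type i arrival to a server j minimizing the queue-length-based workload estimate ∑k Qkj/μkj divided by μij; the eligibility test μij > 0 and the tie-breaking by smallest server index are illustrative assumptions as well.

def mindrift_q_route(i, Q, mu):
    """Quadratic-cost MinDrift(Q) routing sketch.

    Q[k][j]  : current queue length of type k at server j
    mu[k][j] : service rate of type k at server j (0 marks an infeasible activity)
    Returns the index of the server chosen for an arriving type i customer.
    """
    best_j, best_val = None, float("inf")
    for j in range(len(mu[i])):
        if mu[i][j] <= 0.0:
            continue  # server j cannot serve type i at all
        # queue-length-based estimate of server j's unfinished work
        u_hat = sum(Q[k][j] / mu[k][j] for k in range(len(Q)) if mu[k][j] > 0.0)
        # C_j'(.) is the identity for C_j(z) = z^2/2, so the routing index is u_hat / mu_ij
        val = u_hat / mu[i][j]
        if val < best_val:
            best_j, best_val = j, val
    return best_j

# Illustrative numbers only: 2 types, 3 servers.
Q = [[4, 0, 2], [1, 3, 0]]
mu = [[1.0, 0.5, 2.0], [0.0, 1.0, 1.0]]
print(mindrift_q_route(0, Q, mu))  # prints 2 for these numbers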

Using the approach of [1,2,6,14,15], it is not hard to show that under this routing rule (plus arbitrary scheduling rules satisfying (d1) and (d2)), both the queue length process ((Qij(t), i ∈ I, j ∈ J), t ≥ 0) and the unfinished work process ((Uj(t), j ∈ J), t ≥ 0) are stable, as long as the vector of mean rates λ is within the system stability region

, defined in Section 5. In fact, as explained in [14], the analysis of the FSPs (Section 9), required to establish the heavy traffic results, is essentially a “superset” of the analysis needed to prove stability. (We do not provide details of the stability proof, as it is not the focus of this article.)

Thus, the above rule is able to both keep queues stable (as long as

) and minimize system workload in the heavy traffic limit. The Gcμ scheduling rule for the IQ system possesses the same property (see [12]) and so does the MaxWeight scheduling rule for a different, but closely related, “generalized switch” model (see [14]). All of these results may suggest the intuition that a dynamic service discipline that keeps queues stable (as long as

), “typically” will also minimize system workload in heavy traffic. Such a “conjecture” cannot be formally correct, because it is not hard to devise some contrived disciplines, for which it does not hold. We note, however, that this conjecture does not hold even for very natural service disciplines, as the following example demonstrates.

Consider a service discipline for our OQ system, which strives to minimize the drift of the cost (Lyapunov) function

Then the discipline has the following form. (It is close to the class of network scheduling disciplines introduced in [15].)

Routing rule (“Join the shortest queue of your type”): Route an arriving type i customer to a server j such that

Scheduling rule (“Gcμ within each server”): Server j picks a customer of type i such that

This discipline ensures stability of the queues, when

. (Again, the approach and techniques of [1,2,6,14,15] can be applied.)
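The two rules above can be illustrated in the same way; the Python sketch below is again an assumption-laden illustration rather than the paper's pseudocode. It takes the omitted routing display to mean “route to a server j with μij > 0 minimizing Qij” and the omitted scheduling display to be the quadratic-cost Gcμ weight “serve a type i maximizing μij Qij”; the tie-breaking is arbitrary.

def jsq_route(i, Q, mu):
    """'Join the shortest queue of your type': smallest Q[i][j] among feasible servers."""
    feasible = [j for j in range(len(mu[i])) if mu[i][j] > 0.0]
    return min(feasible, key=lambda j: Q[i][j])

def gcmu_schedule(j, Q, mu):
    """'Gcmu within server j' (quadratic costs): serve the type maximizing mu[i][j] * Q[i][j]."""
    candidates = [i for i in range(len(Q)) if mu[i][j] > 0.0 and Q[i][j] > 0]
    if not candidates:
        return None  # server j idles only when all of its queues are empty
    return max(candidates, key=lambda i: mu[i][j] * Q[i][j])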

However, it is not hard to see that, under this discipline and under the conditions of Theorem 4, condition (55) cannot possibly hold, as long as μij > 0 for at least one nonbasic activity (i,j). We do not provide a formal proof here. The key part of such a proof is showing the following very intuitive fact, which is implied by the nature of the routing rule: FSPs under this discipline are such that if the initial workload is nonzero, then after some finite time, all nonbasic queue lengths are bounded away from zero. We also exploit the fact that, since the limiting workload process is lower bounded by an RBM, at any time t > 0 the limiting workload is nonzero with nonzero probability. Thus, by Theorem 4, none of the workload minimization properties (52)–(54) can hold.

Acknowledgment

The author is very grateful to Avi Mandelbaum for bringing output-queued flexible server systems to his attention and encouraging this work.

APPENDIX: The One-Dimensional Skorohod Problem

The following proposition describes standard properties of solutions to the one-dimensional Skorohod problem. (See, for example, [5] for the proof; the proof is also contained in the proof of Theorem 5.1 of [18].)

Proposition 1: Let w = (w(t), t ≥ 0) be a continuous function in D([0,∞),R) such that w(0) ≥ 0. Then the following hold:

(i) There exists a unique pair (x,y) of functions in D([0,∞),R) such that the following hold:

(a) x(t) = w(t) + y(t) ≥ 0, t ≥ 0.

(b) y is nondecreasing and nonnegative.

(c) y(0) = 0.

(d) For any t ≥ 0, if x(t) > 0, then t is not a point of increase of y; that is, there exists δ > 0 such that y(ξ) is constant in [t − δ,t + δ] ∩ R+. This unique pair is (x,y), where

(ii) For any pair (x,y) of functions in D([0,∞),R) satisfying (a) and (b), we have

REFERENCES

Andrews, M., Kumaran, K., Ramanan, K., Stolyar, A.L., Vijayakumar, R., & Whiting, P. (2004). Scheduling in a queueing system with asynchronously varying service rates. Probability in the Engineering and Informational Sciences 18: 191–217.
Armony, M. & Bambos, N. (1999). Queueing networks with interacting service resources. In Proceedings of the 37th Annual Allerton Conference, pp. 42–52.
Bell, S.L. & Williams, R.J. (2001). Dynamic scheduling of a system with two parallel servers in heavy traffic with complete resource pooling: Asymptotic optimality of a continuous review threshold policy. Annals of Applied Probability 11: 608–649.
Bramson, M. (1998). State space collapse with applications to heavy traffic limits for multiclass queueing networks. Queueing Systems 30: 89–148.
Chen, H. & Mandelbaum, A. (1991). Leontief systems, RBV's and RBM's. In M.H.A. Davis & R.J. Elliott (eds.), Applied stochastic analysis. Gordon and Breach Science, pp. 1–43.
Dai, J.G. & Prabhakar, B. (2000). The throughput of data switches with and without speedup. In Proceedings of INFOCOM 2000, pp. 556–564.
Ethier, S.N. & Kurtz, T.G. (1986). Markov processes: Characterization and convergence. New York: John Wiley & Sons.
Harrison, J.M. (1998). Heavy traffic analysis of a system with parallel servers: Asymptotic optimality of discrete review policies. Annals of Applied Probability 8: 822–848.
Harrison, J.M. & Lopez, M.J. (1999). Heavy traffic resource pooling in parallel-server systems. Queueing Systems 33: 339–368.
Kelly, F.P. & Laws, C.N. (1993). Dynamic routing in open queueing networks: Brownian models, cut constraints and resource pooling. Queueing Systems 13: 47–86.
Laws, C.N. (1992). Resource pooling in queueing networks with dynamic routing. Advances in Applied Probability 24: 699–726.
Mandelbaum, A. & Stolyar, A.L. (2004). Scheduling flexible servers with convex delay costs: Heavy-traffic optimality of the generalized cμ-rule. Operations Research 52(6): 836–855.
McKeown, N., Anantharam, V., & Walrand, J. (1996). Achieving 100% throughput in an input-queued switch. In Proceedings of INFOCOM'96, pp. 296–302.
Stolyar, A.L. (2004). MaxWeight scheduling in a generalized switch: State space collapse and workload minimization in heavy traffic. Annals of Applied Probability 14(1): 1–53.
Tassiulas, L. & Ephremides, A. (1992). Stability properties of constrained queueing systems and scheduling policies for maximum throughput in multihop radio networks. IEEE Transactions on Automatic Control 37: 1936–1948.
Teh, Y. & Ward, A. (2002). Critical thresholds for dynamic routing in queueing networks. Queueing Systems 42: 297–316.
Van Mieghem, J.A. (1995). Dynamic scheduling with convex delay costs: The generalized cμ rule. Annals of Applied Probability 5: 809–833.
Williams, R.J. (1998). An invariance principle for semimartingale reflecting Brownian motions in an orthant. Queueing Systems 30: 5–25.
Williams, R.J. (1998). Diffusion approximations for open multiclass queueing networks: Sufficient conditions involving state space collapse. Queueing Systems 30: 27–88.
Williams, R.J. (2000). On dynamic scheduling of a parallel server system with complete resource pooling. Fields Institute Communications 28: 49–71.