
TRANSMISSION RATE CONTROL OVER RANDOMLY VARYING CHANNELS

Published online by Cambridge University Press:  01 January 2005

Gauresh Rajadhyaksha
Affiliation:
Department of Electrical and Computer Engineering, University of Texas, Austin, TX 78712, E-mail: gaureshr@mail.utexas.edu
Vivek S. Borkar
Affiliation:
School of Technology and Computer Science, Tata Institute of Fundamental Research, Mumbai 400005, India, E-mail: borkar@tifr.res.in

Abstract

The problem of controlling transmission rate over a randomly varying channel is cast as a Markov decision process wherein the channel is modeled as a Markov chain. The objective is to minimize a cost that penalizes both buffer occupancy (equivalently, delay) and power. The nature of the optimal policy is characterized using techniques adapted from classical inventory control.

Research Article

© 2005 Cambridge University Press

1. INTRODUCTION

In a recent article, Goyal, Kumar, and Sharma [4] considered the problem of optimal transmission over a fading channel to minimize mean delays subject to a power constraint. By casting the problem as a constrained Markov decision problem, they obtained structural results for the optimal policy, improving upon the earlier results of Berry and Gallager [1]. They considered an additive Gaussian noise model for the channel and explicitly used the Shannon formula for its information-theoretic capacity in their analysis. Our aim here is to view the problem in a more abstract framework, with the effect of the randomly varying channel on the power required being modeled as a Markov chain. Our key observation is the similarity of this problem to classical inventory control [2,3]. Based on this observation, we are able to recover some structural results for the optimal policy through a fairly simple analysis.

The remainder of this section describes the precise model. The finite horizon control problem is analyzed in the next section, and the infinite horizon discounted problem is analyzed in Section 3.

Consider buffered packet traffic with the evolution of buffer occupancy denoted by

$$x_{k+1} = x_k - w_k + u_k, \qquad k \ge 0. \tag{1}$$

Here, $u_k$ denotes the number of arriving packets and $w_k \in [0,K]$, $K > 0$, is the number of packets transmitted in time slot $k$, with the constraint that $w_k \le x_k$ for all $k$. (Note that we treat $x_k$ and $w_k$ as continuous valued variables, a fairly common approximation.) Under the reasonable condition that $x_0 \ge 0$, this then ensures that $\{x_k\}$ is an $\mathbb{R}^+$-valued process. The arrivals $\{u_k\}$ are assumed to be independent and identically distributed (i.i.d.) with law $\zeta$ supported on $[0,M]$ for a large finite $M > 0$,¹ and the transmissions $\{w_k\}$ are required to satisfy the natural "admissibility condition": $w_k$ is independent of $u_j$, $j \ge k$, and, for $\{c(m)\}$ defined below, it is conditionally independent of $c(j)$, $j > k$, given $c(k)$ for all $k$.

¹ $\zeta$ supported on $\mathbb{R}^+$ can be allowed under mild technical conditions.
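To fix ideas, here is a minimal simulation sketch of this model in Python: the buffer recursion (1) driven by i.i.d. arrivals and a Markov channel, under the simple admissible control $w_k = x_k \wedge K$. All numerical values ($M$, $K$, and the two-state channel chain) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
M, K = 5.0, 3.0                      # arrival support [0, M]; rate cap K
c_states = np.array([0.2, 1.0])      # channel states c(k) in [0, C]
P = np.array([[0.9, 0.1],            # transition kernel p(c, dc') of {c(k)}
              [0.2, 0.8]])

x, ic = 0.0, 0                       # x_0 >= 0; initial channel state index
for k in range(10):
    u = rng.uniform(0.0, M)          # u_k ~ zeta (here: uniform on [0, M])
    w = min(x, K)                    # admissible: w_k <= x_k and w_k <= K,
                                     # and w_k uses only past information
    x = x - w + u                    # buffer recursion (1)
    ic = rng.choice(2, p=P[ic])      # channel evolves as a Markov chain
    print(f"k={k:2d}  w_k={w:4.2f}  x_(k+1)={x:5.2f}  c={c_states[ic]:.1f}")
```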

The cost at time $k$ will be

$$h(x_k) + c(k)w_k - \lambda w_k. \tag{2}$$

Here, $h(\cdot)$ is a continuously differentiable convex increasing function and $\{c(k)\}$ is an ergodic Markov chain on $[0,C]$ for some $C > 0$. The first term penalizes high buffer occupancy and, therefore, mean delay. As an example, consider $h(\cdot)$ which takes small values for $0 \le x \le B - \delta$ and rapidly increases thereafter, where $B > 0$ is the intended buffer capacity and $\delta > 0$ is a small number. This is a "soft" constraint on buffer capacity that approximates the "hard" constraint $x_k \le B$.
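For concreteness, one hypothetical choice of such an $h(\cdot)$ (the parameters $B$, $\delta$, $\eta$, $\varepsilon$ below are illustrative, not from the paper) is the smoothed hinge

$$h(x) = \eta x + \frac{1}{\varepsilon}\Big(\big(x - (B-\delta)\big)^+\Big)^2, \qquad \eta, \varepsilon > 0 \text{ small},$$

which is continuously differentiable, convex, and increasing; it stays near zero on $[0, B - \delta]$ and grows rapidly thereafter.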
The second term in the cost is proportional to the number of packets transmitted in the kth time slot and captures the power cost. The random coefficient c(k) captures the dependence of this cost on the randomly varying channel conditions. Let p(x,dy) denote the transition kernel of the Markov chain {c(k)}.

The third term in (2) is the negative of the "reward" for successful transmission, with $\lambda > C$. The latter requirement ensures that at least for sufficiently high buffer occupancy, there will be an incentive to transmit. We will simplify (2) by combining the second and the third terms and replace (2) by

$$h(x_k) + c(k)w_k, \tag{3}$$

with $c(k)$ now taking values in $[-\lambda, C - \lambda]$, instead of $[0,C]$, and $p(x,dy)$ redefined accordingly. Note that $C - \lambda < 0$.

Let $\alpha \in (0,1]$ denote the "discount factor" and $N \ge 1$. Our objectives will be to minimize the finite horizon cost,

$$E\Bigg[\sum_{k=j}^{N-1} \alpha^{k-j}\big(h(x_k) + c(k)w_k\big)\Bigg] \tag{4}$$

for $j = 0$, or the infinite horizon discounted cost (for $\alpha \in (0,1)$),

$$E\Bigg[\sum_{k=0}^{\infty} \alpha^{k}\big(h(x_k) + c(k)w_k\big)\Bigg],$$

over the admissible controls $\{w_k\}$. We correspondingly define the "value functions"

$$J_j(x,c) \triangleq \inf E\Bigg[\sum_{k=j}^{N-1} \alpha^{k-j}\big(h(x_k) + c(k)w_k\big) \,\Big|\, x_j = x,\ c(j) = c\Bigg]$$

and

$$J(x,c) \triangleq \inf E\Bigg[\sum_{k=0}^{\infty} \alpha^{k}\big(h(x_k) + c(k)w_k\big) \,\Big|\, x_0 = x,\ c(0) = c\Bigg],$$

where the infima are over admissible controls.

2. THE FINITE HORIZON PROBLEM

Consider the cost (3). For this cost, the dynamic programming equation is

$$J_k(x,c) = \min_{0 \le w \le x \wedge K}\bigg[h(x) + cw + \alpha\int\!\!\int J_{k+1}(x - w + u, c')\,\zeta(du)\,p(c,dc')\bigg], \qquad J_N(\cdot,\cdot) \equiv 0. \tag{5}$$

Let $y \triangleq x - w$. Then the above can be rewritten as

$$J_k(x,c) = h(x) + cx + \min_{0 \vee (x-K) \le y \le x}\bigg[-cy + \alpha\int\!\!\int J_{k+1}(y + u, c')\,\zeta(du)\,p(c,dc')\bigg]. \tag{6}$$

We claim that the $J_k(\cdot,c)$'s are convex for $k < N$, a fact that will be proved later. Assuming this, we now derive the structural properties of the optimal policy. Let $S_k = S_k(c)$ be the least element of the connected set of unconstrained minimizers of

$$F_k(y,c) \triangleq -cy + \alpha\int\!\!\int J_{k+1}(y + u, c')\,\zeta(du)\,p(c,dc')$$

over $y \in [-\infty,\infty]$, with $+\infty$ and $-\infty$ being admissible values. Since $0 \vee (x - K) \le y \le x$, we have the following:

for $S_k \ge 0$, the minimizing $y$ in (6) is

$$y_k^* = \begin{cases} x, & x \le S_k,\\ S_k, & S_k < x \le S_k + K,\\ x - K, & x > S_k + K; \end{cases}$$

for $S_k < 0$,

$$y_k^* = \begin{cases} 0, & x \le K,\\ x - K, & x > K. \end{cases}$$

An optimal policy can therefore be determined by the sequence of extended real numbers $(S_0, S_1, \ldots, S_{N-1})$ and is of the form:

for $S_k \ge 0$,

$$\mu_k^*(x,c) = \begin{cases} 0, & x \le S_k,\\ x - S_k, & S_k < x \le S_k + K,\\ K, & x > S_k + K; \end{cases}$$

for $S_k < 0$,

$$\mu_k^*(x,c) = \begin{cases} x, & x \le K,\\ K, & x > K \end{cases} \;\; \big(= x \wedge K\big).$$
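The backward recursion (5)-(6) also gives a direct way to compute the thresholds numerically. The sketch below discretizes the buffer level and evaluates, for each channel state, the minimand $F_k(y,c)$ on a grid; its unconstrained minimizer is $S_k(c)$, and the constrained minimization recovers $J_k$. The two-state chain, uniform arrival law, and quadratic $h$ are illustrative assumptions, not from the paper; a minimizer at the left grid edge stands in for $S_k(c) \le 0$.

```python
import numpy as np

alpha, K, N = 0.9, 3.0, 20
xs = np.linspace(0.0, 30.0, 301)              # grid for x (and for y >= 0)
u_vals = np.linspace(0.0, 5.0, 50)            # support of zeta: [0, M], M = 5
us = np.full(u_vals.size, 1.0 / u_vals.size)  # quadrature weights for zeta
cs = np.array([-1.5, -0.5])                   # combined c(k) in [-lambda, C - lambda]
P = np.array([[0.9, 0.1], [0.2, 0.8]])        # transition matrix of {c(k)}
h = lambda x: 0.05 * x**2                     # convex increasing h

J = np.zeros((len(cs), xs.size))              # J_N = 0
for k in range(N - 1, -1, -1):
    EJ = P @ J                                # row ic: E[J_{k+1}(., c') | c]
    Jnew = np.empty_like(J)
    S = np.empty(len(cs))
    for ic, c in enumerate(cs):
        # F_k(y, c) on the y-grid; np.interp clamps past the grid edge
        # (a truncation error that is small if the grid is long enough)
        F = np.array([-c * y + alpha * us @ np.interp(y + u_vals, xs, EJ[ic])
                      for y in xs])
        S[ic] = xs[F.argmin()]                # unconstrained minimizer S_k(c)
        for ix, x in enumerate(xs):           # min over y in [0 v (x - K), x]
            feas = (xs >= max(0.0, x - K)) & (xs <= x)
            Jnew[ic, ix] = h(x) + c * x + F[feas].min()
    J = Jnew
print("thresholds S_0(c):", S)
```

One can check the convexity of the computed $J_k(\cdot,c)$ (Lemma 1 below) and the three-region form of the policy directly from the output.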

We now establish the convexity claim.

Lemma 1: $J_k(\cdot,c)$, $k < N$, are convex.

Proof: The proof is by backward induction. Set $J_N(\cdot,\cdot) \equiv 0$ in what follows. Consider the policy at $k = N - 1$. Note that $S_{N-1} = -\infty$, because with $J_N \equiv 0$ the minimand in (6) reduces to $-cy$, which is increasing in $y$ (recall $c < 0$), so its unconstrained minimum is at $-\infty$; thus,

$$\mu_{N-1}^*(x,c) = x \wedge K$$

and

$$J_{N-1}(x,c) = h(x) + c(x \wedge K).$$

Thus, since $c < 0$ and $x \mapsto x \wedge K$ is concave, we have that $J_{N-1}(\cdot,c)$ is convex. Now, suppose that $J_{k+1}(\cdot,c)$ is convex. Then we have the following:

for $S_k \ge 0$,

$$J_k(x,c) = h(x) + cx + \begin{cases} F_k(x,c), & 0 \le x \le S_k,\\ F_k(S_k,c), & S_k < x \le S_k + K,\\ F_k(x - K,c), & x > S_k + K; \end{cases}$$

for $S_k < 0$,

$$J_k(x,c) = h(x) + cx + \begin{cases} F_k(0,c), & 0 \le x \le K,\\ F_k(x - K,c), & x > K, \end{cases}$$

where $S_k = S_k(c)$ minimizes $F_k(\cdot,c)$ over $y \in [-\infty,\infty]$. $J_k(\cdot,c)$ is clearly continuous and separately convex on $[0,S_k]$, $[S_k, S_k + K]$, and $[S_k + K, \infty)$ when $S_k \ge 0$, and on $[0,K]$ and $[K,\infty)$ when $S_k < 0$. Taking the left derivative of $F_k(\cdot,c)$ at its minimum $S_k$ and using the property of its slope at $S_k$,

$$-c + \alpha\int\!\!\int J_{k+1}^-(S_k + u, c')\,\zeta(du)\,p(c,dc') \le 0,$$

where $J_i^-(\cdot,c)$ denotes the left derivative of $J_i(\cdot,c)$, $1 \le i \le N$. Thus,

$$\alpha\int\!\!\int J_{k+1}^-(S_k + u, c')\,\zeta(du)\,p(c,dc') \le c. \tag{7}$$

Similarly, taking the right derivative at $S_k$, we get

$$-c + \alpha\int\!\!\int J_{k+1}^+(S_k + u, c')\,\zeta(du)\,p(c,dc') \ge 0,$$

where $J_i^+(\cdot,c)$ denotes the right derivative of $J_i(\cdot,c)$, $1 \le i \le N$. This leads to

$$\alpha\int\!\!\int J_{k+1}^+(S_k + u, c')\,\zeta(du)\,p(c,dc') \ge c. \tag{8}$$

For $S_k < 0$, we can consider the right derivative at $y = 0$ to conclude

$$-c + \alpha\int\!\!\int J_{k+1}^+(u, c')\,\zeta(du)\,p(c,dc') \ge 0,$$

implying

$$\alpha\int\!\!\int J_{k+1}^+(u, c')\,\zeta(du)\,p(c,dc') \ge c. \tag{9}$$
From the expression for $J_k(x,c)$, we get the following: for $S_k \ge 0$,

$$J_k^\pm(x,c) = \begin{cases} h'(x) + \alpha\int\!\!\int J_{k+1}^\pm(x + u, c')\,\zeta(du)\,p(c,dc'), & 0 \le x \le S_k,\\ h'(x) + c, & S_k \le x \le S_k + K,\\ h'(x) + \alpha\int\!\!\int J_{k+1}^\pm(x - K + u, c')\,\zeta(du)\,p(c,dc'), & x \ge S_k + K; \end{cases}$$

for $S_k < 0$,

$$J_k^\pm(x,c) = \begin{cases} h'(x) + c, & 0 \le x \le K,\\ h'(x) + \alpha\int\!\!\int J_{k+1}^\pm(x - K + u, c')\,\zeta(du)\,p(c,dc'), & x \ge K. \end{cases}$$

Recall that $h$ is an increasing convex function. Hence, from (7) and (8) and the above expressions, we have, for $S_k \ge 0$,

$$J_k^-(S_k,c) = h'(S_k) + \alpha\int\!\!\int J_{k+1}^-(S_k + u, c')\,\zeta(du)\,p(c,dc') \le h'(S_k) + c = J_k^+(S_k,c)$$

and

$$J_k^-(S_k + K, c) = h'(S_k + K) + c \le h'(S_k + K) + \alpha\int\!\!\int J_{k+1}^+(S_k + u, c')\,\zeta(du)\,p(c,dc') = J_k^+(S_k + K, c).$$

Similarly, for $S_k < 0$, using (9),

$$J_k^-(K,c) = h'(K) + c \le h'(K) + \alpha\int\!\!\int J_{k+1}^+(u, c')\,\zeta(du)\,p(c,dc') = J_k^+(K,c).$$
This implies that $J_k(\cdot,c)$ is convex, which completes the induction step. ∎

We summarize the results in the following theorem.

Theorem 1: The optimal policy is given by $\{\mu_k^*(\cdot,\cdot)\}$.

This leaves the issue of the nature of the dependence of $S_k(c)$ on $c$. Unfortunately, not much can be said in general, but we make a few comments here that may help give some handle on this in specific cases. Note that $S_k(c)$ was the global minimizer of a function of the form $-cy + g(c,y)$ over $y \in [-\infty,\infty]$. For simplicity, suppose that it is unique and finite for each $c$. Consider the cross-derivative

$$\frac{\partial^2}{\partial c\,\partial y}\big(-cy + g(c,y)\big) = -1 + \frac{\partial^2 g}{\partial c\,\partial y}(c,y), \tag{13}$$

assumed to exist and be continuous. If this is greater than 0 (resp., less than 0), it implies "increasing differences" (resp., "decreasing differences") in the sense of [6, Chap. 10]. Note that

$$g(c,y) = \alpha\int\!\!\int J_{k+1}(y + u, c')\,\zeta(du)\,p(c,dc')$$

for the problem at hand. Using this and by mimicking the arguments of Theorem 10.7 of [6, p. 259], one then has that $S_k(c)$ is increasing (resp., decreasing) in $c$ in a neighborhood of a point where the right-hand side of (13) is $< 0$ (resp., $> 0$).

3. THE INFINITE HORIZON PROBLEM

We now analyze the infinite horizon discounted cost problem by treating it as a limiting case of the above as N → ∞. We will assume the following:

(*) There is at least one control policy under which the cost is finite for any initial condition.

As an example, consider the case when $K > \int u\,\zeta(du)$ (i.e., the maximum permissible transmission rate exceeds the mean arrival rate). Then, it is easy to verify that the foregoing will hold for the control choice $w_k = x_k \wedge K$, $k \ge 0$. Under (*), one has [5] the dynamic programming equation

$$J(x,c) = \min_{0 \le w \le x \wedge K}\bigg[h(x) + cw + \alpha\int\!\!\int J(x - w + u, c')\,\zeta(du)\,p(c,dc')\bigg]. \tag{14}$$
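As a quick numerical sanity check of (*), the following Monte Carlo sketch estimates the (truncated) discounted cost (3) under the policy $w_k = x_k \wedge K$; the estimates stay bounded across initial conditions when $K$ exceeds the mean arrival rate. All numbers are illustrative assumptions, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, K, M = 0.95, 3.0, 5.0                  # K = 3 > M/2 = E[u_k]
cs = np.array([-1.5, -0.5])                   # combined c(k) values
P = np.array([[0.9, 0.1], [0.2, 0.8]])        # channel transition matrix
h = lambda x: 0.05 * x**2                     # convex increasing h

def discounted_cost(x0, T=400):
    """Discounted cost along one trajectory under w = x ^ K, truncated at T."""
    x, ic, total = x0, 0, 0.0
    for k in range(T):
        w = min(x, K)
        total += alpha**k * (h(x) + cs[ic] * w)   # running cost (3)
        x = x - w + rng.uniform(0.0, M)           # buffer recursion (1)
        ic = rng.choice(2, p=P[ic])
    return total

for x0 in (0.0, 10.0, 50.0):
    est = np.mean([discounted_cost(x0) for _ in range(200)])
    print(f"x0 = {x0:5.1f}: estimated discounted cost = {est:8.2f}")
```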
In addition, we assume the following:

(**) The kernel $p(x,dy)$ is strong Feller; that is, for any bounded measurable $f: \mathbb{R} \to \mathbb{R}$, $\int f(y)\,p(x,dy)$ is continuous in $x$.

We will also assume that $h(\cdot)$ is Lipschitz continuous with Lipschitz constant $L > 0$. (This could be relaxed at the expense of greatly enhanced technicalities.) Let $J^N \triangleq J_0$ and $J_i^N \triangleq J_i$, so as to make the dependence on the time horizon $N$ explicit. Define also

$$\bar{J}_j^N(y,c) \triangleq \alpha\int\!\!\int J_{j+1}^N(y + u, c')\,\zeta(du)\,p(c,dc') \tag{15}$$

for $0 \le j < N$. Then, $\{J_j^N, \bar{J}_j^N\}$ satisfies the dynamic programming equation

$$J_j^N(x,c) = h(x) + cx + \min_{0 \vee (x-K) \le y \le x}\big[-cy + \bar{J}_j^N(y,c)\big], \qquad J_N^N(\cdot,\cdot) \equiv 0. \tag{16}$$

Clearly, $\{J^N\}$ and $\{\bar{J}_0^N\}$ are monotone increasing in $N$. Under (*), they are also pointwise bounded. Thus,

$$J_\infty \triangleq \lim_{N\to\infty} J^N, \qquad \bar{J}_\infty \triangleq \lim_{N\to\infty} \bar{J}_0^N$$

is well defined.

Lemma 2: $\{J_i^N(\cdot,c), \bar{J}_i^N(\cdot,c)\}$ satisfy the Lipschitz condition with a common Lipschitz constant $(L + \lambda)/(1 - \alpha)$.

Proof: We will prove that for each $i$, $J_i^N(\cdot,c)$ and $\bar{J}_i^N(\cdot,c)$ satisfy the Lipschitz condition with a common Lipschitz constant $(L + \lambda)\sum_{m=0}^{N-i-1}\alpha^m \le (L + \lambda)/(1 - \alpha)$, which, in turn, implies the lemma. This is trivially true for $i = N$. The claim for $J_i^N$, $i < N$, follows by a simple backward induction, using the fact that

$$\big|J_i^N(x,c) - J_i^N(x',c)\big| \le (L + \lambda)|x - x'| + \alpha \sup_{c'}\big\|J_{i+1}^N(\cdot,c')\big\|_{\mathrm{Lip}}\,|x - x'| \le (L + \lambda)\Bigg(\sum_{m=0}^{N-i-1}\alpha^m\Bigg)|x - x'|,$$

where $\|f\|_{\mathrm{Lip}}$ denotes the least Lipschitz constant of $f$ and the last inequality follows from the induction hypothesis. The claim for $\{\bar{J}_i^N\}$ is then immediate from the definition (15). ∎

Corollary 1: The convergence $J^N \uparrow J_\infty$ and $\bar{J}_0^N \uparrow \bar{J}_\infty$ is uniform on compacts. Furthermore, $(J_\infty, \bar{J}_\infty)$ satisfies

$$J_\infty(x,c) = h(x) + cx + \min_{0 \vee (x-K) \le y \le x}\big[-cy + \bar{J}_\infty(y,c)\big], \qquad \bar{J}_\infty(y,c) = \alpha\int\!\!\int J_\infty(y + u, c')\,\zeta(du)\,p(c,dc'). \tag{17}$$

Proof: As an increasing limit of continuous functions, $J_\infty$ is at least lower semicontinuous. (In fact, it is convex, being the pointwise limit of convex functions and hence continuous on the interior of its domain.) Lemma 2 implies that $\{J^N(\cdot,c)\}$, and hence $J_\infty(\cdot,c)$, will be Lipschitz with a common Lipschitz constant, as earlier. In particular, they are equicontinuous and, hence, for each fixed $c$, the convergence $J^N(\cdot,c) \to J_\infty(\cdot,c)$ is uniform on compacts. The monotone convergence theorem leads to

$$\lim_{N\to\infty} \alpha\int\!\!\int J^N(x - w + u, c')\,\zeta(du)\,p(c,dc') = \alpha\int\!\!\int J_\infty(x - w + u, c')\,\zeta(du)\,p(c,dc'),$$

where the convergence in the $x$ and $w$ variables is uniform on compacts for fixed $c$ in view of the foregoing remarks. In particular, this implies that the corresponding minima over $w \in [0, x \wedge K]$ converge. Thus, we can let $N \to \infty$ in (5) to conclude that $(J_\infty, \bar{J}_\infty)$ satisfies (17). By the uniform Lipschitz continuity of $J_\infty(\cdot,c)$ and (**), the expression in square brackets on the right-hand side of (17) is uniformly continuous w.r.t. $(x,c)$ on compacts. Thus, the claim follows from Dini's theorem. ∎

Corollary 2: $J = J_\infty$. It is convex in $x$, uniformly continuous in $(x,c)$, and the least $\mathbb{R}^+$-valued continuous solution to (14).

Proof: Uniform continuity of $J_\infty$ and the fact that it satisfies (17) are proved in Corollary 1. Furthermore, Corollary 1 also implies that

$$J_\infty(x,c) = \min_{0 \le w \le x \wedge K}\bigg[h(x) + cw + \alpha\int\!\!\int J_\infty(x - w + u, c')\,\zeta(du)\,p(c,dc')\bigg].$$

Then, $J$ is a uniformly continuous solution to (14), with $J = \lim_{N\to\infty} J^N$, the limit being uniform on compacts. Since the $J^N$'s are convex, so is $J$. Suppose $J' \ge 0$ satisfies (14). Then, the pair $(J', \bar{J}')$, with $\bar{J}'(y,c) \triangleq \alpha\int\!\!\int J'(y + u, c')\,\zeta(du)\,p(c,dc')$, satisfies (17). Iterating (17) $N$ times and using the nonnegativity of $J'$, we get $J' \ge J_0^N$. Letting $N \to \infty$, we have $J' \ge J_\infty$; hence, $J' \ge J$. This completes the proof. ∎

Let $S = S(c)$ be the least element of the connected set of unconstrained minimizers of

$$-cy + \alpha\int\!\!\int J(y + u, c')\,\zeta(du)\,p(c,dc')$$

over $y \in [-\infty,\infty]$. One argues as in the preceding section to conclude the following theorem.

Theorem 2: The optimal policy is given by the stationary policy $\{\mu^*(\cdot,\cdot)\}$, where, for $S(c) \ge 0$,

$$\mu^*(x,c) = \begin{cases} 0, & x \le S(c),\\ x - S(c), & S(c) < x \le S(c) + K,\\ K, & x > S(c) + K, \end{cases}$$

and, for $S(c) < 0$, $\mu^*(x,c) = x \wedge K$.
Regarding the dependence on c, remarks analogous to those at the end of the preceding section apply.
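Numerically, the stationary threshold $S(c)$ can be approximated by successive approximation: iterate the dynamic programming operator in (14) until the value function stabilizes, then read off the minimizer, exactly as in the finite horizon sketch. The data below are the same illustrative assumptions as before, not from the paper.

```python
import numpy as np

alpha, K = 0.9, 3.0
xs = np.linspace(0.0, 30.0, 301)              # buffer grid
u_vals = np.linspace(0.0, 5.0, 50)            # support of zeta
us = np.full(u_vals.size, 1.0 / u_vals.size)  # quadrature weights
cs = np.array([-1.5, -0.5])                   # combined c(k) values
P = np.array([[0.9, 0.1], [0.2, 0.8]])        # channel transition matrix
h = lambda x: 0.05 * x**2

J = np.zeros((len(cs), xs.size))              # J^0 = 0; J^N increases to J
S = np.zeros(len(cs))
for n in range(500):                          # value iteration for (14)
    EJ = P @ J
    Jnew = np.empty_like(J)
    for ic, c in enumerate(cs):
        F = np.array([-c * y + alpha * us @ np.interp(y + u_vals, xs, EJ[ic])
                      for y in xs])
        S[ic] = xs[F.argmin()]                # stationary threshold S(c)
        for ix, x in enumerate(xs):           # min over y in [0 v (x - K), x]
            feas = (xs >= max(0.0, x - K)) & (xs <= x)
            Jnew[ic, ix] = h(x) + c * x + F[feas].min()
    done = np.abs(Jnew - J).max() < 1e-6      # alpha-contraction on the grid
    J = Jnew
    if done:
        break
print("stationary thresholds S(c):", S)
```

One would expect a larger threshold in the more "expensive" channel state (smaller $|c|$), so that transmission is deferred until the buffer is high, in line with the comparative statics discussed at the end of Section 2.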

Acknowledgments

This work was done while one of the authors (G. R.) was visiting the School of Technology and Computer Science, Tata Institute of Fundamental Research during the summer of 2003.

Research by one of the authors (V. S. B.) was supported in part by grant 2900-IT-1 from the Centre Franco-Indien pour la Promotion de la Recherche Avancée (CEFIPRA).

REFERENCES

1. Berry, R.A. & Gallager, R.G. (2002). Communication over fading channels with delay constraints. IEEE Transactions on Information Theory 48(5): 1135–1149.
2. Bertsekas, D.P. (2001). Dynamic programming and optimal control, Vol. 1, 2nd ed. Belmont, MA: Athena Scientific.
3. Bertsekas, D.P. (2001). Dynamic programming and optimal control, Vol. 2, 2nd ed. Belmont, MA: Athena Scientific.
4. Goyal, M., Kumar, A., & Sharma, V. (2003). Power constrained and delay optimal policies for scheduling transmission over a fading channel. In Proceedings of IEEE INFOCOM 2003.
5. Hernández-Lerma, O. & Lasserre, J.-B. (1996). Discrete-time Markov control processes. New York: Springer-Verlag.
6. Sundaram, R.K. (1996). A first course in optimization theory. Cambridge: Cambridge University Press.