
ASYMPTOTIC BAYES ANALYSIS FOR THE FINITE-HORIZON ONE-ARMED-BANDIT PROBLEM

Published online by Cambridge University Press: 07 January 2003

Apostolos N. Burnetas
Affiliation:
Department of Operations, Weatherhead School of Management, Case Western Reserve University, Cleveland, OH 44106-7235, E-mail: atb4@po.cwru.edu
Michael N. Katehakis
Affiliation:
Department of Management Science and Information Systems, Rutgers Business School, Rutgers—The State University of New Jersey, Newark, NJ 07102, E-mail: mnk@rci.rutgers.edu

Abstract

The multiarmed-bandit problem is often taken as a basic model for the trade-off between the exploration and exploitation required for efficient optimization under uncertainty. In this article, we study the situation in which the unknown performance of a new bandit is to be evaluated and compared with that of a known one over a finite horizon. We assume that the bandits represent random variables with distributions from the one-parameter exponential family. When the objective is to maximize the Bayes expected sum of outcomes over a finite horizon, it is shown that optimal policies tend to simple limits when the length of the horizon is large.
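To make the setup concrete, the following is a minimal sketch of the finite-horizon Bayes one-armed bandit in the simplest exponential-family case: the unknown arm is Bernoulli with a Beta prior, and the known arm pays a fixed amount lam per period. The Bayes-optimal value is computed by backward induction over the remaining horizon. The Bernoulli/Beta specialization and all names here (one_armed_bandit_value, lam, horizon) are illustrative assumptions, not the paper's general construction.

```python
from functools import lru_cache

def one_armed_bandit_value(horizon, lam, a0=1, b0=1):
    """Bayes value of a one-armed bandit: a known arm paying `lam` per
    period versus an unknown Bernoulli arm with a Beta(a0, b0) prior.
    Computed by backward induction on the remaining horizon."""

    @lru_cache(maxsize=None)
    def V(n, a, b):
        # No periods remain: no further reward.
        if n == 0:
            return 0.0
        # Retiring to the known arm: in the one-armed bandit, once the
        # known arm is optimal it stays optimal, so retirement is worth
        # lam per remaining period.
        retire = lam * n
        # Pulling the unknown arm: posterior-mean reward now, then update
        # the Beta posterior on success (a+1) or failure (b+1).
        p = a / (a + b)
        explore = p * (1.0 + V(n - 1, a + 1, b)) + (1.0 - p) * V(n - 1, a, b + 1)
        return max(retire, explore)

    return V(horizon, a0, b0)

if __name__ == "__main__":
    # With a uniform prior (mean 0.5) and a known arm paying 0.6 per
    # period, a longer horizon makes initial exploration more valuable.
    for n in (1, 10, 100):
        print(n, one_armed_bandit_value(n, lam=0.6))
```

Under these assumptions, the exact recursion requires a table whose size grows with the horizon; the asymptotic analysis in the article concerns how the optimal policy simplifies as that horizon becomes large.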

Type
Research Article
Copyright
© 2003 Cambridge University Press