A multi-armed bandit approach to superquantile selection

Abstract

We study a resource allocation problem in an intelligence setting. The intelligence cycle is comprised of three phases: collection,processing, and analysis. Enhanced efficiency within the first two stages directly impacts the number and types of important items thatare considered by analysts, increasing the frequency of the most important documents that are reviewed. The dilemma here is that ananalyst needs to quickly determine which sources to investigate, in order to provide meaningful analysis to a request for information with, a concrete deadline. Initially, the value of each source is unknown; so, too, is the probabilistic nature of the value derived from each item. Generally, more sources and documents are available to be considered within a limited time frame than could be overanalyzed, compounding the complexity of this problem. Our goal is to efficiently find the source that produces the largest fraction of relevant items with respect to a request for information. By "efficiently," we mean that the analyst balances exploration versus exploitation of the different sources judiciously. As such, the theoretical framework for this problem is that of a multi-armed bandit, a classic iterative decision learning process. This thesis presents a new approach to identifying the optimal arm(s) of a multi-arm bandit with the largest or smallest quantile or superquantile risk, under a loss constraint. This problem is not only important in intelligence applications, but in marketing and finance. We extend the existing theoretical framework of dealing with quantiles to a novel situation with estimators of conditional expectations over an unknown quantile. Two sequential elimination algorithms are developed that select the most important source for a given constraint level, sampling from the arm(s) with the largest conditional expectation over a quantile.

Open PDF

Document Details

Document Type: Technical Report
Publication Date: Jun 01, 2017
Accession Number: AD1046403

Entities

People

Adam J. Hepworth

Organizations

Naval Postgraduate School

A multi-armed bandit approach to superquantile selection

Abstract

Document Details

Entities

People

Organizations

Tags

Communities of Interest

DTIC Thesaurus Topics

Readers