Multi-Agent Task Assignment in the Bandit Framework

Abstract

We consider a task assignment problem for a fleet of UAVs in a surveillance/search mission. We formulate the problem as a restless bandits problem with switching costs and discounted rewards: there are N sites to inspect, each one of them evolving as a Markov chain, with different transition probabilities if the site is inspected or not. The sites evolve independently of each other. The problem is known to be PSPACE-hard. We present a systematic method, inspired from the work of Bertsimas and Nino-Mora on restless bandits, for deriving a linear programming relaxation for such locally decomposable MDPs. The relaxation is computable in polynomial-time offline, provides a bound on the achievable performance, as well as an approximation of the cost-to-go which can be used online in conjunction with standard suboptimal stochastic control methods. In particular, the one-step look ahead policy based on this approximate cost-to-go reduces to computing the optimal value of a linear assignment problem of size N. We present numerical experiments, for which we assess the quality of the heuristics using the performance bound.

Open PDF

Document Details

Document Type: Technical Report
Publication Date: Dec 15, 2006
Accession Number: AD1004476

Entities

People

Jerome L. Ny
Munther Dahleh
Éric Féron

Organizations

Massachusetts Institute of Technology

Multi-Agent Task Assignment in the Bandit Framework

Abstract

Document Details

Entities

People

Organizations

Tags

Communities of Interest

DTIC Thesaurus Topics

Fields of Study

Readers