Military Stochastic Scheduling Treated As a 'Multi-Armed Bandit' Problem
Abstract
A Blue airborne force attacks a region defended by a single Red surface-to-air missile system (SAM). Red is uncertain about the Blues he faces, but is able to learn about them during the engagement. Red's objective is to develop a policy for shooting at the Blues to maximize the value of Blues shot down before he himself is destroyed. We show that index policies are optimal for Red in a range of scenarios and yield effective heuristics more generally. The quality of such index heuristics is confirmed in a computational study.
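As a rough illustration of what an index policy looks like in this setting, the sketch below gives each Blue target a single number computed from that target's own state, and Red always shoots at the live target with the largest number. Everything in it is an assumption made for illustration: the `BlueTarget` class, the Beta-style pseudo-counts used to model Red's learning, and the myopic index (value times estimated kill probability) are hypothetical and are not the indices derived in the report.

```python
from dataclasses import dataclass


@dataclass
class BlueTarget:
    """Hypothetical Blue target; fields are illustrative assumptions."""
    name: str
    value: float          # assumed worth to Red of shooting this Blue down
    hits: float = 1.0     # Beta-prior pseudo-count of successful shots
    misses: float = 1.0   # Beta-prior pseudo-count of failed shots
    alive: bool = True

    def kill_prob(self) -> float:
        # Posterior mean of the per-shot kill probability.
        return self.hits / (self.hits + self.misses)

    def index(self) -> float:
        # Myopic index: expected value gained from one shot at this target.
        return self.value * self.kill_prob()


def choose_target(targets):
    """Return the live target with the largest index, or None if none remain."""
    live = [t for t in targets if t.alive]
    return max(live, key=lambda t: t.index()) if live else None


def record_shot(target, killed: bool) -> None:
    """Update Red's belief about a target after observing one shot."""
    if killed:
        target.hits += 1
        target.alive = False
    else:
        target.misses += 1
```

The point of the sketch is the structure rather than the numbers: each index depends only on its own target, so Red's decision reduces to a comparison of per-target scores, and repeated misses lower a target's score until another target becomes preferable.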
Document Details
- Document Type: Technical Report
- Publication Date: Sep 01, 2001
- Accession Number: ADA395044
Entities
People
- Donald P. Gaver Jr.
- Kevin D. Glazebrook
- Patricia A. Jacobs
Organizations
- Naval Postgraduate School