Foundations of Sequential Learning
Abstract
This report summarizes the research conducted under FA8750-16-2-0173, which advanced the understanding of bandit algorithms and of exploration in Markov Decision Processes (MDPs). New algorithms and theory were developed for bandits with periodic payoff multipliers and for bandits whose arms have costs. Exploration and transfer learning algorithms were evaluated for MDPs.
Document Details
- Document Type: Technical Report
- Publication Date: Feb 01, 2018
- Accession Number: AD1047509
Entities
People
- Cynthia Rudin
- Kamesh Munagala
- Ronald Parr
Organizations
- Duke University