Tuning Hyperparameters without Grad Students: Scaling up Bandit Optimisation
Abstract
This thesis explores scalable methods for adaptive decision making under uncertainty in stateless environments, where an agent designs an experiment, observes the outcome, and plans subsequent experiments so as to achieve a desired goal. Typically, each experiment incurs a large computational or economic cost, so the number of experiments must be kept to a minimum. Many such problems fall under the bandit framework, where the outcome of each experiment can be viewed as a reward signal, and the goal is to find the design that maximises this reward.
Document Details
- Document Type: Technical Report
- Publication Date: Oct 01, 2018
- Accession Number: AD1167996
Entities
People
- Kirthevasan Kandasamy
Organizations
- Carnegie Mellon University