Tuning Hyperparameters without Grad Students: Scaling up Bandit Optimisation

Abstract

This thesis explores scalable methods for adaptive decision making under uncertainty in stateless environments, where an agent designs an experiment, observes the outcome, and plans subsequent experiments so as to achieve a desired goal. Typically, each experiment incurs a large computational or economic cost, so the number of experiments must be kept to a minimum. Many such problems fall under the bandit framework, where the outcome of each experiment can be viewed as a reward signal and the goal is to find the design that maximises this reward.
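
To make the design, observe, plan loop concrete, the following is a minimal sketch of a bandit strategy over a finite set of candidate designs. It uses UCB1, a standard textbook rule, and is not claimed to be the method developed in the thesis. The function name `ucb1`, the `reward_fn` callback, and the example scores are all hypothetical and chosen for illustration only.

```python
import math
import random


def ucb1(reward_fn, n_designs, budget):
    """Run UCB1 for `budget` experiments over `n_designs` candidate designs.

    reward_fn(i) runs one (possibly noisy) experiment with design i and
    returns its reward. Returns (best_index, counts, means).
    """
    if budget < n_designs:
        raise ValueError("budget must allow at least one experiment per design")
    counts = [0] * n_designs   # number of experiments run per design
    means = [0.0] * n_designs  # running mean reward per design
    for t in range(1, budget + 1):
        if t <= n_designs:
            # Initial phase: try every design once.
            arm = t - 1
        else:
            # Pick the design with the highest optimistic estimate:
            # empirical mean plus an exploration bonus that shrinks
            # as the design is tried more often.
            arm = max(
                range(n_designs),
                key=lambda i: means[i] + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = reward_fn(arm)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
    best = max(range(n_designs), key=lambda i: means[i])
    return best, counts, means


# Hypothetical example: three hyperparameter settings whose experiments
# succeed with different (unknown to the agent) probabilities.
rng = random.Random(0)
true_scores = [0.2, 0.5, 0.8]  # assumed values, for illustration only


def noisy_experiment(i):
    return 1.0 if rng.random() < true_scores[i] else 0.0


best, counts, means = ucb1(noisy_experiment, len(true_scores), budget=300)
print("recommended design:", best, "experiments per design:", counts)
```

The exploration bonus is what keeps the number of costly experiments on poor designs small: a design is revisited only while its optimistic estimate is still competitive with the current leader.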

Document Details

Document Type: Technical Report
Publication Date: Oct 01, 2018
Accession Number: AD1167996

Entities

People

Kirthevasan Kandasamy

Organizations

Carnegie Mellon University
