Tuning Hyperparameters without Grad Students: Scaling up Bandit Optimisation

Abstract

This thesis explores scalable methods for adaptive decision making under uncertainty in stateless environments, where an agent designs an experiment, observes the outcome, and plans subsequent experiments so as to achieve a desired goal. Typically, each experiment incurs a large computational or economic cost, so the number of experiments must be kept to a minimum. Many such problems fall under the bandit framework, where the outcome of each experiment can be viewed as a reward signal and the goal is to find the design that maximises this reward.
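The design, observe, and plan loop described above can be illustrated with a minimal sketch. It uses UCB1, a standard textbook bandit strategy chosen here only for illustration; the abstract does not say which algorithms the thesis develops. The function name `ucb1`, the `pull` callback, and the reward values are all hypothetical.

```python
import math
import random


def ucb1(pull, n_arms, budget):
    """Run UCB1 for `budget` experiments; return (best_arm, counts, means).

    `pull(arm)` is a hypothetical callback that runs one experiment with the
    given design and returns a reward in [0, 1].
    """
    if budget < n_arms:
        raise ValueError("budget must cover at least one pull per arm")
    counts = [0] * n_arms    # number of times each design was tried
    means = [0.0] * n_arms   # running mean reward of each design

    for t in range(1, budget + 1):
        if t <= n_arms:
            arm = t - 1      # try every design once before comparing
        else:
            # Plan: pick the design with the highest optimistic estimate
            # (empirical mean plus an exploration bonus).
            arm = max(
                range(n_arms),
                key=lambda i: means[i] + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = pull(arm)   # run the experiment and observe the outcome
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]

    # Recommend the design with the best observed mean reward.
    best_arm = max(range(n_arms), key=lambda i: means[i])
    return best_arm, counts, means


if __name__ == "__main__":
    rng = random.Random(0)
    true_means = [0.2, 0.5, 0.8]  # assumed values, unknown to the agent

    def noisy_pull(arm):
        # Bernoulli reward: 1.0 with probability true_means[arm], else 0.0.
        return 1.0 if rng.random() < true_means[arm] else 0.0

    best, counts, means = ucb1(noisy_pull, len(true_means), budget=500)
    print("recommended design:", best, "pull counts:", counts)
```

The `budget` argument reflects the constraint that each experiment is costly: the agent has a fixed number of pulls and spends most of them on the designs that look most promising.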

Document Details

  • Document Type: Technical Report
  • Publication Date: Oct 01, 2018
  • Accession Number: AD1167996

Entities

People

  • Kirthevasan Kandasamy

Organizations

  • Carnegie Mellon University

Tags

Communities of Interest

  • Autonomy
  • C4I
  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Artificial Intelligence
  • Artificial Intelligence Software
  • Bayesian Networks
  • Computational Fluid Dynamics
  • Computational Science
  • Computer Programming
  • Computers
  • Data Mining
  • Data Science
  • Databases
  • Information Processing
  • Information Science
  • Information Systems
  • Monte Carlo Method
  • Multiobjective Optimization
  • Network Science
  • Neural Networks
  • Surveys
  • Three Dimensional

Fields of Study

  • Computer science

Readers

  • Adaptive Control and Estimation with Uncertainty in Dynamic Systems.
  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments.
  • Neural Network Machine Learning.