No-Regret Algorithms for Structured Prediction Problems

Abstract

No-regret algorithms are a popular class of online learning rules. Unfortunately, most no-regret algorithms assume that the set Y of allowable hypotheses is small and discrete. Instead, the authors consider prediction problems where Y has internal structure: Y might be the set of strategies in a game like poker, the set of paths in a graph, or the set of configurations of a data structure like a rebalancing binary search tree; or Y might be a given convex set (the "online convex programming" problem), or, in general, an arbitrary bounded set. They derive a family of no-regret learning rules, called Lagrangian Hedging algorithms, to take advantage of this structure. Their algorithms are a direct generalization of known no-regret learning rules, like weighted majority and external-regret matching. In addition to proving regret bounds, they demonstrate one of their algorithms learning to play one-card poker.

Open PDF

Document Details

Document Type
Technical Report
Publication Date
Dec 21, 2005
Accession Number
ADA456058

Entities

People

  • Geoffrey J. Gordon

Organizations

  • Carnegie Mellon University

Tags

Communities of Interest

  • C4I
  • Human Systems

DTIC Thesaurus Topics

  • Algorithms
  • Computer Programming
  • Computer Science
  • Convex Programming
  • Convex Sets
  • Distance Learning
  • Equations
  • Extensive-Form Games
  • Guarantees
  • Hypotheses
  • Matrix Games
  • Optimization
  • Probability
  • Probability Distributions
  • Theorems
  • Trees (Data Structures)
  • Two Dimensional

Fields of Study

  • Computer science

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments.
  • Operations Research