Online Learning of Markovian Systems with Censored Poisson Arrivals

Abstract

This thesis addresses online optimization of discrete performance measures in Markovian models with incomplete information. We consider a setting where a physical realization of the model is obtained sequentially over a number of periods. The information gathered to date is used to run the model efficiently in future periods. The information is incomplete in two ways: (i) model parameters (the demand rates in our case) are initially unknown but can be estimated from the physical realizations; and (ii) the demands are censored when the system is in certain boundary states. The method of Sample Average Approximation is used to solve the optimization problem. More precisely, in each period, sample paths are generated from the distributions estimated to date, and the best model configuration is determined with respect to these sample paths. Sequential observation of the system's behavior allows information to be gathered, so that a more informed decision can be made in each future round. The method developed in this thesis can be applied in a variety of contexts where nothing is known about the system beforehand but its behavior can be observed, at least partially, in a sequential manner; one example is assigning assets to surveil remote geographical regions for illicit activity. The motivating setting of this work is the operation of a bike-sharing system with fixed-capacity stations, where an initial number of bikes must be set each day to minimize the number of unsatisfied customers.
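The per-period procedure described in the abstract can be illustrated with a minimal Python sketch for a single station. Everything in it is an assumption made for illustration and is not taken from the thesis: the function names, the single-station model, the cost (a customer is counted as unsatisfied when a rental finds no bike or a return finds no free dock), and the rate estimator (observed arrivals divided by the time during which arrivals were not censored).

```python
import random

def estimate_rate(observed_count, uncensored_time):
    """Assumed estimator: arrivals seen divided by the time they could be seen.
    Time spent in a censoring boundary state is excluded from the exposure."""
    return observed_count / uncensored_time if uncensored_time > 0 else 0.0

def sample_path(rent_rate, return_rate, horizon, rng):
    """One path of merged Poisson arrivals on [0, horizon).
    Each event is -1 (rental request) or +1 (return request)."""
    total = rent_rate + return_rate
    events = []
    if total <= 0:
        return events
    t = rng.expovariate(total)
    while t < horizon:
        # Superposition of two Poisson streams: label each event by its share.
        events.append(-1 if rng.random() < rent_rate / total else +1)
        t += rng.expovariate(total)
    return events

def simulate_unsatisfied(initial_bikes, capacity, events):
    """Count customers turned away along one path (empty or full station)."""
    bikes, lost = initial_bikes, 0
    for e in events:
        if e == -1:
            if bikes == 0:
                lost += 1      # rental request with no bike available
            else:
                bikes -= 1
        else:
            if bikes == capacity:
                lost += 1      # return request with no free dock
            else:
                bikes += 1
    return lost

def saa_best_initial(rent_rate, return_rate, capacity, horizon, n_paths, seed=0):
    """Sample Average Approximation: evaluate every initial level on the
    same set of sampled paths and return the level with the lowest average."""
    rng = random.Random(seed)
    paths = [sample_path(rent_rate, return_rate, horizon, rng)
             for _ in range(n_paths)]

    def avg_cost(x):
        return sum(simulate_unsatisfied(x, capacity, p) for p in paths) / n_paths

    return min(range(capacity + 1), key=avg_cost)

if __name__ == "__main__":
    # Hypothetical observations from earlier periods.
    rent_hat = estimate_rate(observed_count=30, uncensored_time=6.0)
    return_hat = estimate_rate(observed_count=12, uncensored_time=8.0)
    print(saa_best_initial(rent_hat, return_hat, capacity=10,
                           horizon=10.0, n_paths=200))
```

In an online loop, the rates would be re-estimated after each period from the newly observed (and possibly censored) arrivals, and `saa_best_initial` would be called again with the updated estimates. Reusing the same sampled paths for every candidate level keeps the comparison between levels consistent.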

Document Details

Document Type: Technical Report
Publication Date: Jun 01, 2021
Accession Number: AD1150967

Entities

People

Cedric G. Gibbons Mac-lean

Organizations

Naval Postgraduate School
