Average Reward Timed Games
Abstract
We consider real-time games where the goal consists, for each player, in maximizing the average reward he or she receives per time unit. We consider zero-sum rewards, so that a reward of + (gamma) to one player corresponds to a reward of - (gamma) to the other player. The games are played on discrete-time game structures which can be specified using a two-player version of timed automata whose locations are labeled by reward rates, Even though the rewards themselves are zero-sum, the games are not, due to the requirement that time must progress along a play of the game. Since we focus on control applications, we define the value of the game to a player to be the maximal average reward per time unit that the player can ensure. We show that, in general, the values to players 1 and 2 do not sum to zero. We provide algorithms for computing the value of the game for either player, the algorithms are based on the relationship between the original, infinite-round game, and a derived game that is played for only finitely many rounds. Memoryless optimal strategies exist for both players in both games,
Document Details
- Document Type
- Technical Report
- Publication Date
- Jan 01, 2005
- Accession Number
- ADA457215
Entities
People
- B. T. Adler
- Luca De Alfaro
- Marco Faella
Organizations
- University of California, Santa Cruz