Neuro-Symbolic Reinforcement Learning Under Perceptual Limitations
Abstract
Approved for Public Release

The objective of this proposal is to develop theory and algorithms for neuro-symbolic reinforcement learning for autonomous agents to distill complex perceptual data into verifiably safe strategies and to demonstrate, empirically and theoretically, the impact of the resulting algorithms on data efficiency, transfer, and generalization in learning. The effort will establish a novel bridge between deep reinforcement learning and automata learning while accounting for perceptual limitations. It is structured into three complementary thrusts:

Thrust I: Neuro-symbolic reinforcement learning. The first thrust will significantly extend a recent merger between reinforcement learning and automata-theoretic constructs to operate in infinite-state and infinite-action settings through function approximations based on deep neural networks.

Thrust II: Neuro-symbolic reinforcement learning under perceptual limitations. The second thrust focuses on accounting for perceptual limitations. We will develop principled methods to translate perceptual uncertainty into a Bayesian belief distribution over symbols that enter the neuro-symbolic representations of strategies. The resulting reinforcement learning algorithms will then account for the ambiguity in the underlying temporal logic specifications or automata that encode advice and are maximally compatible with the ambiguous interpretations of the agent's execution.

Thrust III: Transfer and generalization in neuro-symbolic reinforcement learning. By distilling complex perceptual inputs into underlying symbolic concepts, the proposed framework will allow reinforcement learning agents to transfer general-purpose symbolic knowledge as well as data-driven experience to novel environments.

The potential outcomes of this research will include algorithms, computational tools, and simulation environments that will help develop data-efficient, interpretable decision-making agents that exploit high-level logical specifications for fast learning in naval missions. We intend that the software tools and algorithms developed from this effort will broadly be applicable to many naval missions, including patrolling, navigation, and surveillance tasks that require coordination between naval assets amidst the uncertain intent of other agents.
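As a rough illustration of the idea in Thrust II, the sketch below turns a noisy perceptual reading into a Bayesian belief over symbols and then propagates that belief through a small task automaton. It is not the method proposed in the award: the automaton, the symbol set, the likelihood values, and all function names are hypothetical and chosen only to keep the example small.

```python
# Illustrative sketch only: every name and number below is an assumption,
# not something specified in the award abstract.

SYMBOLS = ("a", "b", "none")        # hypothetical symbols a detector may report
STATES = ("q0", "q1", "q2")         # hypothetical automaton states; q2 = task done
DELTA = {("q0", "a"): "q1",         # task: observe "a", then observe "b"
         ("q1", "b"): "q2"}


def step(state, symbol):
    """Deterministic automaton transition; unlisted pairs self-loop."""
    return DELTA.get((state, symbol), state)


def posterior_over_symbols(prior, likelihood):
    """Bayes' rule: posterior(s) is proportional to prior(s) * P(observation | s)."""
    unnormalized = {s: prior[s] * likelihood[s] for s in SYMBOLS}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the prior")
    return {s: v / total for s, v in unnormalized.items()}


def update_state_belief(belief, symbol_probs):
    """Push a belief over automaton states through one uncertain symbol."""
    new_belief = {q: 0.0 for q in STATES}
    for q, b_q in belief.items():
        for s, p_s in symbol_probs.items():
            new_belief[step(q, s)] += b_q * p_s
    return new_belief


if __name__ == "__main__":
    prior = {s: 1.0 / len(SYMBOLS) for s in SYMBOLS}
    # Hypothetical sensor likelihoods P(observation | true symbol).
    likelihood = {"a": 0.8, "b": 0.1, "none": 0.1}
    symbol_probs = posterior_over_symbols(prior, likelihood)
    belief = update_state_belief({"q0": 1.0, "q1": 0.0, "q2": 0.0}, symbol_probs)
    print(symbol_probs)
    print(belief)
```

A learning agent could, under these assumptions, condition its policy on the belief over automaton states rather than on a single assumed state, which is one simple way to let symbol ambiguity enter the strategy representation.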
Document Details
- Document Type: DoD Grant Award
- Publication Date: Apr 01, 2022
- Source ID: N000142212254
Entities
People
- Sandeep Chinchali
Organizations
- Office of Naval Research
- United States Navy
- University of Texas at Austin