A Systematic Study of Learning From Failure
Abstract
Developing robots that can learn from (limited) past experiences to gain high-level, human-like decision intelligence is essential to their application in complex environments, especially for many modern DoD systems whose desired decision cycles are inevitably short. One important limitation of existing robotic systems is the lack of a learning-from-failure capability, which is a unique feature of human high-level decision intelligence. One key challenge is the design of basic principles and algorithms for robots to learn from limited (yet valuable) failure experiences that are mostly ignored in current robotic decision-making strategies. Another key challenge is to address data efficiency and robustness so that non-expert users can guide robotic decision learning based on their (perhaps) limited domain knowledge. The goals of the project are to overcome these two challenges by developing a new learning-from-failure approach that provides theoretical and algorithmic solutions for reward and policy learning when the value of failure is explicitly harnessed, and then developing new data-efficient and noise-resilient reward and policy learning approaches. Specifically, the project will focus on four essential thrusts: (1) Multi-Class Reward Learning: learn reward functions from multiple classes of experiences, including failure, success, and others; (2) Failure-Guided Policy Learning: create new reinforcement learning algorithms that can learn control policies from failure; (3) Data Efficiency and Robustness/Sensitivity: study the value of data and the effect of data quantity/quality on control policy learning from failure; and (4) Testing and Evaluation: conduct case studies to verify and evaluate the proposed methods and algorithms in both simulated and real-world environments. The success of the project is expected to fulfill the needs of the U.S. Navy by providing human-like, intelligent, and robust robotic systems that can learn from both success and failure. The novelty of this project is the synthesis of approaches from artificial intelligence, computational learning, and decision/control sciences to create a new learning-from-failure approach that offers diversity, efficiency, robustness, and responsiveness.
Approved for Public Release
Document Details
- Document Type: DoD Grant Award
- Publication Date: Jul 08, 2022
- Source ID: N000142212474
Entities
People
- Yongcan Cao
Organizations
- Office of Naval Research
- United States Navy
- University of Texas at San Antonio