II.A.1.c.i(4) Intelligent Systems: Bootstrap Learning Using Inductive Logic Programming and Human Preferences

Abstract

Statement of Scientific Objectives: This proposal aims to develop intelligent autonomous agents that work in seamless synergy with humans to accomplish complex tasks in battlefields and cyber-physical systems. Reinforcement learning (RL) has become the most popular tool for teaching new skills to machines. However, RL frameworks inherently require large amounts of training data, so human-assisted RL has been used in the past to improve learning performance. Unfortunately, human-assisted machine learning, when combined with RL's need for rich reward functions, raises significant challenges. Chief among them is the conflict between increasing the richness of the reward function and minimizing the burden placed on the human who generates the rewards. This proposal explores an integrated solution paradigm that allows humans to assist machine-learning algorithms by substantially increasing the richness of the feedback and by incorporating human background knowledge and preferences into the learning, without severely burdening the human-in-the-loop. Specifically, it introduces a novel framework for Relational Reinforcement Learning (RRL) based on differentiable Inductive Logic Programming (ILP). This framework provides a relational representation of the environment in terms of predicate logic and learns actions and policies by exploiting those relations. The research also leverages intrinsic event-related potentials in EEG brain signals, which allow humans to assist learning algorithms without being burdened as the human-in-the-loop.

Intellectual Merits: The research provides a natural platform for assisting machine learning algorithms, without burdening the human-in-the-loop, through direct integration of background knowledge, human preferences, and inductive biases into reinforcement learning models.
The research will introduce a framework for Relational Reinforcement Learning (RRL) using our recent work on differentiable ILP. Unlike traditional RL models, in which states are extracted from features of raw images, our ILP-based RRL constructs a relational representation for describing states and learns policies by inducing logical relationships from the environment and the agent's actions. In addition, we propose to use EEG as intrinsically generated human feedback for training in RRL, without relying on rich predefined reward functions designated by humans. We plan to leverage various EEG action-potentials, which represent natural human reactions, as implicit feedback modalities in the learning framework to make machines more intelligent. The proposed research outcomes are expected to be significant in five areas:
(i) For training reinforcement learning algorithms that lack a rich reward function, learning from human preferences in the form of implicit EEG-based feedback is expected to accelerate the training phase of the autonomous agent. In other words, if the environment's reward functions are non-existent, sparse, or difficult to define and generate, tapping into the human observer's intrinsic brain waves provides easy access to a reward function.
(ii) If the training requires human input anyway (e.g., the algorithm is required to perform preference-driven exploration), the ILP-based RRL framework can encode such requirements in first-order logic (i.e., predicate language).
(iii) Inductive logic programming provides a natural mechanism for directly incorporating expert prior knowledge, in the form of background rules, into relational reinforcement learning, thereby shortening training time and reducing the number of data samples required.
(iv) Since our ILP method is differentiable, unlike classical RRL, the proposed ILP-based RRL framework can be trained using end-to-end optimization.
(v) The learned policy in our ILP-based RRL is human-interpretable and hence can be viewed, verified, and modified by an expert observer.
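
To make the ideas of a relational state, expert background rules, and an interpretable policy more concrete, the following is a minimal illustrative sketch in Python. It is not the differentiable ILP method of the proposal: the blocks-world state, the predicate names (`on`, `clear`, `above`, `move`), and both rules are hypothetical and hand-written, and nothing here is learned. It only shows what a state expressed as ground atoms, a background rule applied by forward chaining, and a human-readable policy rule might look like.

```python
from itertools import product

# Hypothetical blocks-world state as ground atoms: (predicate, arg1, ...).
state = {
    ("on", "a", "b"),
    ("on", "b", "c"),
    ("on", "c", "floor"),
    ("clear", "a"),
}


def derive_above(facts):
    """Apply assumed expert background knowledge, written as Horn clauses:
        above(X, Y) :- on(X, Y).
        above(X, Y) :- on(X, Z), above(Z, Y).
    Forward chaining repeats until no new atom can be derived."""
    on = {(f[1], f[2]) for f in facts if f[0] == "on"}
    above = set(on)
    changed = True
    while changed:
        changed = False
        # Snapshot `above` so it can be extended inside the loop.
        for (x, z), (z2, y) in product(on, tuple(above)):
            if z == z2 and (x, y) not in above:
                above.add((x, y))
                changed = True
    return {("above", x, y) for x, y in above}


def policy(facts):
    """A hand-written rule of the kind an ILP learner might induce:
        move(X, floor) :- clear(X), on(X, Y), Y != floor.
    Because it is a logical clause, an expert can read and edit it."""
    clear = {f[1] for f in facts if f[0] == "clear"}
    on = [(f[1], f[2]) for f in facts if f[0] == "on"]
    return sorted(
        ("move", x, "floor") for x, y in on if x in clear and y != "floor"
    )


derived = derive_above(state)
actions = policy(state | derived)
print(sorted(derived))
print(actions)
```

In this sketch the background rule adds `above` atoms that were never observed directly, and the policy is a single clause over the relational state rather than a function of raw image features. In the proposed framework such clauses would be induced by differentiable ILP rather than written by hand.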

Document Details

Document Type: DoD Grant Award
Publication Date: Jul 28, 2023
Source ID: W911NF2310146

Entities

People

Faramarz Fekri

Organizations

Army Contracting Command
Georgia Tech Research Corporation
United States Army
