Neuro-Symbolic Compositional Generalization for Language and Vision Comprehension and Grounding

Abstract

We develop a neuro-symbolic framework for imparting explicit reasoning and compositional learning to large vision and language models (VLMs). We propose a principled, integrated approach that adds compositional reasoning capabilities, including spatial, temporal, and part-whole reasoning, to neural representations by incorporating symbolic reasoning layers into very large transformer-based architectures and by using interactive language grounding. We address compositional generalization in a way inspired by human cross-situational learning of basic concepts and their compositions, and we study the formal and functional properties of concept composition. Through neuro-symbolic modeling, we exploit the implicit world knowledge conveyed by current very large transformer-based architectures and equip them with explicit, symbolic world knowledge to improve their generalization and reasoning. Moreover, we propose an interactive setting between human and agent to address compositional grounding. We evaluate our technical contributions on three tasks: interactive instruction-following agents in realistic environments, open-domain visual question answering, and knowledge-based visual question answering. We will extend DomiKnowS, our existing framework for integrating knowledge into statistical and deep learning, with the techniques developed in this proposed research.

Approved for Public Release

Document Details

Document Type: DoD Grant Award
Publication Date: May 15, 2023
Source ID: N000142312417

Entities

People

Parisa Kordjamshidi

Organizations

Michigan State University
Office of Naval Research
United States Navy
