Knowledge-Guided Scene Graph Generation and Reasoning for Visual Understanding

Abstract

Robust and comprehensive visual scene understanding could facilitate complex outdoor missions, including military operations, by interpreting the attributes, contents, compositions, and object interactions of outdoor natural scenes. Enhanced scene understanding will also benefit numerous intelligent systems in daily life. The growing deployment of visual intelligent systems calls for more effective and transformative solutions to visual scene understanding. Scene graph generation methods have achieved promising performance on visual understanding. Existing work, however, captures only limited information about visual scenes and rarely produces semantically rich scene graphs. The gap between scene-specific visual cues and commonsense knowledge limits the applicability and generalizability of scene-graph-based methods, especially in open domains and changing environments. Meanwhile, well-established commonsense knowledge graphs encode how the world is structured and how general concepts interact, and online encyclopedias such as Wikipedia provide comprehensive textual descriptions of real-world entities. These resources motivate transferring knowledge from external sources to scene graphs. Generating semantically rich scene graphs with the help of external knowledge sources has not been sufficiently studied. In this project, we propose a novel framework for knowledge-guided visual scene understanding that generates, enriches, and reasons over scene graphs with the help of external knowledge. The technical merit of the proposed framework is that it exploits external commonsense knowledge to assist scene understanding in open visual domains or changing environments.
The proposed framework approaches knowledge-guided scene understanding from three perspectives: scene graph enrichment with external knowledge, scene graph reasoning with novel categories in open domains, and continual scene graph learning in dynamic environments. These new methodologies will contribute to robust and comprehensive understanding of visual scenes and address challenges such as the lack of commonsense knowledge and imbalanced data distributions. The proposed research will benefit the broad research and education community for visual intelligent systems and public safety. The knowledge-guided scene graph generation and reasoning approaches proposed in this project will not only improve the accuracy of scene understanding but also reduce human effort in various surveillance environments. Consequently, the proposed techniques will help prevent potential real-world threats. Moreover, they could be broadly applied and deployed in visual intelligent systems and autonomous agents, such as video surveillance systems in transportation, smart camera systems, and drones on battlefields. The outreach will strengthen young scholars' interest in science and engineering careers and cultivate the desire to pursue higher-level education and research. The success of this research will yield new understanding of security- and defense-oriented visual understanding in open and dynamic environments and contribute in a timely manner to the accomplishment of the Army's mission.

Document Details

Document Type
DoD Grant Award
Publication Date
Jun 25, 2021
Source ID
W911NF2110109

Entities

People

  • Sheng Li

Organizations

  • Army Contracting Command
  • The University of Georgia
  • United States Army

Tags

Fields of Study

  • Computer science

Readers

  • Computer Vision
  • Distributed Systems and Data Platform Development

Technology Areas

  • Autonomy