Visual Question Answering (VQA)

Abstract

Statement of Work: Develop underlying capabilities for building a semantic-based visual question-answering system that can communicate with humans in natural language.

Objective: The goal is to enable machines to understand semantic content in images and to communicate this understanding as effectively as humans do via natural language.

Approach: The PI proposes to address the problem of Visual Question Answering (VQA). Given an image and a free-form, natural language question about the image, the task is to automatically produce a concise, accurate, free-form, natural language answer. This research is expected to generate new datasets, knowledge, and techniques in pure computer vision, in integrating vision and language, in developing visual common sense, and in interpretable models. Contributions are also expected in training the machine to be curious and actively ask questions in order to learn, and in training the machine to know what it knows and what it does not. Deep learning is a key approach in this proposal. The PI will build on her pioneering work in developing universal attributes and relative attributes. Another innovative aspect of the proposed research is using drawings and sketches to train the system to recognize subtle differences between similar concepts.

Overall Merit and ONR Mission/Relevance: This research addresses ONR's Information Dominance focus area, as well as its Autonomy and Unmanned Systems focus area. This work is expected to advance visual question-answering systems for use by intelligence analysts, as well as enhanced image interpretation capabilities for autonomous agents. This research is expected to develop novel approaches toward building sophisticated semantic-based visual question-answering systems.

Document Details

Document Type
DoD Grant Award
Publication Date
Aug 12, 2016
Source ID
N000141612647

Entities

People

  • Devi Parikh

Organizations

  • Office of Naval Research
  • United States Navy
  • Virginia Tech

Tags

Fields of Study

  • Computer science

Readers

  • Artificial Intelligence
  • Computational Linguistics
  • Distributed Systems and Data Platform Development

Technology Areas

  • AI & ML
  • AI & ML - DoD AI Strategy
  • AI & ML - Information Retrieval
  • Autonomy