Visual Question Answering (VQA)

Abstract

Statement of Work: Develop underlying capabilities for building a semantic-based visual question-answering system that can communicate with humans in natural language.

Objective: The goal is to enable machines to understand semantic content in images and to communicate this understanding as effectively as humans via natural language.

Approach: The PI proposes to address the problem of Visual Question Answering (VQA). Given an image and a free-form, natural language question about the image, the task is to automatically produce a concise, accurate, free-form, natural language answer. This research is expected to generate new datasets, knowledge, and techniques in pure computer vision, in integrating vision and language, in developing visual common sense, and in interpretable models. Contributions are also expected in training the machine to be curious and actively ask questions to learn, and in training the machine to know what it knows and what it does not. Deep learning is a key approach in this proposal. The PI will build on her pioneering work in developing universal attributes and relative attributes. Another innovative aspect of the proposed research is using drawings and sketches to train the system to recognize subtle differences between similar concepts.

Overall Merit and ONR Mission/Relevance: This research addresses ONR's Information Dominance focus area, as well as the Autonomy and Unmanned Systems focus area. This work is expected to advance visual question-answering systems for use by intelligence analysts, as well as enhanced image interpretation capabilities for autonomous agents. This research is expected to develop novel approaches toward building sophisticated semantic-based visual question answering systems.

Document Details

Document Type: DoD Grant Award
Publication Date: Aug 12, 2016
Source ID: N000141612647

Entities

People

Devi Parikh

Organizations

Office of Naval Research
United States Navy
Virginia Tech
