Deep Models of Compositionality and Context

Abstract

The goal of the DARPA Communicating with Computers program is to advance the state of the art in text and video analytics to the extent that the machine and the human operator share the same mental model. This requires that the machine be able to understand human intent, and that it can explain back to the human in ways that make use of context and prior knowledge. Three challenge problems were chosen: Block World, Bio Curation, and Story Generation. Furthermore, an Offeror could choose to address five different task areas: TA1 - Apparatus for the Block World, TA2 - Elementary Composable Ideas, TA3 - Composition of Ideas that Takes Context into Account, TA4 - What to Say or Do, and TA5 - Evaluation. The Offeror's task is to address TA2, TA3, and TA4. The three components of the proposed research are: (i) inducing and grounding ECIs; (ii) using deep learning for compositionality and context; and (iii) learning via interaction. The Offeror's view is that Elementary Composable Ideas (ECIs) are the basic building blocks of a human-computer exchange. ECIs are not directly observed in natural settings; what is observed are low-level signals, mainly perception and actions from TA1. The Offeror therefore proposes to develop methods that infer ECIs from input provided by TA1 partners and ground them to the world in a way that generalizes. To compose elementary ideas, the Offeror will use deep learning (fitting expressive neural networks to large amounts of data), which significantly reduces the need for feature engineering. These techniques have had major successes in areas such as object recognition and speech recognition and, more recently, NLP. While deep learning might seem counter to logic-based methods for precise natural language understanding, the Offeror will show not only that the two can be integrated, but that deep learning's strength in modeling context lets it serve a central representational role in both semantics and discourse.
For TA4, the Offeror proposes to integrate models of non-verbal communication (gestures and spoken prosody extracted from video) to improve recognition of human communicative acts and affect. The Offeror will also integrate deep learning and knowledge-based methods to build causal models that help explain human dialogue moves, that is, why the human interlocutor said what they said, which is especially important for collaborative composition. The Offeror will also use human-computer interactions to learn from humans, creating a system that can assess its own confidence and proactively ask the human for clarification. Finally, the Offeror's work will be evaluated in the context of the entire program.

Document Details

Document Type
DoD Grant Award
Publication Date
Jan 12, 2017
Source ID
W911NF1510462

Entities

People

  • Percy Liang

Organizations

  • Army Contracting Command
  • Defense Advanced Research Projects Agency
  • Stanford University

Tags

Fields of Study

  • Computer science

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments.
  • Artificial Intelligence
  • Neural Network Machine Learning.

Technology Areas

  • AI & ML
  • AI & ML - Machine Translation
  • AI & ML - Neural Networks