Deep Learning of Passage Structure for Scalable Semantic Discovery

Abstract

This effort will take advantage of recent advances in deep learning and related technologies that place large natural language datasets into a lower-dimensional, shared semantic vector space, which we call an embedding space. Substantial prior work has investigated word- and phrase-based embedding spaces. There is a notable gap, however, in semantic coverage at the passage and document levels. This gap is impeding semantic search capabilities for large text corpora. Our first objective is to develop a technique to represent the content of a very large natural language corpus by a lower-dimensional, hash-like index, using deep learning models to extract semantically rich embeddings that represent the rhetorical structure of paragraphs. Second, we aim to create embeddings that are both effective for very large-scale, near-real-time discovery and rich enough to allow a variety of composable semantic queries that apply across a mix of data scales. This work has direct relevance to many Navy objectives, as the Navy has similar needs to organize and understand large sets of textual information. This work will produce (1) an accurate semantic representation of the rhetorical content of complex unstructured text in a lower-dimensional space, (2) hierarchical representations of the content of complex unstructured text in a semantic space that can be shared with, for example, imagery data, (3) a flexible, powerful semantic query capability for discovery within the semantic space, and (4) fast, automated labeling and grouping of complex, unstructured text.

Document Details

Document Type
DoD Grant Award
Publication Date
Aug 12, 2016
Source ID
N000141512386

Entities

People

  • Matthew S. Gerber

Organizations

  • Office of Naval Research
  • United States Navy
  • University of Virginia

Tags

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Distributed Systems and Data Platform Development

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval
  • AI & ML - Machine Learning Algorithms
  • Space