Multimodal Decompositional Semantics

Abstract

Johns Hopkins University, partnering with the University of Rochester, pursued research and development of analytics in support of a larger framework for knowledge-driven hypothesis testing. We leveraged our expertise in dataset creation for decompositional semantics to develop new datasets geared towards the extraction problems of DARPA's Active Interpretation of Disparate Alternatives (AIDA) program, specifically event extraction and coreference resolution. Notable results from our team include: the construction of RAMS, the first publicly available multi-sentence event extraction dataset; the development of state-of-the-art multilingual coreference models, including an online variant that handles long documents with a fixed amount of memory, as well as a new multilingual dataset focused on multi-person dialogues; a new supervised approach to cross-lingual alignment, supporting the automatic creation of training data by projecting from English to less-resourced languages; a framework for sentence-level paraphrasing and data augmentation; collaborations on the emerging science of probing neural language models; and the development of new decompositional resources and analyses across a number of new linguistic dimensions. In the initial phase of the program, we provided analytic outputs as part of the program-wide evaluation run by NIST, focusing on multilingual text and speech. In the second phase, we provided fewer components, focusing exclusively on text. In the third phase, our focus was on data annotation under a newly proposed claim frame task, which exercised our background in crowd-sourcing rich linguistic annotations.

Document Details

Document Type
Technical Report
Publication Date
Mar 08, 2023
Accession Number
AD1195012

Entities

People

  • Aaron White
  • Benjamin Van Durme
  • Kyle Rawlins

Organizations

  • Johns Hopkins University

Tags

Communities of Interest

  • Cyber
  • Energy and Power Technologies
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Air Force
  • Air Force Research Laboratories
  • Artificial Intelligence
  • Artificial Intelligence Software
  • Automatic
  • Classification
  • Coders
  • Computational Linguistics
  • Computational Science
  • Construction
  • Decoding
  • Extraction
  • Generative Models
  • Identification
  • Language
  • Linguistics
  • Machine Translation
  • Models
  • Natural Language Processing
  • Natural Languages
  • Ontologies
  • Semantics
  • Standards

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Distributed Systems and Data Platform Development