REFERENTIAL GROUNDING IN MULTIMODAL MACHINE TRANSLATION

Abstract

This project focuses on multimodal machine translation, a recent field of research in which models draw on other modalities, such as images, video, or acoustic information, in addition to the textual context. The motivation is that these modalities can provide richer context that helps ground the meaning of the text and, as a consequence, leads to more adequate translations. We propose a new approach to multimodal machine translation in which the correspondences between image regions and source (and/or target) words are better defined and can then be used for translation. We refer to this as referential grounding, since the grounding is done at the level of image regions (e.g. objects or scenes), which can potentially help resolve ambiguities.

Document Details

Document Type
DoD Grant Award
Publication Date
Aug 11, 2021
Source ID
FA86552017006

Entities

People

  • Lucia Specia

Organizations

  • Air Force Office of Scientific Research
  • Imperial College London
  • United States Air Force

Tags

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments
  • Computational Linguistics
  • Computer Vision

Technology Areas

  • AI & ML
  • AI & ML - Machine Translation