REFERENTIAL GROUNDING IN MULTIMODAL MACHINE TRANSLATION
Abstract
This project focuses on multimodal machine translation, a recent field of research in which models leverage other modalities, such as images, videos or acoustic information, in addition to textual context. The motivation is that these modalities provide richer context, which helps ground the meaning of the text and, as a consequence, leads to more adequate translations. We propose a new approach to multimodal machine translation in which the correspondences between image regions and source (and/or target) words are better defined and can then be used for translation. We refer to this as referential grounding, since the grounding is done at the level of image regions (e.g. objects or scenes), which can potentially help resolve ambiguities.
Document Details
- Document Type: DoD Grant Award
- Publication Date: Aug 11, 2021
- Source ID: FA86552017006
Entities
People
- Lucia Specia
Organizations
- Air Force Office of Scientific Research
- Imperial College London
- United States Air Force