The Linguistic-Core Approach to Structured Translation and Analysis of Low-Resource Languages

Abstract

The Linguistic Core MURI project focused on machine translation (MT) and textual analysis (TA) engines for low-resource languages. We produced systems that can be trained with less data by using knowledge-rich linguistic priors, linguistic corpus annotation, monolingual corpora, techniques for cross-lingual training of NLP systems, and compact representations that allow for generalization over small amounts of data. Our research activities ranged from data collection and annotation to the design, development, and evaluation of algorithms and models for text analysis and machine translation. Our work addressed three focus languages from Africa (Kinyarwanda, Malagasy, and Swahili), but we also piloted many techniques on a variety of other languages. This report covers work done during the five years of the project and the sixth-year extension.

Document Details

Document Type: DoD Grant Award
Publication Date: Jun 25, 2021
Source ID: W911NF1010533

Entities

People

Jaime Carbonell

Organizations

Army Contracting Command
Carnegie Mellon University
United States Army
