The Linguistic Core Approach to Structured Translation and Analysis of Low Resource Languages

Abstract

The Linguistic Core MURI project focused on machine translation (MT) and textual analysis (TA) engines for low resource languages. We produced systems that can be trained with less data by using knowledge-rich linguistic priors, linguistic corpus annotation, monolingual corpora, techniques for cross-lingual training of NLP systems, and compact representations that allow for generalization over small amounts of data. Our research activities ranged from data collection and annotation to the design, development, and evaluation of algorithms and models for text analysis and machine translation. Our work addressed three focus languages from Africa (Kinyarwanda, Malagasy, and Swahili), but we also piloted many techniques on a variety of other languages. This report covers work done during the five years of the project and the sixth-year extension.

Document Details

Document Type: Technical Report
Publication Date: Sep 02, 2017
Accession Number: AD1051063

Entities

Organizations

Carnegie Mellon University
