A Statistical Word-Level Translation Model for Comparable Corpora

Abstract

In this paper, we present a statistical model for word-level mapping in comparable corpora. The approach rests on the assumption that if two terms have close distributional profiles, the distributional profiles of their translations should also be close in a comparable corpus. We describe the proposed model and lay out a preliminary investigation on intralanguage comparable corpora. The preliminary results are more than 92% accurate, which suggests that the model is feasible. The model still needs improvement and should be tested cross-linguistically before its significance can be assessed.

Document Details

Document Type: Technical Report
Publication Date: Jun 01, 2000
Accession Number: ADA455144

Entities

People

Mona Diab
Steve Finch

Organizations

University of Maryland
