A Survey of Statistical Machine Translation

Abstract

Statistical machine translation (SMT) treats the translation of natural language as a machine learning problem. By examining many samples of human-produced translation, SMT algorithms automatically learn how to translate. SMT has made tremendous strides in less than two decades, and many popular techniques have only emerged within the last few years. This survey presents a tutorial overview of state-of-the-art SMT at the beginning of 2007. We begin with the context of the current research, and then move to a formal problem description and an overview of the four main subproblems: translational equivalence modeling, mathematical modeling, parameter estimation, and decoding. Along the way, we present a taxonomy of some different approaches within these areas. We conclude with an overview of evaluation and notes on future directions.

Document Details

Document Type: Technical Report
Publication Date: Apr 01, 2007
Accession Number: ADA466330

Entities

People

  • Adam Lopez

Organizations

  • University of Maryland

Tags

Communities of Interest

  • Autonomy
  • C4I
  • Ground and Sea Platforms

DTIC Thesaurus Topics

  • Artificial Intelligence
  • Artificial Intelligence Software
  • Automata Theory
  • Automated Speech Recognition
  • Cognitive Science
  • Computational Linguistics
  • Computational Science
  • Computer Languages
  • Grammars
  • Hidden Markov Models
  • Language
  • Linguistics
  • Machine Translation
  • Mathematical Models
  • Natural Language Processing
  • Probabilistic Models
  • Probability

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Neural Network Machine Learning
  • Systems Analysis and Design

Technology Areas

  • AI & ML
  • AI & ML - Machine Translation
  • AI & ML - Neural Networks