Multilingual Automatic Document Classification, Analysis and Translation (MADCAT)

Abstract

(U) The Multilingual Automatic Document Classification, Analysis and Translation (MADCAT) program will develop and integrate technology to enable exploitation of captured, foreign-language, hard-copy documents. This technology is crucial to the warfighter because hard-copy documents (including notebooks, letters, ledgers, annotated maps, newspapers, newsletters, leaflets, and pictures of graffiti) and document images (e.g., PDF files, JPEG files, and scanned TIFF images) resident on magnetic and optical media captured in the field may contain important but perishable information. Unfortunately, due to limited human resources and the immature state of applicable technology, the Services lack the ability to exploit, in a timely fashion, ideographic and script documents that are either machine printed or handwritten in Arabic. The MADCAT program will address this need by producing devices that convert such captured documents to readable English in the field. MADCAT will substantially improve the applicable technologies, in particular document analysis and optical character recognition/optical handwriting recognition (OCR/OHR). MADCAT will then tightly integrate these improved technologies with translation technology and create demonstration prototypes for field trials.

Document Details

Document Type: Accomplishment
Publication Date: Oct 01, 2011
Source ID: 51e1a045fb93fd4f0db5a99fb4c9f178

Tags

Readers

  • Computer Vision
  • Database Systems and Applications
  • Library and Information Science
