Multilingual Automatic Document Classification, Analysis and Translation (MADCAT)
Abstract
(U) The Multilingual Automatic Document Classification, Analysis and Translation (MADCAT) program will develop and integrate technology to enable exploitation of captured, foreign-language, hard-copy documents. This technology is crucial to the warfighter because material captured in the field may contain important but perishable information. Such material includes hard-copy documents (notebooks, letters, ledgers, annotated maps, newspapers, newsletters, leaflets, and pictures of graffiti) and document images (e.g., PDF files, JPEG files, and scanned TIFF images) resident on magnetic and optical media. Unfortunately, due to limited human resources and the immature state of applicable technology, the Services lack the ability to exploit, in a timely fashion, ideographic and script documents that are either machine printed or handwritten in Arabic. The MADCAT program will address this need by producing devices that convert such captured documents to readable English in the field. MADCAT will substantially improve the applicable technologies, in particular document analysis and optical character recognition/optical handwriting recognition (OCR/OHR). MADCAT will then tightly integrate these improved technologies with translation technology and create demonstration prototypes for field trials.
Document Details
- Document Type: Accomplishment
- Publication Date: Oct 01, 2011
- Source ID: 51e1a045fb93fd4f0db5a99fb4c9f178