Centering Resonance Analysis: A Superior Data Mining Algorithm for Textual Data Streams
Abstract
Report developed under STTR contract for topic AF03T011 of STTR Program Solicitation FY 2003. Current knowledge-based intelligence systems do not perform well with streaming media because of performance shortcomings and an inability to work in storage-constrained environments. The purpose of this research was to demonstrate that Centering Resonance Analysis (CRA) provides a superior approach to performing text mining under storage constraints. CRA is a radically different approach to modeling text compared to traditional word frequency-based approaches. The project demonstrated that a CRA-based approach is superior to word frequency approaches: up to 15 times better in identifying relevant documents, and up to 5 times greater precision in topic tracking experiments. A CRA data structure requires one-third the space required for raw compressed text, and can execute on a typical desktop computer. Future R&D efforts will focus on commercializing a product with applications to government and commercial business processes.
Document Details
- Document Type: Technical Report
- Publication Date: Mar 24, 2004
- Accession Number: ADA422048
Entities
People
- Dan Ballard
- Kevin Dooley
- Steven R Corman