Video to Text (V2T) in Wide Area Motion Imagery
Abstract
In this project, the Intelligent Fusion Technology, Inc. (IFT) team developed an approach that fuses hard (video) and soft (text, voice chat) information to automatically generate annotated videos that future human or machine users can readily use. The tracking results are also written in the standard .kw18 format to a separate file for interfacing with other modules in the E2AT system integration. In the implementation, each entity corresponds to one tracklet with a unique track ID. Each entity has two sets of attributes: common and uncommon. Common attributes do not change over the lifetime of a target track, such as the type and color of the target. Uncommon attributes change over time, such as target location, direction, and activity. The same attribute definitions are used for entities constructed from both hard and soft data. Associating, linking, fusing, and concatenating these entities can improve the visual tracking results through multi-intelligence information fusion.
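The entity model described in the abstract can be illustrated with a minimal Python sketch. It is not the IFT implementation: the class names `Entity` and `Observation`, the field names, and the `fuse` rule are all hypothetical, and the sketch assumes that association has already matched a soft-data entity to a hard-data entity with the same track ID.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Observation:
    """Uncommon (time-varying) attributes at one frame. Hypothetical layout."""
    frame: int
    location: Optional[Tuple[float, float]] = None
    direction: Optional[str] = None
    activity: Optional[str] = None


@dataclass
class Entity:
    """One tracklet with a unique track ID, built from hard or soft data."""
    track_id: int
    source: str  # "hard" (video) or "soft" (text, voice chat)
    # Common attributes: fixed over the lifetime of the track.
    target_type: Optional[str] = None
    color: Optional[str] = None
    # Uncommon attributes: one record per observed frame.
    observations: List[Observation] = field(default_factory=list)


def fuse(hard: Entity, soft: Entity) -> Entity:
    """Assumed fusion rule: prefer the hard entity's common attributes,
    fill gaps from the soft entity, and concatenate observations by frame."""
    merged = sorted(hard.observations + soft.observations, key=lambda o: o.frame)
    return Entity(
        track_id=hard.track_id,
        source="fused",
        target_type=hard.target_type or soft.target_type,
        color=hard.color or soft.color,
        observations=merged,
    )


# Made-up example: a video track lacking color, and a chat report that supplies it.
video_track = Entity(track_id=7, source="hard", target_type="vehicle",
                     observations=[Observation(frame=10, location=(120.0, 45.5))])
chat_report = Entity(track_id=7, source="soft", color="red",
                     observations=[Observation(frame=12, direction="north",
                                               activity="turning")])
fused = fuse(video_track, chat_report)
```

Because hard and soft entities share one attribute definition, the fusion step reduces to merging two objects of the same type, which is presumably the benefit of the shared definition noted in the abstract.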
Document Details
- Document Type: Technical Report
- Publication Date: Sep 01, 2015
- Accession Number: ADA625901
Entities
People
- Dan Shen
- Genshe Chen
- Haibin Ling