Entropy based file type identification and partitioning
Abstract
File identification and partitioning are core tasks in digital forensics, reverse engineering, and security analysis. In this research, we investigate the use of the Shannon entropy profile, computed from a file's byte representation, to characterize specific file types and to identify file segments based on changes in entropy level. The process consists of two stages. In the first stage, the binary representation of the file is partitioned into fixed-length chunks of bytes, and the entropy of each chunk is computed to produce the entropy profile. In the second stage, detrended fluctuation analysis (DFA) is applied to determine the level of structure in the entropy profile. The Haar continuous wavelet transform (CWT) is then used to partition the files identified as highly structured into regions separated by distinct changes in entropy level. Experimental results show that the proposed approach is effective in identifying file types and in partitioning files into segments of different entropy levels.
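The first stage described in the abstract can be illustrated with a minimal Python sketch. This is not the report's implementation: the function names `shannon_entropy` and `entropy_profile` and the 256-byte default chunk size are assumptions made for illustration, since the abstract does not state the chunk length used. Entropy is measured in bits per byte, so each value lies between 0 and 8.

```python
import math
from collections import Counter
from typing import List


def shannon_entropy(chunk: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not chunk:
        return 0.0
    n = len(chunk)
    counts = Counter(chunk)  # frequency of each byte value in the chunk
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def entropy_profile(data: bytes, chunk_size: int = 256) -> List[float]:
    """Split data into fixed-length chunks and return one entropy value per chunk.

    chunk_size is an assumed value; the last chunk may be shorter than the rest.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        shannon_entropy(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


if __name__ == "__main__":
    # Synthetic input: a constant region followed by a region that
    # contains every byte value exactly once.
    sample = bytes(256) + bytes(range(256))
    print(entropy_profile(sample, chunk_size=256))
```

On the synthetic input above, the constant region yields the minimum entropy and the region with uniformly distributed byte values yields the maximum, which is the kind of level change the second stage would look for. The DFA and Haar CWT steps are not sketched here because the abstract gives no parameters for them.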
Document Details
- Document Type: Technical Report
- Publication Date: Jun 01, 2017
- Accession Number: AD1046497
Entities
People
- Calvin B. Paul
Organizations
- Naval Postgraduate School