Multi-Class Classification for Identifying JPEG Steganography Embedding Methods

Abstract

Over 725 steganography tools are available over the Internet, each providing a method for covert transmission of secret messages. This research presents four steganalysis advancements that result in an algorithm that identifies the steganalysis tool used to embed a secret message in a JPEG image file. The algorithm includes feature generation, feature preprocessing, multi-class classification and classifier fusion. The first contribution is a new feature generation method which is based on the decomposition of discrete cosine transform (DCT) coefficients used in the JPEG image encoder. The generated features are better suited to identifying discrepancies in each area of the decomposed DCT coefficients. Second, the classification accuracy is further improved with the development of a feature ranking technique in the preprocessing stage for the kernel Fisher s discriminant (KFD) and support vector machines (SVM) classifiers in the kernel space during the training process. Third, for the KFD and SVM two-class classifiers a classification tree is designed from the kernel space to provide a multi-class classification solution for both methods. Fourth, by analyzing a set of classifiers, signature detectors, and multi-class classification methods a classifier fusion system is developed to increase the detection accuracy of identifying the embedding method used in generating the steganography images. Based on classifying stego images created from research and commercial JPEG steganography techniques, F5, JP Hide, JSteg, Model-based, Model-based Version 1.2, OutGuess, Steganos, StegHide and UTSA embedding methods, the performance of the system shows a statistically significant increase in classification accuracy of 5%. In addition, this system provides a solution for identifying steganographic fingerprints as well as the ability to include future multi-class classification tools.

Open PDF

Document Details

Document Type
Technical Report
Publication Date
Sep 01, 2008
Accession Number
ADA484934

Entities

People

  • Benjamin M. Rodriguez Ii

Organizations

  • Air Force Institute of Technology

Tags

Communities of Interest

  • Autonomy
  • C4I
  • Cyber
  • Energy and Power Technologies
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Artificial Intelligence
  • Bayesian Networks
  • Computational Science
  • Computer Science
  • Cryptography
  • Data Mining
  • Data Science
  • Databases
  • Detection
  • Detectors
  • Factor Analysis
  • Information Processing
  • Information Science
  • Information Systems
  • Machine Learning
  • Network Science
  • Supervised Machine Learning

Fields of Study

  • Computer science
  • Engineering

Readers

  • Database Systems and Applications
  • Neural Network Machine Learning.

Technology Areas

  • AI & ML
  • AI & ML - Neural Networks
  • Space
  • Space - Space Objects