Sample Entropy and Random Forests: A Methodology for Anomaly-based Intrusion Detection and Classification of Low-bandwidth Malware Attacks

Abstract

Sample Entropy examines changes in the normal distribution of network traffic to identify anomalies. Normalized Information examines the overall probability distribution in a data set. Random Forests is a supervised learning algorithm that is efficient at classifying highly imbalanced data. Anomalies are exceedingly rare compared to the overall volume of network traffic. The combination of these methods enables low-bandwidth anomalies to be identified easily in high-bandwidth network traffic. Using only low-dimensional network information allows for near real-time identification of anomalies. The data set was drawn from the 1999 DARPA intrusion detection evaluation data set. The experiments compare a baseline f-score to the observed entropy and normalized information of the network. Anomalies that are disguised in network flow analysis were detected. Random Forests prove to be capable of classifying anomalies using the sample entropy and normalized information. Our experiment divided the data set into five-minute time slices and found that the sample entropy and normalized information metrics were successful in classifying bad traffic, with a recall of .99 and an f-score of .50, which was 185% better than our baseline.
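
The per-slice metrics described in the abstract can be illustrated with a minimal sketch. It assumes that "sample entropy" means the empirical Shannon entropy of one traffic feature (for example, destination port) within a time slice, and that "normalized information" can be approximated by that entropy divided by its maximum possible value; the report itself may define both differently. The function names, the record layout, and the choice of feature are hypothetical and are not taken from the report.

```python
import math
from collections import Counter, defaultdict

SLICE_SECONDS = 300  # five-minute time slices, as stated in the abstract


def sample_entropy(values):
    """Empirical Shannon entropy (in bits) of the observed values.

    Assumption: this is one common reading of "sample entropy" for
    traffic features; it is not the SampEn statistic for time series.
    """
    counts = Counter(values)
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def normalized_entropy(values):
    """Entropy divided by log2 of the number of distinct values (0..1).

    Assumption: a stand-in for the report's "normalized information".
    """
    distinct = len(set(values))
    if distinct <= 1:
        return 0.0
    return sample_entropy(values) / math.log2(distinct)


def slice_features(records):
    """Group (timestamp_seconds, feature_value) pairs into time slices
    and return {slice_index: (entropy, normalized_entropy)}."""
    slices = defaultdict(list)
    for timestamp, value in records:
        slices[int(timestamp // SLICE_SECONDS)].append(value)
    return {
        index: (sample_entropy(vals), normalized_entropy(vals))
        for index, vals in sorted(slices.items())
    }


if __name__ == "__main__":
    # Hypothetical records: (seconds since start, destination port)
    demo = [(0, 80), (10, 80), (299, 443), (300, 22)]
    for index, (h, h_norm) in slice_features(demo).items():
        print(index, round(h, 3), round(h_norm, 3))
```

Each slice's pair of values could then serve as a feature vector for a supervised classifier such as scikit-learn's RandomForestClassifier, with labels marking which slices contain attack traffic. That step is omitted here because the report's feature set and training configuration are not given in the abstract.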

Document Details

Document Type
Technical Report
Publication Date
Sep 01, 2006
Accession Number
ADA457209

Entities

People

  • Bret M. Hyla

Organizations

  • Naval Postgraduate School

Tags

Communities of Interest

  • Cyber
  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Anomaly Detection
  • Change Detection
  • Computational Science
  • Computer Communications
  • Computer Network Security
  • Computer Networks
  • Computer Science
  • Cybersecurity
  • Data Mining
  • Detection
  • Intrusion Detection
  • Intrusion Detectors
  • Machine Learning
  • Network Science
  • Operating Systems
  • Probability
  • Probability Distributions

Fields of Study

  • Computer science

Readers

  • Cybersecurity
  • Neural Network Machine Learning
  • Statistical inference

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • Cyber
  • Cyber - Cryptography