Statistical Analysis of the Skaion Network Security Dataset

Abstract

This thesis considers how best to use network traffic data to improve cyber security, an operational problem of great concern to network administrators and users alike. The work was performed for the Network Security Division of the Army Research Laboratory, which asked us to analyze a dataset of cyber-attacks masked in a background of representative network traffic (the "Skaion" dataset). We find that substantial preprocessing must be done before statistical methods can be applied to raw network data and that no single predictor is sufficient. Our approach is novel in that we consider not only single predictor events but also combinations of reports from multiple tools. Of the statistical techniques we consider, the most satisfactory model is based on logistic regression. Finally, we conclude that while the Skaion dataset is valuable for its new approach to network traffic emulation, the 1999 KDD Cup and DARPA-MIT datasets, despite their many shortcomings, are more clearly organized and more accessible to academic study. Cyber security is a globally important problem, and datasets like Skaion's must maximize opportunities for cross-disciplinary academic endeavors.

Document Details

Document Type
Technical Report
Publication Date
Sep 01, 2012
Accession Number
ADA570792

Entities

People

  • William F. Major Jr.

Organizations

  • Naval Postgraduate School

Tags

Communities of Interest

  • Cyber
  • Energy and Power Technologies
  • Engineered Resilient Systems
  • Human Systems

DTIC Thesaurus Topics

  • Anomaly Detection
  • Application Protocols
  • Computer Network Security
  • Computer Networks
  • Computer Programming
  • Cyberattacks
  • Cyberspace Operations
  • Data Mining
  • Detectors
  • Information Science
  • Intrusion Detectors
  • Machine Learning
  • Military Research
  • Network Protocols
  • Network Science
  • Operating Systems
  • Statistical Analysis

Fields of Study

  • Computer science

Readers

  • Cybersecurity
  • Neural Network Machine Learning
  • Regression Analysis

Technology Areas

  • Cyber