Analysis of the Impact of Data Normalization on Cyber Event Correlation Query Performance

Abstract

A critical capability in cyberspace operations is maintaining situational awareness of the status of the infrastructure elements that constitute cyberspace. Event logs from cyber devices can yield significant information, and when properly utilized they can provide timely situational awareness about the state of the cyber infrastructure. In addition, proper Information Assurance requires the validation and verification of the integrity of results generated by a commercial log analysis tool. Event log analysis can be performed using relational databases. Previous literature recommends denormalizing databases to enhance query performance, yet database normalization can also increase query performance. In this research, normalization improved the performance of the majority of the queries run against very large data sets of router events. In addition, queries ran faster on normalized tables when all the necessary data were contained in those tables. Database normalization also improves table organization and maintains better data consistency than a non-normalized design. Nonetheless, normalizing a database involves tradeoffs, such as additional preprocessing time and extra storage requirements. Overall, normalization improved query performance and should be considered an option when analyzing event logs using relational databases. This thesis addresses three primary research questions: (1) What standards exist for the generation, transport, storage, and analysis of event log data for security analysis? (2) How does database normalization impact query performance when using very large data sets (over 30 million) of router events? (3) What are the tradeoffs between using a normalized versus non-normalized database in terms of preprocessing time, query performance, storage requirements, and database consistency?
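The contrast between a non-normalized and a normalized event table can be illustrated with a minimal sketch. It is not the schema or tooling used in the thesis: the table names (events_flat, hosts, events_norm), the column layout, the sample rows, and the use of SQLite through Python's sqlite3 module are all assumptions made for illustration. The sketch loads the same router events into both layouts and runs an equivalent per-router count against each.

```python
import sqlite3

# Hypothetical sample rows: (timestamp, router hostname, severity, message).
SAMPLE_EVENTS = [
    ("2012-03-01T00:00:01", "rtr-a", 3, "LINK-3-UPDOWN"),
    ("2012-03-01T00:00:02", "rtr-b", 5, "SYS-5-CONFIG_I"),
    ("2012-03-01T00:00:03", "rtr-a", 5, "SYS-5-CONFIG_I"),
    ("2012-03-01T00:00:04", "rtr-a", 3, "LINK-3-UPDOWN"),
]


def build_database(events):
    """Load events into a flat table and into an assumed normalized layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Non-normalized: the hostname string is repeated on every row.
        CREATE TABLE events_flat (
            event_id INTEGER PRIMARY KEY,
            ts TEXT, hostname TEXT, severity INTEGER, message TEXT
        );
        -- Normalized: each hostname is stored once and referenced by key.
        CREATE TABLE hosts (
            host_id INTEGER PRIMARY KEY,
            hostname TEXT UNIQUE
        );
        CREATE TABLE events_norm (
            event_id INTEGER PRIMARY KEY,
            ts TEXT,
            host_id INTEGER REFERENCES hosts(host_id),
            severity INTEGER, message TEXT
        );
    """)
    conn.executemany(
        "INSERT INTO events_flat (ts, hostname, severity, message) "
        "VALUES (?, ?, ?, ?)",
        events,
    )
    # Preprocessing step: this extra pass is part of the cost of normalizing.
    conn.execute(
        "INSERT INTO hosts (hostname) SELECT DISTINCT hostname FROM events_flat"
    )
    conn.execute("""
        INSERT INTO events_norm (ts, host_id, severity, message)
        SELECT f.ts, h.host_id, f.severity, f.message
        FROM events_flat AS f JOIN hosts AS h ON h.hostname = f.hostname
    """)
    conn.commit()
    return conn


def counts_flat(conn):
    """Events per router, read from the non-normalized table."""
    return conn.execute(
        "SELECT hostname, COUNT(*) FROM events_flat "
        "GROUP BY hostname ORDER BY hostname"
    ).fetchall()


def counts_normalized(conn):
    """The same question, answered from the normalized tables via a join."""
    return conn.execute("""
        SELECT h.hostname, COUNT(*)
        FROM events_norm AS e JOIN hosts AS h ON h.host_id = e.host_id
        GROUP BY h.hostname ORDER BY h.hostname
    """).fetchall()
```

Calling build_database(SAMPLE_EVENTS) and passing the connection to either function returns the same per-router counts. The sketch only demonstrates that the two layouts answer the same question; it says nothing about relative speed, which depends on the database engine, indexing, and data volume.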

Document Details

Document Type
Technical Report
Publication Date
Mar 01, 2012
Accession Number
ADA557999

Entities

People

  • Smile T. Ludovice

Organizations

  • Air Force Institute of Technology

Tags

Communities of Interest

  • Cyber
  • Human Systems

DTIC Thesaurus Topics

  • Air Force
  • Computer Network Security
  • Computers
  • Cybersecurity
  • Data Analysis
  • Data Mining
  • Databases
  • Information Processing
  • Information Science
  • Information Security
  • Information Systems
  • Knowledge Management
  • Network Protocols
  • Operating Systems
  • Security Personnel
  • Situational Awareness
  • Transport Protocols

Fields of Study

  • Computer science

Readers

  • Cybersecurity
  • Database Systems and Applications
  • Regression Analysis

Technology Areas

  • Cyber