Towards a Cross-Domain MapReduce Framework

Abstract

The Apache™ Hadoop® framework provides parallel processing and distributed data storage capabilities that data analytics applications can use to process massive sets of raw data. These Big Data applications typically run as a set of MapReduce jobs to take advantage of Hadoop's ease of service deployment and large-scale parallelism. Yet Hadoop has not been adapted for multilevel secure (MLS) environments, where data of different security classifications coexist. To solve this problem, we have used the Security-Enhanced Linux (SELinux) kernel extension in a prototype cross-domain Hadoop on which multiple instances of Hadoop applications run at different sensitivity levels. Their access to Hadoop resources is constrained by the underlying MLS policy enforcement mechanism. A benefit of our prototype is its extension of the Hadoop Distributed File System to provide a cross-domain read-down capability for Hadoop applications without requiring complex Hadoop server components to be trustworthy.

Document Details

Document Type: Technical Report
Publication Date: Nov 01, 2013
Accession Number: ADA604704

Entities

People

  • Cynthia E. Irvine
  • Jean Khosalim
  • Mark A. Gondree
  • Thuy D. Nguyen

Organizations

  • Naval Postgraduate School

Tags

Communities of Interest

  • Sensors

DTIC Thesaurus Topics

  • Big Data
  • Classification
  • Computer Access Control
  • Computer Science
  • Computers
  • Computing System Architectures
  • Cross Domain
  • Cybersecurity
  • Data Analysis
  • Data Sets
  • Data Storage Systems
  • Department Of Defense
  • Environment
  • Operating Systems
  • Prototypes
  • Security
  • Sensitivity

Fields of Study

  • Computer science

Readers

  • Cybersecurity
  • Distributed Systems and Data Platform Development
  • Parallel and Distributed Computing