Risk-Aware Data Processing in Hybrid Clouds

Abstract

This paper explores query processing in a hybrid cloud model, in which a user's local computing capability is exploited alongside public cloud services to deliver an efficient and secure data management solution. Hybrid clouds offer numerous economic advantages, including the ability to better manage data privacy and confidentiality and to control the monetary expense of consuming cloud services by exploiting local resources. Nonetheless, query processing in hybrid clouds introduces numerous challenges, the foremost of which is how to partition data and computation between the public and private components of the cloud. The solution must account for the characteristics of the workload that will be executed, the monetary costs of acquiring and operating cloud services, and the risks associated with storing sensitive data on a public cloud. This paper proposes a principled framework for distributing data and processing in a hybrid cloud that balances the conflicting goals of performance, disclosure risk, and resource allocation cost. The proposed solution is implemented as an add-on tool for a Hadoop- and Hive-based cloud computing infrastructure.

Document Details

Document Type
Technical Report
Publication Date
Nov 01, 2011
Accession Number
ADA556625

Entities

People

  • Bhavani Thuraisingham
  • Bijit Hore
  • Kerim Y. Oktay
  • Murat Kantarcıoğlu
  • Sharad Mehrotra
  • Vaibhav Khadilkar

Organizations

  • University of Texas at Dallas

Tags

DTIC Thesaurus Topics

  • Algorithms
  • Cloud Computing
  • Cloud Storage
  • Computations
  • Construction
  • Cost Models
  • Data Analysis
  • Data Centers
  • Data Processing
  • Data Storage Systems
  • Databases
  • Distributed Data Processing
  • Information Processing
  • Information Science
  • Information Security
  • Infrastructure
  • Workload

Readers

  • Cybersecurity
  • Distributed Systems and Data Platform Development