Chaining for Flexible and High-Performance Key-Value Systems

Abstract

Distributed key-value (KV) systems are a critical part of the infrastructure at many large sites such as Amazon, Facebook, Google, and Twitter. The first research question this dissertation addresses is: How should we design a cluster-based key-value store that is fault tolerant, achieves high performance and availability, and offers strong data consistency? We present a new replication protocol, Ouroboros, which extends chain-based replication to allow fast, non-blocking node additions to any part of the replica chain while guaranteeing provably strong data consistency. We use Ouroboros to implement a distributed key-value system, FAWN-KV, designed to support three key properties: fault tolerance, high performance, and generality. We present a formal proof of correctness of Ouroboros and evaluate FAWN-KV on clusters with Flash storage. FAWN-KV is still, however, only one specific KV solution: it offers strong data consistency and is optimized for clusters whose storage devices have slow random writes. The current diversity in hardware and application requirements has resulted in a plethora of KV systems, with no one system meeting the needs of all applications. The second and final research question this dissertation addresses is therefore: Is it possible for a KV architecture to be easily configured to support many points along the KV system design continuum? We present a generalization of chain-based replication, Ouroboros+, which extends Ouroboros to effectively support a wide range of application requirements by (a) selecting from different update protocols between replicas and (b) selecting a query node in a replica chain.
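
For readers unfamiliar with the underlying idea, the following is a minimal, in-process sketch of basic chain-based replication, with a configurable query node as a nod to option (b) above. It is an illustration only and is not the Ouroboros or Ouroboros+ protocol: the class names, the synchronous forwarding loop, and the `query_index` parameter are all assumptions made for this sketch, and it omits failure handling, node additions, and networking entirely.

```python
class Replica:
    """One node in the chain, holding its own copy of the data (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.store = {}


class Chain:
    """A toy replica chain: head is replicas[0], tail is replicas[-1]."""

    def __init__(self, names):
        self.replicas = [Replica(n) for n in names]

    def put(self, key, value):
        # A write enters at the head and is forwarded replica by replica.
        # Here the forwarding is simulated with a simple in-order loop.
        for replica in self.replicas:
            replica.store[key] = value
        # The write is acknowledged once the tail has applied it.
        return self.replicas[-1].name

    def get(self, key, query_index=-1):
        # By default, reads go to the tail, which only holds writes that
        # every replica has applied. query_index is an assumed knob for
        # choosing a different query node.
        return self.replicas[query_index].store.get(key)


chain = Chain(["head", "middle", "tail"])
acked_by = chain.put("user:1", "alice")
value_at_tail = chain.get("user:1")
value_at_head = chain.get("user:1", query_index=0)
```

In this synchronous toy every replica agrees as soon as `put` returns, so the choice of query node makes no visible difference; in a real distributed chain, reading from a node other than the tail could return data that has not yet reached all replicas, which is the kind of consistency trade-off the abstract alludes to.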

Document Details

Document Type
Technical Report
Publication Date
Sep 01, 2012
Accession Number
ADA569965

Entities

People

  • Amar Phanishayee

Organizations

  • Carnegie Mellon University

Tags

Communities of Interest

  • C4I
  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Big Data
  • Commerce
  • Computer Programming
  • Computer Science
  • Computers
  • Data Storage Systems
  • Databases
  • Energy Consumption
  • Failure Mode And Effect Analysis
  • Fault Tolerance
  • Information Science
  • Operating Systems
  • Relational Database Management Systems
  • Relational Databases
  • Social Media
  • Systems Engineering
  • Theses

Fields of Study

  • Computer science

Readers

  • Parallel and Distributed Computing