FAWN: A Fast Array of Wimpy Nodes

Abstract

This paper introduces the FAWN (Fast Array of Wimpy Nodes) cluster architecture for providing fast, scalable, and power-efficient key-value storage. A FAWN links together a large number of tiny nodes built using embedded processors and small amounts (2-16GB) of flash memory into an ensemble capable of handling 700 queries per second per node while consuming fewer than 6 watts of power per node. We have designed and implemented a clustered key-value storage system, FAWN-DHT, that runs atop these nodes. Nodes in FAWN-DHT use a specialized log-like back-end hash-based database to ensure that the system can absorb the large write workload imposed by frequent node arrivals and departures. FAWN uses a two-level cache hierarchy to ensure that imbalanced workloads cannot create hot-spots on one or a few wimpy nodes that impair the system's ability to service queries at its guaranteed rate. Our evaluation of a small-scale FAWN cluster and several candidate FAWN node systems suggests that FAWN can be a practical approach to building large-scale storage for seek-intensive workloads. Our further analysis indicates that a FAWN cluster is cost-competitive with other approaches (e.g., DRAM, multitudes of magnetic disks, solid-state disk) to providing high query rates, while consuming 3-10x less power.
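
The abstract describes the per-node back end only as a "log-like" hash-based database. As a rough illustration of that general idea, and not the report's actual design, the sketch below appends every write to the end of a log and keeps an in-memory hash index from each key to the offset of its latest record. The class name LogStore, the record layout, and the use of an in-memory buffer in place of flash are all assumptions made for this example.

```python
import io
import struct


class LogStore:
    """Toy append-only key-value log with an in-memory hash index.

    Hypothetical illustration only; not the FAWN-DHT implementation.
    """

    # Assumed record layout: 4-byte key length, 4-byte value length,
    # followed by the key bytes and then the value bytes.
    HEADER = struct.Struct("<II")

    def __init__(self, log=None):
        # An in-memory buffer stands in for the flash device.
        self.log = log if log is not None else io.BytesIO()
        self.index = {}  # key -> offset of the most recent record

    def put(self, key: bytes, value: bytes) -> None:
        # Writes are always sequential appends; older records for the
        # same key stay in the log but are no longer referenced.
        offset = self.log.seek(0, io.SEEK_END)
        self.log.write(self.HEADER.pack(len(key), len(value)) + key + value)
        self.index[key] = offset

    def get(self, key: bytes):
        # One index lookup, then one positioned read of the record.
        offset = self.index.get(key)
        if offset is None:
            return None
        self.log.seek(offset)
        klen, vlen = self.HEADER.unpack(self.log.read(self.HEADER.size))
        self.log.seek(klen, io.SEEK_CUR)  # skip over the stored key
        return self.log.read(vlen)
```

In this sketch a put never overwrites data in place, which is the property that makes log-style layouts a natural fit for write-heavy workloads; reclaiming the space held by superseded records is left out entirely.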

Document Details

Document Type: Technical Report
Publication Date: May 01, 2008
Accession Number: ADA490226

Entities

People

Amar Phanishayee
David G. Andersen
Jason Franklin
Lawrence Tan
Vijay Vasudevan

Organizations

Carnegie Mellon University
