FAWN: A Fast Array of Wimpy Nodes
Abstract
This paper introduces the FAWN (Fast Array of Wimpy Nodes) cluster architecture for providing fast, scalable, and power-efficient key-value storage. A FAWN links together a large number of tiny nodes built using embedded processors and small amounts (2-16GB) of flash memory into an ensemble capable of handling 700 queries per second per node while consuming fewer than 6 watts of power per node. We have designed and implemented a clustered key-value storage system, FAWN-DHT, that runs atop these nodes. Nodes in FAWN-DHT use a specialized log-like back-end hash-based database to ensure that the system can absorb the large write workload imposed by frequent node arrivals and departures. FAWN uses a two-level cache hierarchy to ensure that imbalanced workloads cannot create hot-spots on one or a few wimpy nodes that impair the system's ability to service queries at its guaranteed rate. Our evaluation of a small-scale FAWN cluster and several candidate FAWN node systems suggests that FAWN can be a practical approach to building large-scale storage for seek-intensive workloads. Our further analysis indicates that a FAWN cluster is cost-competitive with other approaches (e.g., DRAM, multitudes of magnetic disks, solid-state disk) to providing high query rates, while consuming 3-10x less power.
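The per-node figures quoted in the abstract imply a rough energy-efficiency bound. The sketch below is only an illustration of that arithmetic: the function name `queries_per_joule` is hypothetical and does not come from the report, and the inputs are the abstract's headline numbers (700 queries per second, under 6 watts), not measured data.

```python
def queries_per_joule(queries_per_second: float, watts: float) -> float:
    """Return queries served per joule of energy (1 W = 1 J/s)."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return queries_per_second / watts


# Abstract's figures: 700 queries/s per node at fewer than 6 W per node.
# Because 6 W is an upper bound on power, the result is a lower bound.
per_node = queries_per_joule(700, 6)
print(f"more than {per_node:.1f} queries per joule per node")
```

Since both figures are per node, the ratio would not change with cluster size under this simple model, which ignores switches, front-end nodes, and other shared overhead.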
Document Details
- Document Type: Technical Report
- Publication Date: May 01, 2008
- Accession Number: ADA490226
Entities
People
- Amar Phanishayee
- David G. Andersen
- Jason Franklin
- Lawrence Tan
- Vijay Vasudevan
Organizations
- Carnegie Mellon University