External Memory Algorithms: Dealing With Massive Data
Abstract
In many applications that process massive amounts of data, the bottleneck is the I/O communication between internal memory and external memory. This bottleneck becomes more pronounced as processors get faster and as parallel processors are used. The goal of this proposal is to deepen our understanding of the limits of I/O systems and massive data storage systems and to construct algorithms that are provably efficient. The three measures of performance are the number of I/Os, disk storage space, and CPU time. Even when the data fit entirely in memory, communication can still be the bottleneck, and the related issues of caching become important.
Document Details
- Document Type: Technical Report
- Publication Date: Oct 21, 2005
- Accession Number: ADA440839
Entities
People
- Jeffrey S. Vitter
Organizations
- Purdue University