Malleable Caches
Abstract
Managing the memory hierarchy is important for the performance of data-intensive computation. This effort explored several techniques for managing the cache in a microprocessor. This report examines column caching, cache partitioning, and cache compression, especially with regard to the Data Intensive System (DIS) benchmarks. The study found that compression can be added to caches to increase effective capacity, but that it creates problems of replacement strategy and fragmentation. These problems can be solved using partitioning. A dictionary-based compression scheme allows for reasonable compression and decompression latencies and compression ratios. A clock scheme can keep the data in the dictionary from becoming stale. The performance gains of a PCC over a standard cache of equivalent size can be attributed to two factors. First, a PCC can potentially store more data than a standard cache, which can reduce capacity misses. Second, a PCC has more associativity than a standard cache of equivalent size, which can reduce conflict misses. Various techniques can be used to reduce the latency of compression and decompression. Compression latency can be reduced by searching only part of the dictionary during compression, by using multiple banks or CAMs to examine multiple dictionary entries simultaneously, and by compressing a cache line starting at different points in parallel. Finally, there are many different compression schemes, some of which may perform better or be easier to implement in hardware.
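The dictionary-based scheme described in the abstract can be illustrated with a small software model. The following is only a sketch under stated assumptions, not the report's design: the names `ClockDictionary`, `compress_line`, `decompress_line`, and `compressed_bits` are hypothetical, words are assumed to be 32 bits, and the clock policy shown is ordinary second-chance replacement, which may differ from the clock scheme the report uses. The sketch also does not model what happens to previously stored lines when a dictionary entry they reference is later replaced.

```python
class ClockDictionary:
    """Fixed-size word dictionary with clock (second-chance) replacement.

    Hypothetical model for illustration; not taken from the report.
    """

    def __init__(self, size):
        self.words = [None] * size   # word stored in each slot
        self.used = [False] * size   # reference bit per slot
        self.index = {}              # word -> slot, for fast lookup
        self.hand = 0                # clock hand position

    def lookup(self, word):
        """Return the slot holding `word` (marking it used), or None."""
        slot = self.index.get(word)
        if slot is not None:
            self.used[slot] = True
        return slot

    def insert(self, word, pinned=()):
        """Place `word` in a victim slot, never evicting a pinned slot.

        Returns the slot, or None if every slot is pinned.
        """
        n = len(self.words)
        # Two sweeps suffice: the first may clear every reference bit,
        # the second then finds the first unpinned slot.
        for _ in range(2 * n):
            slot = self.hand
            self.hand = (self.hand + 1) % n
            if slot in pinned:
                continue
            if self.used[slot]:
                self.used[slot] = False  # give the entry a second chance
                continue
            old = self.words[slot]
            if old is not None:
                del self.index[old]
            self.words[slot] = word
            self.index[word] = slot
            self.used[slot] = True
            return slot
        return None


def compress_line(line, dictionary):
    """Encode a cache line (list of words) as literal or reference tokens."""
    tokens = []
    pinned = set()  # slots this line references; must not be evicted
    for word in line:
        slot = dictionary.lookup(word)
        if slot is None:
            dictionary.insert(word, pinned)
            tokens.append(("lit", word))
        else:
            tokens.append(("ref", slot))
            pinned.add(slot)
    return tokens


def decompress_line(tokens, dictionary):
    """Rebuild the line; valid only while referenced slots are unchanged."""
    return [v if kind == "lit" else dictionary.words[v] for kind, v in tokens]


def compressed_bits(tokens, dict_size, word_bits=32):
    """Size in bits: one flag bit per token plus a literal or a slot index."""
    index_bits = max(1, (dict_size - 1).bit_length())
    return sum(1 + (word_bits if kind == "lit" else index_bits)
               for kind, _ in tokens)
```

As a usage example, compressing the eight-word line `[0, 0, 0xDEADBEEF, 0, 7, 7, 0xDEADBEEF, 0]` with a four-entry dictionary yields three literals and five references, so `compressed_bits` reports fewer bits than the 256 bits of the uncompressed line. Pinning the slots a line references is a simplification chosen here so that a line always decompresses correctly immediately after it is compressed.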
Document Details
- Document Type: Technical Report
- Publication Date: Nov 01, 2002
- Accession Number: ADA408487
Entities
People
- Larry Rudolph
- Srini Devadas
Organizations
- Massachusetts Institute of Technology