Cache-Based Architectures for High Performance Computing
Abstract
Many researchers have noted that scientific codes perform poorly on computer architectures that rely on a memory hierarchy (cache). Furthermore, a number of researchers and some vendors have concluded that simply making the caches larger will not solve this problem. As an alternative, some vendors of HPC systems have opted to equip their systems with fast memory interfaces, but with a limited amount of on-chip cache and no off-chip cache. Some RISC-based HPC systems (e.g., the Cray T3E) support a prefetching or streaming facility that allows data to be streamed more efficiently between main memory and the processor. However, there are fundamental limitations on the benefits of these approaches, which make it difficult to see how they will, by themselves, eliminate the "Memory Wall." It has been shown that if one relies solely on this approach for the Cray T3E, one is unlikely to achieve much better than 4-6% of the machine's peak performance. Does this mean that as the speed of RISC/CISC processors increases, systems designed to process scientific data are doomed to hit the Memory Wall? The answer to that question depends on the ability of programmers to find innovative ways to take advantage of caches. This report discusses some of the techniques that can be used to overcome this hurdle, which in turn allows one to consider what types of hardware resources are required to support these techniques.
Document Details
- Document Type: Technical Report
- Publication Date: Feb 01, 2002
- Accession Number: ADA399720
Entities
People
- Daniel M. Pressel
Organizations
- United States Army Research Laboratory