Research Area b: Cacheless Computer Architecture with Massively Parallel Optically-Interconnected Memory for Scalable, Low-Latency, Energy-Efficient, and High-Productivity Computing

Abstract

With the end of CMOS scaling and performance improvement through Moore's law alone, there is a trend towards heterogeneous computing with domain-specific accelerators for different applications, along with big and little cores, multithreaded GPUs, and FPGAs, with high bandwidth memory (HBM) integrated in the same package through horizontal (2.5D) and vertical (3D) die-stacking. Though this approach is a step in the right direction, the memory subsystem based on multiple levels of caches is a bottleneck when it comes to energy efficiency, scalability, and, more importantly, programmer productivity. Moreover, with emerging workloads such as large-scale graph analytics and machine learning that have large working sets with irregular memory access patterns and poor data locality, the benefits of caches are becoming increasingly questionable. The objective of this proposal is to rethink and rearchitect the memory subsystem in future processors for high performance irregular workloads by taking advantage of 2.5D/3D electronic-photonic integration and wavelength division multiplexing (WDM) based silicon photonics. More specifically, the goal is to flatten the memory hierarchy as far as possible by eliminating the cache hierarchy to realize a memory system that exhibits low and predictable latency. This is achieved through a new DRAM microarchitecture with embedded WDM-based photonic interconnects with all-to-all interconnection capability that is codesigned carefully with the processor and the memory controller. The key facet of the new DRAM microarchitecture is a significantly higher number of active mini-banks with integrated WDM silicon photonics that provide much higher concurrency and, as a result, a significant reduction in queueing delays at the memory controller.
Detailed models for the DRAM and the memory controllers will be developed and integrated with RISC-V based processor and optical interconnection network models in the gem5 simulation environment to conduct a rigorous design space study of the architectural parameters and to evaluate the performance, scalability, and energy efficiency of the memory subsystem. Large-scale graph data analytics, machine learning, and scientific computing kernels will be used as benchmarks, and the results will be compared with state-of-the-art computing nodes from research and academia. An important deliverable of the project is a quantitative understanding of the latency and energy per bit requirements of future DRAMs so that the cache hierarchy can be eliminated for the benchmarks in consideration. This can have a broad impact on the design of future DRAMs and processor architectures. Historically, algorithms have evolved to take advantage of or overcome the limitations of a traditional memory hierarchy, so with a flatter memory hierarchy the proposed project can have an impact on how future algorithms are designed. Finally, a flatter memory hierarchy is easier to program, so it can have a huge impact on the productivity of application developers in the future. By leveraging collaboration with industry and commercial opto-electronic foundry services, we will evaluate and assess the technology transition feasibility of the proposed optically-interconnected memory system.
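
The abstract's claim that a higher number of active mini-banks reduces queueing delay at the memory controller can be illustrated with a back-of-the-envelope queueing sketch. The model below is an assumption made for illustration only and is not the proposal's gem5 model: each bank is treated as an independent M/M/1 queue, requests are spread uniformly across banks, and the request rate, bank service time, and function name are all hypothetical.

```python
import math


def mean_queue_delay_ns(total_rate_per_ns, num_banks, service_ns):
    """Mean time a request waits in queue, under an M/M/1-per-bank assumption.

    total_rate_per_ns: aggregate request arrival rate (requests per ns), hypothetical
    num_banks:         number of independently serviceable banks
    service_ns:        mean bank service (occupancy) time in ns, hypothetical
    """
    # Uniform spreading of requests is an assumption; irregular workloads
    # may produce bank hot spots that this sketch ignores.
    per_bank_rate = total_rate_per_ns / num_banks
    rho = per_bank_rate * service_ns  # per-bank utilization
    if rho >= 1.0:
        return math.inf  # the bank is saturated and the queue grows without bound
    # Standard M/M/1 mean waiting time in queue: Wq = rho * S / (1 - rho)
    return rho * service_ns / (1.0 - rho)


if __name__ == "__main__":
    # Illustrative numbers only: 0.5 requests/ns aggregate, 30 ns bank service time.
    for banks in (8, 16, 64, 256):
        delay = mean_queue_delay_ns(0.5, banks, 30.0)
        print(f"{banks:4d} banks: mean queueing delay = {delay:.2f} ns")
```

Under these assumed numbers, the per-bank utilization falls as the bank count rises, so the mean queueing delay drops sharply once the banks are no longer saturated. A detailed simulation would be needed to capture bank conflicts, scheduling policy, and interconnect contention.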

Document Details

Document Type
DoD Grant Award
Publication Date
Sep 04, 2019
Source ID
W911NF1910470

Entities

People

  • S. J. Ben Yoo

Organizations

  • Army Contracting Command
  • National Security Agency
  • University of California, Davis

Tags

Fields of Study

  • Computer science

Readers

  • Distributed Systems and Data Platform Development
  • Integrated Circuit Design and Technology
  • Parallel and Distributed Computing

Technology Areas

  • AI & ML
  • Microelectronics
  • Space