CLEAR: Cross-Layer Exploration for Architecting Resilience

Abstract

CLEAR is a first-of-its-kind framework that overcomes a major challenge in the design of digital systems that are resilient to reliability failures: achieving desired resilience targets at minimal cost (energy, power, execution time, area) by combining resilience techniques across various layers of the system stack (circuit, logic, architecture, software, algorithm). CLEAR automatically and systematically explores the large space of techniques and their combinations (586 cross-layer combinations in this paper), derives cost-effective solutions, and provides guidelines for designing new techniques. Carefully optimized combinations of circuit-level hardening, logic-level parity checking, and micro-architectural recovery provide highly cost-effective soft error resilience for general-purpose processor cores. A 50x improvement in silent data corruption rate is achieved at 2.1% energy cost for out-of-order cores (6.1% for in-order cores), with no speed impact. Selective circuit-level hardening alone, guided by thorough application benchmark analysis, also provides cost-effective solutions (~1% additional energy cost for the same 50x improvement).
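
The exploration the abstract describes (enumerate combinations of techniques, then keep the cheapest one that meets a resilience target) can be illustrated with a minimal sketch. This is not the CLEAR implementation. The technique names, improvement factors, and energy costs below are made-up placeholders, and the model (improvement factors multiply, energy costs add) is a simplifying assumption chosen only for illustration.

```python
from itertools import combinations

# Hypothetical techniques: (name, layer, SDC improvement factor, energy cost in %).
# All numbers are illustrative placeholders, NOT results from the report.
TECHNIQUES = [
    ("hardening", "circuit", 10.0, 1.5),
    ("parity", "logic", 5.0, 1.0),
    ("recovery", "architecture", 1.0, 0.5),
    ("abft", "algorithm", 2.0, 3.0),
]


def evaluate(combo):
    """Return (improvement, energy cost) for a combination.

    Assumption: improvement factors multiply and energy costs add.
    Real cross-layer interactions are generally not this simple.
    """
    improvement = 1.0
    cost = 0.0
    for _name, _layer, factor, energy in combo:
        improvement *= factor
        cost += energy
    return improvement, cost


def cheapest_combination(techniques, target):
    """Exhaustively search all non-empty combinations.

    Returns (list of technique names, energy cost) for the lowest-cost
    combination whose improvement meets the target, or None if none does.
    """
    best = None
    for size in range(1, len(techniques) + 1):
        for combo in combinations(techniques, size):
            improvement, cost = evaluate(combo)
            if improvement >= target and (best is None or cost < best[1]):
                best = ([t[0] for t in combo], cost)
    return best


if __name__ == "__main__":
    print(cheapest_combination(TECHNIQUES, target=50.0))
```

Exhaustive enumeration is workable here only because the placeholder space is tiny. The abstract does not state how the 586 combinations are searched, so the search strategy shown is likewise an assumption.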

Document Details

Document Type
Technical Report
Publication Date
Mar 01, 2017
Accession Number
AD1041979

Entities

People

  • Chen-yong Cher
  • Eric Cheng
  • Hyungmin Cho
  • Jacob A. Abraham
  • Kevin Skadron
  • Klas Lilja
  • Lukasz G. Szafaryn
  • Mircea R. Stan
  • Pradip Bose
  • Shahrzad Mirkhani
  • Subhasish Mitra

Organizations

  • Stanford University

Tags

Communities of Interest

  • Engineered Resilient Systems

DTIC Thesaurus Topics

  • Algorithms
  • Circuits
  • Compilers
  • Complex Systems
  • Computer Architecture
  • Computer Science
  • Computers
  • Computing System Architectures
  • Costs
  • Detection
  • Energy
  • Energy Efficiency
  • Fault Tolerance
  • Hardening
  • Networks
  • Reliability
  • Test And Evaluation

Fields of Study

  • Computer science

Readers

  • Integrated Circuit Design and Technology
  • Parallel and Distributed Computing
  • Software Engineering

Technology Areas

  • Space