Using Time to Improve the Performance of Coordinated Checkpointing

Abstract

This paper describes and evaluates a coordinated checkpoint protocol that uses time to eliminate several performance overheads that are present in traditional protocols. The time-based protocol does not have to exchange coordination messages, does not need to add information to the processes' messages, and only accesses stable storage when checkpoints are saved. This protocol uses a simple initialization procedure to set checkpoint timers at the different processes. After the initialization, each process saves its state independently from the other processes. By disallowing processes from sending messages during an interval before the checkpoint time, the protocol prevents in-transit messages from occurring. Two coordinated checkpoint protocols were implemented on a CM5, and their performance was compared using several applications. Results showed that the time-based protocol outperforms the two-phase protocol in all applications.
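
The timer-driven behavior described in the abstract can be illustrated with a minimal sketch in Python. It is not the report's implementation: the class name `TimedProcess` and the parameters `period` and `blackout` are hypothetical, and the sketch assumes that the no-send interval is chosen at least as long as the worst-case message delay plus any skew between the processes' timers. It shows only two ideas from the abstract: a process refuses to send during an interval before its checkpoint time, and it saves its state when its own timer expires, without exchanging coordination messages.

```python
from dataclasses import dataclass, field


@dataclass
class TimedProcess:
    """Hypothetical model of one process in a time-based checkpoint scheme."""
    period: float    # time between consecutive checkpoints
    blackout: float  # no-send interval before each checkpoint (assumed to
                     # cover worst-case message delay plus timer skew)
    state: int = 0
    next_checkpoint: float = 0.0
    checkpoints: list = field(default_factory=list)

    def __post_init__(self):
        # Initialization step: arm the first checkpoint timer.
        self.next_checkpoint = self.period

    def may_send(self, now: float) -> bool:
        # Sending is disallowed from (checkpoint time - blackout) until the
        # checkpoint is taken, so no message is in transit at that moment.
        return now < self.next_checkpoint - self.blackout

    def tick(self, now: float) -> bool:
        # When the local timer expires, save the state independently of the
        # other processes and re-arm the timer for the next period.
        if now >= self.next_checkpoint:
            self.checkpoints.append((self.next_checkpoint, self.state))
            self.next_checkpoint += self.period
            return True
        return False
```

With `TimedProcess(period=10.0, blackout=2.0)`, for example, `may_send` allows sends before time 8.0 and refuses them from 8.0 until `tick` records the checkpoint scheduled for time 10.0, after which sending is allowed again.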

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 1996
Accession Number
ADA310228

Entities

People

  • Nuno Neves
  • W. Kent Fuchs

Organizations

  • University of Illinois Urbana–Champaign

Tags

Communities of Interest

  • Space

DTIC Thesaurus Topics

  • Algorithms
  • Bandwidth
  • Clocks
  • Communication Channels
  • Computations
  • Computers
  • Consistency
  • Engineering
  • Frequency
  • Genetic Algorithms
  • Guarantees
  • Intervals
  • Iterations
  • Linear Programming
  • Optimization
  • Recovery
  • Time Intervals

Fields of Study

  • Computer science

Readers

  • Applied Combinatorial Optimization and Logic Circuit Design
  • Computer Networking
  • Parallel and Distributed Computing