Checkpoint Space Reclamation for Independent Checkpointing in Message-Passing Systems
Abstract
The main disadvantages of independent checkpointing in message-passing systems are the possible domino effect and the storage space overhead of maintaining multiple checkpoints. Most previous research on checkpointing and recovery has assumed that only the checkpoints older than the global recovery line can be discarded. In this paper, we generalize the notion of a recovery line to that of a potential recovery line. Only the checkpoints that belong to at least one potential recovery line cannot be discarded. Using the model of maximum-sized antichains on a partially ordered set, we develop an efficient algorithm for finding all non-discardable checkpoints and derive an upper bound on the number of non-discardable checkpoints. Communication trace-driven simulation of several parallel programs shows the benefits of the proposed algorithm for real applications.
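The abstract names the model (maximum-sized antichains on a partially ordered set) but not the details. The following is a minimal brute-force sketch of that underlying notion only, not the report's efficient algorithm. The checkpoint names, the example order relation, and all function names are hypothetical, and the assumption that "belongs to some maximum-sized antichain" stands in for "belongs to some potential recovery line" is a simplification made for illustration; the report defines the actual relation.

```python
from itertools import combinations

def transitive_closure(edges):
    """Return all (a, b) pairs such that a is strictly before b."""
    before = set(edges)
    changed = True
    while changed:
        changed = False
        for a, b in list(before):
            for c, d in list(before):
                if b == c and (a, d) not in before:
                    before.add((a, d))
                    changed = True
    return before

def maximum_antichains(elements, edges):
    """Brute force: list every antichain of the largest possible size."""
    before = transitive_closure(edges)

    def comparable(x, y):
        return (x, y) in before or (y, x) in before

    for size in range(len(elements), 0, -1):
        found = [set(c) for c in combinations(elements, size)
                 if not any(comparable(x, y) for x, y in combinations(c, 2))]
        if found:
            return found
    return []

def members_of_some_maximum_antichain(elements, edges):
    """Union of all maximum-sized antichains."""
    kept = set()
    for antichain in maximum_antichains(elements, edges):
        kept |= antichain
    return kept

# Hypothetical example: two processes with three checkpoints each.
# Each process's checkpoints form a chain; the two cross edges stand in
# for orderings induced by messages (an assumption for illustration).
ELEMENTS = ["a0", "a1", "a2", "b0", "b1", "b2"]
EDGES = [("a0", "a1"), ("a1", "a2"),
         ("b0", "b1"), ("b1", "b2"),
         ("a1", "b1"), ("b1", "a2")]

kept = members_of_some_maximum_antichain(ELEMENTS, EDGES)
print(sorted(kept))
```

In this toy order, checkpoint b1 is comparable to every checkpoint of the other process, so it appears in no maximum-sized antichain and is the only element left out of the result. The enumeration is exponential in the number of checkpoints and is meant only to make the definition concrete.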
Document Details
- Document Type: Technical Report
- Publication Date: Jan 01, 1992
- Accession Number: ADA251923
Entities
People
- In-jen Lin
- Pi-yu Chung
- W. Kent Fuchs
- Yi-min Wang
Organizations
- University of Illinois Urbana–Champaign