Exploiting Replication

Abstract

The focus of this chapter is on the use of data replication and replicated execution to obtain faster response time or fault tolerance in distributed programs. These techniques can be critical in determining whether a network-based solution to an application problem will be feasible. For example, modular expansion and price-performance considerations argue for the use of distributed systems in factory automation settings. However, many factories contain devices controlled by dedicated processors that require real-time response. Any delay imposed on the controllers by the network must be bounded. This is hard to ensure because of possible packet loss and unpredictable load on remote servers. Consequently, such systems are forced to replicate or cache data needed by the controllers. This chapter explores a number of approaches to replication and distributed consistency issues. The treatment is applicable to a conventional local area network or a loosely coupled multiprocessor. The programs and computers in such systems fail benignly, by crashing without sending out incorrect messages. Processors do not have synchronized clocks, so the failure of an entire site can be detected only unreliably, using timeouts. Message communication is assumed to be reliable but bursty, because packets can be lost and may have to be retransmitted.

Document Details

Document Type: Technical Report
Publication Date: Jun 01, 1988
Accession Number: ADA196133

Entities

People

Ken Birman
Thomas A. Joseph

Organizations

Cornell University
