A Fault-Tolerant Network Kernel for Linda
Abstract
The parallel programming system Linda consists of a number of processes and a shared memory called the tuple space. In a distributed implementation of Linda, processes and the tuple space reside on different computing nodes connected by a communications network that is subject to a variety of node and network failures. This thesis develops a scheme to make the tuple space highly available in the presence of failures. High availability is achieved by replication: the tuple space is replicated on several nodes so that failures usually do not disrupt program execution. Our replication method has two parts: the operations protocol and the view change algorithm. The operations protocol is a read-one-write-all scheme; that is, values are read from one of the replicas and write operations are executed at all replicas. The protocol exploits the semantics of the tuple space operations to eliminate unnecessary delay in program execution. When failures occur, the replicas are reorganized and their states are updated. This process is called a view change and is accomplished by the view change algorithm. A view change guarantees that the newly formed view consists of a majority of the replicas and that all updates survive into the new view. Together, the operations protocol and the view change algorithm ensure that operations are executed in the correct order, updates to the tuple space survive failures, and processes see only the correct tuple space state in spite of failures. In addition, operations are performed by a concurrent background process whenever possible. Keywords: Fault tolerance, Distributed computer systems.
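The read-one-write-all idea and the majority requirement on a new view can be illustrated with a minimal in-memory sketch. This is not the thesis's implementation: the class names `Replica` and `ReplicatedTupleSpace`, the `alive` flag, and the use of `None` as a wildcard are assumptions made for illustration, and the sketch omits the state update and ordering guarantees that the actual view change algorithm provides. Only `out` and `rd` are taken from Linda's standard operation names.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One copy of the tuple space (hypothetical in-memory model)."""
    name: str
    alive: bool = True
    tuples: list = field(default_factory=list)


class ReplicatedTupleSpace:
    def __init__(self, replicas):
        self.all_replicas = list(replicas)
        self.view = list(replicas)  # current view: replicas believed to be up

    def out(self, tup):
        """Write-all: add the tuple at every replica in the current view."""
        for r in self.view:
            r.tuples.append(tup)

    def rd(self, pattern):
        """Read-one: answer from a single replica (here, the first in the view).
        None in the pattern acts as a wildcard (an assumption of this sketch)."""
        for tup in self.view[0].tuples:
            if len(tup) == len(pattern) and all(
                p is None or p == v for p, v in zip(pattern, tup)
            ):
                return tup
        return None

    def view_change(self):
        """Form a new view from the live replicas; require a strict majority."""
        live = [r for r in self.all_replicas if r.alive]
        if 2 * len(live) <= len(self.all_replicas):
            raise RuntimeError("no majority of replicas available")
        self.view = live


replicas = [Replica("a"), Replica("b"), Replica("c")]
space = ReplicatedTupleSpace(replicas)
space.out(("count", 1))        # written at a, b, and c
replicas[0].alive = False      # simulate a node failure
space.view_change()            # new view is {b, c}, a majority of 3
result = space.rd(("count", None))
```

Because every write reaches all replicas in the view, the tuple written before the failure is still readable from the surviving majority. If a second replica failed, `view_change` would raise rather than form a minority view.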
Document Details
- Document Type: Technical Report
- Publication Date: Aug 01, 1988
- Accession Number: ADA200986
Entities
People
- Andrew S. Xu
Organizations
- Massachusetts Institute of Technology