Library

    Electronic Resource
    Springer
    The Journal of Supercomputing 11 (1997), pp. 137-158
    ISSN: 1573-0484
    Keywords: Distributed shared memory ; checkpointing ; fault-tolerance ; portability
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Abstract: Distributed shared memory (DSM) is a very promising programming model for exploiting the parallelism of distributed memory systems, because it provides a higher level of abstraction than simple message passing. Although the nodes of standard distributed systems exhibit high crash rates, very few DSM environments have any support for fault tolerance. In this article, we present a checkpointing mechanism for a DSM system that is efficient and portable. It offers some portability because it is built on top of MPI and uses only the services offered by MPI and a POSIX-compliant local file system. As far as we know, this is the first real implementation of such a scheme for DSM. Along with the description of the algorithm, we present experimental results obtained on a cluster of workstations. We hope that our research shows that efficient, transparent, and portable checkpointing is viable for DSM systems.
    Type of Medium: Electronic Resource