The Fault-tolerance Mechanism in Grid-based Distributed Simulation System

    Abstract:

    To meet the needs of distributed simulation systems, this paper builds a general-purpose grid-based fault-tolerance system. The system consists of three parts: a simulation resource monitoring module, a data saving module, and an error recovery module. The monitoring module is implemented on top of the grid's MDS. The data saving module, which saves both the process space and the iterative relationship between processes, and the error recovery module are both realized with a checkpoint mechanism in user space. We also analyze the relationship between these three modules and the existing functional modules of the simulation system. Finally, we design and implement a fault-tolerance broker in Client/Server mode to automate fault tolerance.
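
    The checkpoint-and-recover idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the paper saves the process space in user space, whereas this sketch swaps in a simpler application-level checkpoint that pickles an explicit state dictionary. The names save_checkpoint, load_checkpoint, run_simulation, and the fail_at parameter are all hypothetical and exist only for illustration.

```python
import os
import pickle
import tempfile


def save_checkpoint(state, path):
    """Write state to path atomically (temp file, then rename)."""
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
    # os.replace swaps the file in one step, so a crash mid-write
    # never leaves a half-written checkpoint at `path`.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the last saved state, or None if no checkpoint exists."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return pickle.load(f)


def run_simulation(path, total_steps, checkpoint_every=10, fail_at=None):
    """Toy simulation loop that resumes from the last checkpoint.

    fail_at is a hypothetical hook that injects a failure at a given
    step so that recovery can be exercised.
    """
    state = load_checkpoint(path) or {"step": 0, "value": 0}
    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated node failure")
        state["value"] += state["step"]  # stand-in for real simulation work
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(state, path)
    return state
```

    In this sketch a restarted run repeats only the steps taken since the last checkpoint. In the system the abstract describes, the restart would presumably be triggered by the fault-tolerance broker once the monitoring module reports a failed resource, rather than by the caller.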

Citation:

LIU Yunsheng, ZHANG Tong, ZHANG Chuanfu, ZHA Yabing. The Fault-tolerance Mechanism in Grid-based Distributed Simulation System[J]. Journal of National University of Defense Technology, 2005, 27(1): 35-38.

History
  • Received: September 06, 2004
  • Online: March 25, 2013