Diagnostic methods for communication waiting in MPI parallel programs and applications
Author:
Affiliation:

(Institute of Applied Physics and Computational Mathematics, Beijing 100094, China)

CLC Number:

TP316.4

Fund Project:

    Abstract:

    As parallel systems grow in scale, existing methods for diagnosing communication waiting suffer from high measurement cost and memory overhead. Based on an in-depth analysis of the existing diagnostic methods, and taking into account the practical need for controllable measurement, a diagnosis model for communication waiting based on hotspot functions was established, and a concise, practical diagnostic method built on this model was presented. The method was applied to diagnose communication waiting in large-scale MPI parallel programs, including LARED integration, LARED-S, and LAP3D. The results show that the method accurately identifies the key code segments that cause communication waiting, and that the proposed optimization solutions and the estimated room for performance improvement provide a useful reference for subsequent program improvement. After optimization according to the diagnostic results, LARED-S improves performance by 32% and reduces communication waiting time by 44%.

History
  • Received: September 20, 2019
  • Revised:
  • Adopted:
  • Online: April 29, 2020
  • Published: April 28, 2020