Diagnostic methods for communication waiting in MPI parallel programs and applications

Affiliation:

(Institute of Applied Physics and Computational Mathematics, Beijing 100094, China)

About the authors:

WU Linping (born 1977), male, from Jiaozuo, Henan, researcher, Ph.D., E-mail: wu_linping@mail.iapcm.ac.cn; JING Cuiping (corresponding author), female, assistant researcher, M.S., E-mail: xjtu_cs@163.com

CLC number:

TP316.4

Funding:

National Key Research and Development Program of China (2018YFB0204003); National Natural Science Foundation of China (61672003); Young Scientists Fund of the National Natural Science Foundation of China (11601034)

    Abstract:

    As parallel systems scale up, existing methods for diagnosing communication waiting incur large memory overhead and long measurement times. Based on an in-depth analysis of these methods, and taking into account the practical need for controllable measurement overhead, a diagnosis model for communication waiting based on hotspot functions is established. From this model, a more concise and practical diagnostic method is derived. The method is applied to diagnosing communication waiting in large-scale MPI parallel programs, including the two-dimensional LARED integration program, LARED-S, and LAP3D. The results show that the method accurately locates the key code segments that cause communication waiting, and that the optimization schemes and the estimated room for performance improvement it provides are a useful reference for subsequent program improvement. After optimization guided by the diagnostic results, the performance of the LARED-S program improved by 32% and its communication waiting time decreased by 44%.

Cite this article

WU Linping, JING Cuiping, LIU Xu, et al. Diagnostic methods for communication waiting in MPI parallel programs and applications[J]. Journal of National University of Defense Technology, 2020, 42(2): 47-54.

History
  • Received: 2019-09-20
  • Published online: 2020-04-29
  • Publication date: 2020-04-28