Inner-out subdomain dividing heterogeneous parallel algorithm for high order CFD solver
Author:
Affiliation:

(Institute for Quantum Information & State Key Laboratory of High Performance Computing, College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China)

CLC Number: TN95

Fund Project:

    Abstract:

    An offload-mode heterogeneous parallel algorithm based on inner-out subdomain dividing is proposed for the CFD (computational fluid dynamics) program CNS. Exploiting the characteristics of finite-difference computation and the fourth-order Runge-Kutta method on structured meshes, a ghost region is introduced, and on that basis a Ghost-Region-Shrinking computing scheme is designed. The scheme significantly reduces the overhead of data movement between heterogeneous computing resources and, under balanced load, lets CPU computation and MPI communication fully overlap with accelerator computation, which improves heterogeneous cooperative parallelism. The ghost-region parameter required for valid results is given, and load-balance tuning is demonstrated. On a server with two 12-core Intel Haswell Xeon E5-2670 CPUs and two Xeon Phi 7120A MIC coprocessors, the algorithm achieves an average 5.9× performance improvement over an algorithm that offloads whole task blocks to the accelerator. Compared with a two-level MPI/OpenMP parallel algorithm running on 24 Intel Haswell CPU cores, the proposed method achieves a speedup of 1.27× with one MIC and 1.45× with two MICs. Finally, the bottlenecks and drawbacks of the method are discussed.
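
    The general idea behind a shrinking ghost region can be illustrated with a minimal 1D sketch. It is not the CNS implementation: the averaging stencil, the function names `stencil_step` and `advance_block`, and the ghost width `g = r * stages` are assumptions made for illustration, and the abstract does not state the paper's actual ghost-region parameter. The sketch copies an inner block plus its ghost cells once, applies several stencil sweeps with no intermediate exchange, and lets the valid region shrink by `r` cells per sweep until only the inner block remains.

```python
def stencil_step(u, r):
    """Apply a (2r+1)-point averaging stencil (hypothetical stand-in for a
    finite-difference operator). The output loses r cells at each end."""
    n = len(u)
    return [sum(u[i - r:i + r + 1]) / (2 * r + 1) for i in range(r, n - r)]


def advance_block(u, lo, hi, r, stages):
    """Advance cells [lo, hi) by `stages` sweeps using one up-front copy.

    Assumption: each sweep depends only on cells within distance r, so a
    ghost region of width g = r * stages is enough for the inner cells to
    stay valid after all sweeps.
    """
    g = r * stages
    if lo - g < 0 or hi + g > len(u):
        raise ValueError("ghost region extends past the domain")
    block = u[lo - g:hi + g]          # inner block plus ghost cells
    for _ in range(stages):
        block = stencil_step(block, r)  # valid region shrinks by r per side
    return block                      # exactly hi - lo cells remain


if __name__ == "__main__":
    field = [float((i * i) % 7) for i in range(40)]
    inner = advance_block(field, lo=12, hi=28, r=2, stages=4)
    print(len(inner))
```

    Because the block is self-sufficient for all sweeps, a device holding it would need only one transfer in and one transfer out per time step in this simplified setting, which is the kind of data-movement saving the abstract refers to. The cost is redundant work on the ghost cells, which grows with `r * stages`.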

History
  • Received: October 10, 2019
  • Revised:
  • Adopted:
  • Online: April 29, 2020
  • Published: April 28, 2020