Memory optimization method for control flow computation graph
CSTR:

Author: 王向前, 申彧昊, 景琨, et al.

Affiliation: 1. School of Internet, Anhui University, Hefei 230039, China; 2. iFLYTEK Co., Ltd., Hefei 230026, China

CLC Number: TP18

Fund Project:

Abstract:

AI chips face tight on-chip memory constraints when running deep learning workloads. Existing optimization methods focus on static computation graphs, leaving room to improve memory efficiency for dynamic graphs with control flow. To address this limitation, a memory optimization framework for control-flow computation graphs was developed. The framework performs operator-level memory reuse within subgraphs and, by exploiting control-flow characteristics, further achieves recursive reuse across subgraphs. In addition, a ping-pong buffering strategy for weight data was introduced to mitigate the memory wall between on-chip and off-chip memory, allowing memory access and computation to overlap within subgraphs. Validation on the domestic LUNA AI chip demonstrates that the proposed framework improves on-chip memory utilization by 5.9% compared with existing methods. The buffering strategy also alleviates the memory wall problem by reducing data transfer time between on-chip and off-chip memory, yielding execution efficiency improvements of up to 29%.
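To make the reuse idea concrete, the following is a minimal, illustrative Python sketch of liveness-based operator-level buffer reuse inside a single subgraph. It is not the paper's algorithm: the tensor names, sizes, alignment, and greedy first-fit policy are hypothetical, and the sketch only shows how tensors whose live ranges do not overlap can share the same on-chip address range.

# Illustrative sketch only (not the paper's algorithm): liveness-based,
# operator-level buffer reuse inside one subgraph. Tensor names, sizes and
# the greedy first-fit policy below are hypothetical.
from collections import namedtuple

Tensor = namedtuple("Tensor", "name size first_use last_use")  # uses = operator indices

def plan_offsets(tensors, alignment=64):
    """Assign on-chip offsets so that tensors whose live ranges do not
    overlap in time may share the same address range (greedy first-fit)."""
    placed = []          # (offset, tensor) pairs already assigned
    offsets = {}
    for t in sorted(tensors, key=lambda x: -x.size):   # largest tensors first
        candidate = 0
        for off, p in sorted(placed):                  # scan by ascending offset
            live_overlap = not (t.last_use < p.first_use or p.last_use < t.first_use)
            space_overlap = candidate < off + p.size and off < candidate + t.size
            if live_overlap and space_overlap:
                # Conflict: bump the candidate just past this tensor, aligned.
                candidate = -(-(off + p.size) // alignment) * alignment
        offsets[t.name] = candidate
        placed.append((candidate, t))
    peak = max(offsets[t.name] + t.size for t in tensors)
    return offsets, peak

# Example subgraph: intermediate "a" dies before "c" is produced,
# so "c" can reuse a's address range.
tensors = [
    Tensor("a", 4096, first_use=0, last_use=1),
    Tensor("b", 4096, first_use=1, last_use=2),
    Tensor("c", 4096, first_use=2, last_use=3),
]
offsets, peak = plan_offsets(tensors)
print(offsets)   # {'a': 0, 'b': 4096, 'c': 0}
print(peak)      # 8192 bytes instead of 12288 without reuse

In this toy example the peak footprint drops from 12 KiB to 8 KiB because "c" reuses the slot freed by "a". The framework described in the abstract goes further by applying such reuse recursively across control-flow subgraphs and by overlapping weight transfers with computation through ping-pong buffers.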

Get Citation

王向前, 申彧昊, 景琨, et al. Memory optimization method for control flow computation graph[J]. Journal of National University of Defense Technology, 2025, 47(6): 71-80.

History
  • Received: May 06, 2025
  • Revised:
  • Accepted:
  • Online: December 02, 2025
  • Published: