Memory Optimization Method for Control Flow Computation Graph
CLC Number: TP18

Abstract:

Artificial intelligence (AI) chips face tight on-chip memory limits in deep learning workloads. Existing optimization methods focus on static computation graphs, leaving room to improve memory efficiency for dynamic graphs. To overcome this limitation, a memory optimization framework for control-flow computation graphs was developed. The framework realizes operator-level memory reuse within subgraphs and, by exploiting control-flow structure, further achieves recursive reuse across subgraphs. In addition, a ping-pong buffering strategy for weight data is introduced to mitigate the memory wall between on-chip and off-chip memory, allowing memory access and computation within a subgraph to overlap. Validation on the domestic LUNA AI chip demonstrates that the proposed framework improves on-chip memory utilization by 1% to 5.9% compared with existing methods. Moreover, the buffering strategy effectively alleviates the memory wall problem by reducing data transfer time between on-chip and off-chip memory, yielding execution efficiency improvements of up to 29%.
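The ping-pong buffering idea described above can be sketched in host-side pseudocode: while the chip computes on the weights in one buffer, the next layer's weights are prefetched into the other buffer, so transfer latency hides behind computation. The sketch below is a minimal illustration, not the paper's implementation; `load_weights` and `compute` are hypothetical stand-ins for an off-chip-to-on-chip DMA transfer and a subgraph execution, respectively.

```python
from concurrent.futures import ThreadPoolExecutor

def load_weights(layer_id):
    # Hypothetical stand-in for DMA: copy layer weights off-chip -> on-chip buffer.
    return f"weights[{layer_id}]"

def compute(layer_id, weights):
    # Hypothetical stand-in for executing one subgraph with the given weights.
    return f"out[{layer_id}]<-{weights}"

def run_pipeline(num_layers):
    """Ping-pong schedule: while layer i computes, prefetch layer i+1's weights
    into the alternate buffer, overlapping memory access with computation."""
    results = []
    with ThreadPoolExecutor(max_workers=1) as prefetcher:
        pending = prefetcher.submit(load_weights, 0)   # fill buffer A
        for i in range(num_layers):
            weights = pending.result()                 # wait for current buffer
            if i + 1 < num_layers:
                # start filling the other buffer while we compute
                pending = prefetcher.submit(load_weights, i + 1)
            results.append(compute(i, weights))        # overlaps with prefetch
    return results
```

On real hardware the two buffers would be fixed on-chip regions and the prefetch a DMA engine rather than a thread, but the alternation pattern is the same.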

History
  • Received: May 06, 2025
  • Revised: September 18, 2025
  • Accepted: September 19, 2025
  • Online: September 22, 2025
  • Published: