(1. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; 2. University of Chinese Academy of Sciences, Beijing 100049, China; 3. Peng Cheng Laboratory, Shenzhen 518000, China)
CLC Number: TN95
Abstract:
This paper proposes a novel asynchronous memory access path that supports highly concurrent, out-of-order off-chip memory requests. To meet the requirements of graph applications, the path provides a software-defined interface, built through hardware-software co-design, that handles hundreds of kinds of off-chip memory requests of arbitrary granularity. A custom memory-semantic interconnect is designed for fine-grained remote memory access among computing nodes, targeting future distributed graph processing scenarios. Finally, the proposed memory access path is integrated into a RISC-V-based system-on-chip (SoC) architecture and implemented as an FPGA prototype. In a preliminary evaluation with custom random access microbenchmarks, the asynchronous memory access path improves the performance of array-based and random address-based off-chip memory access by 3.5x and 2.7x, respectively, and accessing 4 bytes of data from remote memory takes only 1.63 μs.
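The idea of issuing many memory requests without blocking and accepting their completions out of order can be illustrated with a small software model. The sketch below is only a toy simulation written for this summary; it is not the authors' hardware interface. The names `AsyncMemoryPath`, `issue`, `poll`, and `gather` are hypothetical, and the random completion order is an assumption standing in for variable off-chip latency.

```python
import random
from dataclasses import dataclass


@dataclass
class Request:
    tag: int   # identifier used to match a completion to its request
    addr: int  # byte address in the modeled memory
    size: int  # arbitrary request granularity, in bytes


class AsyncMemoryPath:
    """Toy model (hypothetical): requests are issued without blocking
    and may complete in any order."""

    def __init__(self, memory: bytes, seed: int = 0):
        self.memory = memory
        self.pending = {}            # tag -> Request still in flight
        self.next_tag = 0
        self.rng = random.Random(seed)

    def issue(self, addr: int, size: int) -> int:
        """Record a read request and return its tag immediately."""
        tag = self.next_tag
        self.next_tag += 1
        self.pending[tag] = Request(tag, addr, size)
        return tag

    def poll(self):
        """Complete one in-flight request chosen at random.
        Returns (tag, data), or None when nothing is pending."""
        if not self.pending:
            return None
        tag = self.rng.choice(list(self.pending))
        req = self.pending.pop(tag)
        return tag, self.memory[req.addr:req.addr + req.size]


def gather(path: AsyncMemoryPath, addrs, size: int):
    """Issue all reads first, then place each completion by its tag."""
    slot_of_tag = {path.issue(a, size): i for i, a in enumerate(addrs)}
    out = [None] * len(addrs)
    while (done := path.poll()) is not None:
        tag, data = done
        out[slot_of_tag[tag]] = data
    return out
```

Because every completion carries its tag, `gather` returns the data in request order no matter which request finishes first. This is the property that lets many fine-grained requests stay in flight at once in a model of this kind.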