Cite this article: CHEN Haiyan, LIU Sheng, WU Jianguo. GSVM: a vector memory to support Gather/Scatter[J]. Journal of National University of Defense Technology, 2020, 42(3): 1-8.

GSVM: a vector memory to support Gather/Scatter
CHEN Haiyan, LIU Sheng, WU Jianguo
(College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China)

Abstract:
Wide single instruction multiple data (SIMD) digital signal processors generally handle vector memory accesses efficiently for regular applications, where addresses are contiguous or separated by a constant stride. For the irregular applications that are common in scientific and engineering computing, however, their memory bandwidth utilization is often low, which substantially reduces overall computing efficiency. To improve vector access performance for irregular applications, GSVM, a vector memory that supports Gather/Scatter access, was designed on the basis of the architecture of a SIMD digital signal processor. A vector address calculation unit matched to the SIMD width and a conflict buffer array of suitable depth allow the address calculation, arbitration, and buffering of Gather/Scatter instructions to be fully pipelined. Experimental results show that, compared with the earlier vector memory without Gather/Scatter support, GSVM achieves a speedup of 2 to 8 times on a sparse matrix-vector multiplication test suite at a hardware overhead of 22%.

Keywords: single instruction multiple data; Gather/Scatter; vector random access; access conflicts
DOI: 10.11887/j.cn.202003001
Received: 2019-04-30
Funding: National Natural Science Foundation of China, Young Scientists Fund (61602493)
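
For readers unfamiliar with the access pattern the abstract refers to, the following Python sketch illustrates Gather/Scatter, its use in sparse matrix-vector multiplication over a CSR matrix, and why irregular addresses can cause access conflicts. It is only an illustration under stated assumptions: the function names, the low-order bank interleaving, and the "one access per bank per pass" conflict model are hypothetical and are not taken from the GSVM design.

```python
from collections import Counter


def gather(memory, indices):
    """Gather: read memory[i] for every index i, in the given order."""
    return [memory[i] for i in indices]


def scatter(memory, indices, values):
    """Scatter: write values[k] to memory[indices[k]] for every k."""
    for i, v in zip(indices, values):
        memory[i] = v


def spmv_csr(row_ptr, col_idx, vals, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    The reads of x are data dependent (driven by col_idx), which is the
    irregular access pattern that a gather operation serves.
    """
    y = []
    for r in range(len(row_ptr) - 1):
        lo, hi = row_ptr[r], row_ptr[r + 1]
        xs = gather(x, col_idx[lo:hi])
        y.append(sum(a * b for a, b in zip(vals[lo:hi], xs)))
    return y


def bank_passes(addresses, num_banks):
    """Toy conflict model (an assumption, not the GSVM mechanism).

    Addresses map to banks by address % num_banks, and each bank serves
    one access per pass, so the busiest bank sets the number of passes.
    """
    if not addresses:
        return 0
    counts = Counter(a % num_banks for a in addresses)
    return max(counts.values())


# A = [[1, 0, 2],
#      [0, 3, 0],
#      [4, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
vals = [1, 2, 3, 4, 5]
x = [1, 2, 3]
y = spmv_csr(row_ptr, col_idx, vals, x)

regular = bank_passes([0, 1, 2, 3], num_banks=4)    # one address per bank
irregular = bank_passes([0, 4, 8, 1], num_banks=4)  # three hit bank 0
```

In this toy model the contiguous address vector completes in a single pass, while the irregular one needs several because multiple lanes target the same bank. That serialization is the kind of access conflict named in the keywords.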