Cite this article: | PEI Songwen, WU Xiaodong, TANG Zuoqi, et al. An approach to accessing unified memory address space of heterogeneous kilo-cores system[J]. Journal of National University of Defense Technology, 2015, 37(1): 28-33. |
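An approach to accessing unified memory address space of heterogeneous kilo-cores system |
PEI Songwen1,2,3, WU Xiaodong1, TANG Zuoqi4, XIONG Naixue5 |
(1. Department of Computer Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093; 2. State Key Laboratory of Computer Architecture, Chinese Academy of Sciences, Beijing 100190; 3. Department of Electrical Engineering and Computer Science, University of California, California 92697; 4. School of Computer Science and Technology, University of Guizhou, Guiyang, Guizhou 550025; 5. School of Computer Science, Colorado Technical University, Colorado 80907)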
|
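Abstract: |
To enable the heterogeneous cores of a multi-core processor to directly cross-access each other's memory address space, this paper proposes a unified memory address space access method for heterogeneous kilo-core processor systems. The method builds a unified three-level Cache structure with a data-block state-tagging scheme and optimizes the algorithm that modifies Cache block states. It thereby avoids the large extra memory-access overhead caused by copying and transferring data blocks in current discrete heterogeneous computer architectures. Tests on a subset of the Rodinia benchmark programs show a maximum system speedup of 9.8x and a reduction in memory-access frequency of up to 90%. The method therefore effectively reduces the system overhead of exchanging data blocks between heterogeneous cores and improves the system speedup of heterogeneous kilo-core processors. |
Keywords: heterogeneous kilo-core processor; memory address space; direct cross access; Cache |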
DOI:10.11887/j.cn.201501005 |
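Received: 2014-06-10 |
Funding: Open Project of the State Key Laboratory of Computer Architecture (CARCH201206); National-Level Project Cultivation Fund of the University of Shanghai for Science and Technology (12XGQ07); Guiyang Science and Technology Plan Project (2011101414); Guizhou Province Science and Technology Support Project (20123050) |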
|
An approach to accessing unified memory address space of heterogeneous kilo-cores system |
PEI Songwen1,2,3, WU Xiaodong1, TANG Zuoqi4, XIONG Naixue5 |
(1. Department of Computer Science & Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China; 2. State Key Laboratory of Computer Architecture, Chinese Academy of Sciences, Beijing 100190, China; 3. Department of Electrical Engineering and Computer Science, University of California, California 92697, United States; 4. School of Computer Science and Technology, University of Guizhou, Guiyang 550025, China; 5. School of Computer Science, Colorado Technical University, Colorado 80907, United States)
|
Abstract: |
To let the CPU and GPU directly access each other's independent memory spaces, an approach to accessing a unified memory address space in a heterogeneous kilo-core system is proposed. It is implemented by building a unified three-level Cache, tagging the state of blocks in the Cache, and optimizing the algorithm that modifies block states. The heterogeneous kilo-core system thereby avoids the significant memory-access overhead incurred in current discrete hybrid computer systems, in which GPUs are attached via PCI-E. Experiments on a subset of the Rodinia benchmarks show a maximum speedup of 9.8x and a reduction of up to 90% in load/store instructions. The approach thus effectively reduces the overhead of transferring data among computing units in a heterogeneous system and significantly improves overall system performance. |
Keywords: heterogeneous kilo-core processors; memory address space; direct access from opposite directions; Cache |
|
|
|
|
|