Cite this article: | SUN Caixia, SUI Bingcai, WANG Lei, et al. Counters based performance analysis and optimization of an out-of-order superscalar processor core[J]. Journal of National University of Defense Technology, 2016, 38(5): 14-19. |
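Performance analysis and optimization of an out-of-order superscalar processor core |
SUN Caixia, SUI Bingcai, WANG Lei, WANG Yongwen, HUANG Libo, LI Wenzhe, WANG Junhui |
(College of Computer, National University of Defense Technology, Changsha, Hunan 410073, China)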
|
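Abstract: |
As processor micro-architectures grow more complex, performance analysis plays an increasingly important role in processor development. The usual approach is to build a performance model, which is mainly used for design space exploration in the early stage of development; when applied to micro-architecture-level analysis and optimization, both its speed and its accuracy become limiting factors. A counter-based performance analysis method is therefore proposed. Taking the hardware implementation code of a processor core already completed by the project team as its basis, the method adds a dedicated performance monitoring unit outside the processor core to collect the events needed for micro-architecture analysis and optimization, and a result analyzer then processes the collected events to identify the factors limiting the performance of the micro-architecture implementation. Using this method, the performance-limiting factors of SPEC CPU2000 benchmarks were analyzed on a field-programmable gate array prototype system, and optimizations were applied according to the results; the performance of the optimized processor core improved noticeably. |
Keywords: performance analysis; counters; processor core; micro-architecture |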
DOI:10.11887/j.cn.201605003 |
Submitted: 2015-11-25 |
Funding: National Natural Science Foundation of China (61103011, 61170045, 61402501) |
|
Counters based performance analysis and optimization of an out-of-order superscalar processor core |
SUN Caixia, SUI Bingcai, WANG Lei, WANG Yongwen, HUANG Libo, LI Wenzhe, WANG Junhui |
(College of Computer, National University of Defense Technology, Changsha 410073, China)
|
Abstract: |
With the ever-increasing design complexity of processor micro-architectures, performance analysis is becoming more and more important in the research and design of processors. Performance models are widely used for performance analysis, but they are better suited to design space exploration in the early design stage. When they are used for micro-architecture optimization, their accuracy and speed become the limiting factors. Therefore, a performance analysis method based on counters was proposed. In this method, the register transfer level (RTL) code of a processor core was used as a baseline, and a specialized performance monitor unit was added to collect the events needed for micro-architecture analysis and optimization. The collected events were then sent to a result analyzer, which identified the factors affecting performance. By adopting the method, we analyzed what affects performance when running the SPEC CPU2000 benchmarks on a field-programmable gate array (FPGA) prototype system, and optimized the micro-architecture of the processor core according to the analysis results. The performance of the optimized processor core improved noticeably. |
Keywords: performance analysis; counters; processor core; micro-architecture |
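
The abstract describes a result analyzer that turns collected event counts into a view of what limits performance, but it does not give the analyzer's algorithm or the event list. The following is only a minimal sketch of that general idea, under stated assumptions: the event names (`cycles`, `instructions`, `l1d_miss`, `branch_mispredict`, `tlb_miss`) and the sample values are hypothetical and are not taken from the paper. It computes instructions per cycle (IPC) and ranks the remaining events by occurrences per 1000 instructions.

```python
from dataclasses import dataclass


@dataclass
class EventReport:
    name: str
    count: int
    per_kilo_instr: float  # occurrences per 1000 committed instructions


def analyze(counters: dict[str, int]) -> tuple[float, list[EventReport]]:
    """Return IPC and the events ranked by frequency per 1000 instructions.

    `counters` must hold "cycles" and "instructions"; every other key is
    treated as an event count. All key names here are hypothetical.
    """
    cycles = counters["cycles"]
    instructions = counters["instructions"]
    if cycles <= 0 or instructions <= 0:
        raise ValueError("cycles and instructions must be positive")

    ipc = instructions / cycles
    reports = [
        EventReport(name, count, 1000.0 * count / instructions)
        for name, count in counters.items()
        if name not in ("cycles", "instructions")
    ]
    # Most frequent events first.
    reports.sort(key=lambda r: r.per_kilo_instr, reverse=True)
    return ipc, reports


if __name__ == "__main__":
    # Made-up counter values for illustration only, not measured data.
    sample = {
        "cycles": 2_000_000,
        "instructions": 1_000_000,
        "l1d_miss": 25_000,
        "branch_mispredict": 8_000,
        "tlb_miss": 500,
    }
    ipc, ranked = analyze(sample)
    print(f"IPC = {ipc:.2f}")
    for r in ranked:
        print(f"{r.name}: {r.per_kilo_instr:.2f} per 1000 instructions")
```

Ranking by frequency is a simplification: a rare event with a long penalty can cost more cycles than a frequent cheap one, so a real analysis would also weight each event by its stall cost.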