Cite this article: WANG Yichao, WANG Liuzhen, LIN Xinhua. Hardware counter multiplexing estimation algorithm using deep learning[J]. Journal of National University of Defense Technology, 2022, 44(5): 114-123.
Hardware counter multiplexing estimation algorithm using deep learning
WANG Yichao, WANG Liuzhen, LIN Xinhua
(Network & Information Center, Shanghai Jiao Tong University, Shanghai 200240, China)

Abstract:
A deep learning method is proposed to obtain more accurate estimates for hardware counter multiplexing (MPX). A similarity analysis between the MPX estimates and the actually measured data shows that the hardware counter values obtained from multiple runs of the same program are linearly correlated. Two deep learning models, the multilayer perceptron (MLP) and the bidirectional gated recurrent unit (Bi-GRU), are used to fit the MPX data. Based on dynamic time warping (DTW), a new metric, DTW-cost, is proposed to evaluate the accuracy of MPX data. Experimental results show that, when 15 hardware events are collected simultaneously, the MLP method achieves an average accuracy across 13 high-performance computing applications that is 10.53% higher than that of the fixed interpolation method, currently the most widely used approach, with a maximum improvement of 19.8%. On the hardware events where MLP performs relatively poorly, the Bi-GRU method improves the average accuracy by 28.8%.
Keywords: hardware counters; hardware performance events; multiplexing; deep learning
DOI: 10.11887/j.cn.202205012
Submitted: 2021-12-31
Funding: National Natural Science Foundation of China (62072300)
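
The abstract builds its DTW-cost metric on dynamic time warping but does not define the metric on this page. As background only, the following is a minimal sketch of the classic DTW alignment cost between two counter time series. It is not the authors' DTW-cost definition: the function name `dtw_alignment_cost`, the absolute-difference local distance, and the sample series are all assumptions made for illustration.

```python
def dtw_alignment_cost(a, b):
    """Classic DTW cumulative alignment cost between two numeric sequences.

    Assumption: local distance is the absolute difference; the paper's
    DTW-cost metric may normalise or post-process this value differently.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: step in a, step in b, step in both
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical sample data: a measured series and a time-stretched estimate.
measured = [1, 2, 3]
estimated = [1, 2, 2, 3]
print(dtw_alignment_cost(measured, estimated))
```

Because DTW allows one sample to align with several samples in the other series, a purely time-stretched estimate incurs no cost here, which is why a warping-based measure suits comparing sampled counter traces whose timing differs between runs.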