Cite this article: ZHOU Longfang, YANG Wenxiang, HAN Yongguo, et al. Predicting the job running time with job name hierarchical clustering algorithm[J]. Journal of National University of Defense Technology, 2022, 44(5): 13-23.
Predicting the job running time with job name hierarchical clustering algorithm
ZHOU Longfang1,2, YANG Wenxiang1,3, HAN Yongguo2, ZHANG Xiaorong2, YU Jie1, FENG Jinghua4, ZHANG Jian4, LI Yuqi4, XIAN Gang1,2, WU Yadong5, WANG Guijuan2
(1. Computational Aerodynamics Institute, China Aerodynamics Research and Development Center, Mianyang 621000, China; 2. School of Computer Science and Technology, Southwest University of Science and Technology, Mianyang 621010, China; 3. College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China; 4. National Supercomputer Center in Tianjin, Tianjin 300457, China; 5. School of Computer Science and Engineering, Sichuan University of Science & Engineering, Zigong 643000, China)

Abstract:
Predicting job running time helps improve the scheduling performance of a system, and clustering helps train better prediction models. Traditional clustering algorithms have difficulty grouping similar job names. To cluster similar jobs more effectively, a letter-structure-number hierarchical clustering algorithm for job names was constructed by analyzing the semantic importance of the components of a job name. Using real data from two supercomputers, prediction models were trained on data clustered by this algorithm. The experimental results show that the prediction accuracy of these models improves somewhat over that of traditional methods, with an overall prediction accuracy of 70%~80%.
Keywords: runtime prediction; job name clustering; machine learning; high performance computing
DOI:10.11887/j.cn.202205002
Received: 2021-12-28
Funding: National Natural Science Foundation of China (61872304, 61802320); Sichuan Province Key Research and Development Program (2022YFG0040)
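
The abstract names the three levels of the hierarchy (letter, structure, number) but does not describe how each level is computed. The following is only a minimal sketch of the general idea, not the authors' algorithm: it assumes a job name can be tokenized into letter runs, digit runs, and separator characters, and it groups names by exact match at each level where the paper may well use a similarity measure instead. The function names `split_job_name` and `cluster_job_names`, the `A`/`N` structure encoding, and the sample job names are all hypothetical.

```python
import re
from collections import defaultdict

# Tokenizer: a run of ASCII letters, a run of ASCII digits, or any other single character.
TOKEN = re.compile(r"(?P<alpha>[A-Za-z]+)|(?P<num>[0-9]+)|(?P<sep>.)", re.DOTALL)


def split_job_name(name):
    """Return hypothetical (letters, structure, numbers) keys for one job name."""
    letters, numbers, structure = [], [], []
    for match in TOKEN.finditer(name):
        kind, text = match.lastgroup, match.group()
        if kind == "alpha":
            letters.append(text.lower())   # level 1: the words in the name
            structure.append("A")
        elif kind == "num":
            numbers.append(int(text))      # level 3: the numeric fields
            structure.append("N")
        else:
            structure.append(text)         # separators are kept verbatim
    # level 2: the layout of the name, e.g. "A_A_N"
    return tuple(letters), "".join(structure), tuple(numbers)


def cluster_job_names(names):
    """Group names by letters, then by structure, then by numbers (exact match)."""
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for name in names:
        letters, structure, numbers = split_job_name(name)
        tree[letters][structure][numbers].append(name)
    return tree


if __name__ == "__main__":
    # Invented job names, for illustration only.
    jobs = ["cfd_run_001", "cfd_run_002", "CFD-run-17", "vasp.relax.3"]
    for letters, by_structure in cluster_job_names(jobs).items():
        for structure, by_numbers in by_structure.items():
            members = [n for group in by_numbers.values() for n in group]
            print(letters, structure, members)
```

Ordering the levels as letters, then structure, then numbers reflects the abstract's statement that the components differ in semantic importance; under this sketch, jobs that share the same words land in one top-level cluster even when their separators or counters differ.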
|
|
|
|
|