Cite this article: ZHOU Tianyang, WANG Qinglin, LI Rongchun, et al. Optimizing Yinyang K-means algorithm on many-core CPUs[J]. Journal of National University of Defense Technology, 2024, 46(1): 93-102.
Optimizing Yinyang K-means algorithm on many-core CPUs
ZHOU Tianyang1,2, WANG Qinglin1,2, LI Rongchun1,2, MEI Songzhu1,2, YIN Shangfei1,2, HAO Ruochen1,2, LIU Jie1,2
(1. College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China; 2. National Key Laboratory of Parallel and Distributed Computing, National University of Defense Technology, Changsha 410073, China)

DOI: 10.11887/j.cn.202401010
Received: 2022-09-06
Funding: National Natural Science Foundation of China (62002365)

Abstract:
The traditional Yinyang K-means algorithm is computationally expensive on large-scale clustering problems. An efficient parallel implementation of the Yinyang K-means algorithm was proposed, based on the architectural characteristics of typical many-core CPUs. The implementation was built on a new in-memory data layout, used the vector units of many-core CPUs to accelerate the distance calculations in Yinyang K-means, and applied memory access optimizations targeted at non-uniform memory access (NUMA) characteristics. Compared with the open-source multi-threaded implementation of the Yinyang K-means algorithm, it achieved speedups of up to approximately 5.6 and 8.7 on ARMv8 and x86 many-core platforms, respectively. The experiments show that these optimizations successfully accelerate the Yinyang K-means algorithm on many-core CPUs.
Keywords: K-means; NUMA; vectorization; many-core CPU; performance optimization
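
The abstract does not spell out the new memory data layout, so the following is only an illustrative sketch of why layout matters for vectorized distance calculation, not the authors' implementation. It assumes a dimension-major (transposed) centroid array, in which the value of centroid j in dimension d sits at index d * k + j, so the inner loop over centroids is unit-stride. The function names `to_dimension_major` and `squared_distances` are hypothetical.

```python
def to_dimension_major(centroids):
    """Flatten k centroids of length dim so element (d, j) is at index d * k + j.

    This layout is an assumption made for illustration; the paper's actual
    layout is not described in the abstract.
    """
    k, dim = len(centroids), len(centroids[0])
    flat = [centroids[j][d] for d in range(dim) for j in range(k)]
    return flat, k, dim


def squared_distances(point, flat, k, dim):
    """Squared Euclidean distance from one point to all k centroids."""
    out = [0.0] * k
    for d in range(dim):
        xd = point[d]
        base = d * k
        # Unit-stride pass over all centroids for dimension d. In a compiled
        # language, this is the loop that would map onto SIMD lanes.
        for j in range(k):
            diff = xd - flat[base + j]
            out[j] += diff * diff
    return out


centroids = [[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]]
flat, k, dim = to_dimension_major(centroids)
dists = squared_distances([0.0, 0.0], flat, k, dim)
nearest = min(range(k), key=dists.__getitem__)
```

Pure Python gains nothing from this layout by itself; the point is the access pattern. With centroids stored row by row, the same inner loop would read memory with a stride of dim elements, which is harder for vector units to load efficiently.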