Editorial department of the Journal of National University of Defense Technology: High-Performance Computing and Artificial Intelligence

Optimizing Yinyang K-means algorithm on many-core CPUs
The K-means algorithm is computationally expensive for large-scale clustering problems. An efficient parallel implementation of the Yinyang K-means algorithm is proposed, based on the architectural characteristics of typical many-core CPUs. The implementation adopts a new in-memory data layout, uses the vector units of many-core CPUs to accelerate the distance calculations in Yinyang K-means, and optimizes memory access for NUMA (non-uniform memory access) architectures. Compared with the open-source multi-threaded version of the Yinyang K-means algorithm, it achieves speedups of up to approximately 5.6 and 8.7 on ARMv8 and x86 many-core CPUs, respectively. Experiments show that the optimizations successfully accelerate the Yinyang K-means algorithm on many-core CPUs.
2024/1/28
ZHOU Tianyang, WANG Qinglin, LI Rongchun, MEI Songzhu, YIN Shangfei, HAO Ruochen, LIU Jie

Parallel optimization of convolution algorithm on multi-core DSP
2024/1/28
XU Jinwei, WANG Qinglin, LI Yalin, JIANG Jingfei, GAO Lei, LI Rongchun, LI Dongsheng

Quantization and pruning optimization method for attention mechanism
2024/1/28
HE Yuanhong, JIANG Jingfei, XU Jinwei

Efficient RNN inference engine on very long vector processor
2024/1/28
SU Huayou, CHEN Kangkang, YANG Qianming

Optimizing operator computation of MiniGo on high-performance heterogeneous accelerator
2024/1/28
QIAO Peng, HE Zhouyu, LI Rongchun, JIANG Jingfei

High-throughput LDPC decoder on GPU for 5G new radio
2024/1/28
LI Rongchun, ZHOU Xin, QIAO Peng, WANG Qinglin