Cite this article: | LIU Zhong, CHEN Haiyan, XIANG Hongwei. Vectorization of accelerating fast Fourier transform computation based on fused multiply-add instruction[J]. Journal of National University of Defense Technology, 2015, 37(2): 72-78. |
|
Vectorization of accelerating fast Fourier transform computation based on fused multiply-add instruction |
LIU Zhong (刘仲), CHEN Haiyan (陈海燕), XIANG Hongwei (向宏卫) |
(College of Computer, National University of Defense Technology, Changsha 410073, China)
|
DOI:10.11887/j.cn.201502015 |
Received: 2014-06-12 |
Funding: National Natural Science Foundation of China (61133007, 61472432) |
|
Abstract: |
A vectorization method that uses fused multiply-add (FMA) instructions to accelerate fast Fourier transform (FFT) computation is presented. By restructuring the computation flow of the FFT butterfly unit, the separate multiplications and additions of the conventional formulation are combined into a smaller number of FMA operations. For the radix-2 decimation-in-time (DIT) FFT, the real floating-point operations per butterfly drop from 10 multiplications and additions to 6 FMA operations; for the radix-4 DIT FFT, they drop from 34 multiplications and additions to 24 FMA operations. Vector access to the twiddle factors is also optimized to reduce storage overhead. Experimental results show that the proposed method significantly accelerates FFT computation and achieves high computational performance and efficiency. |
Keywords: fast Fourier transform; fused multiply-add; vectorization; vector processor |
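The radix-2 operation counts quoted in the abstract can be illustrated with a small sketch. The abstract does not give the authors' exact derivation, so the code below assumes one well-known factorization: writing the twiddle factor as w = wr(1 + i·t) with t = wi/wr precomputed, the product w·b costs two multiply-adds and each of the four output components costs one more. The names `fma`, `butterfly_conventional`, and `butterfly_fma` are hypothetical, and `fma` is only a plain-Python stand-in that marks where a hardware FMA instruction would be issued.

```python
import cmath
import math


def fma(a, b, c):
    """Stand-in for a fused multiply-add: returns a * b + c.

    Assumption: plain Python rounds twice here, whereas a hardware FMA
    rounds once. The function only marks where an FMA would be issued.
    """
    return a * b + c


def butterfly_conventional(a, b, w):
    """Radix-2 DIT butterfly with 10 real operations (4 mul, 6 add/sub)."""
    tr = b.real * w.real - b.imag * w.imag      # 2 mul, 1 sub
    ti = b.real * w.imag + b.imag * w.real      # 2 mul, 1 add
    return (complex(a.real + tr, a.imag + ti),  # 2 add
            complex(a.real - tr, a.imag - ti))  # 2 sub


def butterfly_fma(a, b, wr, t):
    """Radix-2 DIT butterfly with 6 multiply-add operations.

    Assumes w = wr * (1 + 1j * t) with t = wi / wr precomputed, so wr
    must be nonzero. Negated operands stand for multiply-subtract forms.
    """
    ur = fma(-t, b.imag, b.real)   # ur = br - t * bi
    ui = fma(t, b.real, b.imag)    # ui = bi + t * br
    xr = fma(wr, ur, a.real)       # Re(a + w * b)
    xi = fma(wr, ui, a.imag)       # Im(a + w * b)
    yr = fma(-wr, ur, a.real)      # Re(a - w * b)
    yi = fma(-wr, ui, a.imag)      # Im(a - w * b)
    return complex(xr, xi), complex(yr, yi)


# Illustrative inputs: twiddle factor W_16^3 and two arbitrary samples.
w = cmath.exp(-2j * math.pi * 3 / 16)
a, b = complex(0.5, -1.25), complex(2.0, 0.75)

x_ref, y_ref = butterfly_conventional(a, b, w)
x_fma, y_fma = butterfly_fma(a, b, w.real, w.imag / w.real)

# Differences between the two formulations (rounding error only).
print(abs(x_fma - x_ref), abs(y_fma - y_ref))
```

In this sketch the table of twiddle factors would hold (wr, t) in place of (wr, wi), so the storage per factor is unchanged. Twiddle factors with wr = 0 (w = ±i) cannot use this form and would need a separate path, for example factoring out wi instead.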