Convolutional Neural Network Mixed-Precision Quantization Considering Layer Sensitivity
DOI:
Author:
Affiliation:

College of Electronic Science and Technology, National University of Defense Technology

Author biography:

Corresponding author:

CLC number:

TP35

Fund project:

Supported by the National Natural Science Foundation of China (62074166, 62304254, 62104256, 62404253, U23A20322)


Abstract:

To address the problem of faithfully mapping neural networks onto resource-constrained embedded devices, a mixed-precision quantization method for convolutional neural networks based on layer sensitivity analysis is proposed. The sensitivity of each convolutional layer's parameters is measured by the average trace of its Hessian matrix, which provides the basis for bit-width allocation; a layer-wise ascending-descending procedure then assigns the bit-widths, completing the mixed-precision quantization of the network model. Experimental results show that, at an average bit-width of 3 bits, the proposed method improves recognition accuracy by 10.2% and 1.7% over the fixed-precision quantization methods DoReFa and LSQ+, respectively, and by more than 1% over other mixed-precision quantization methods. In addition, noise-injected training effectively improves the robustness of the mixed-precision quantization method, raising recognition accuracy by 16% at a noise standard deviation of 0.5.
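The sensitivity measure named in the abstract, the average Hessian trace of each convolutional layer, is typically estimated without forming the Hessian explicitly. The following is a minimal PyTorch sketch, not the authors' released code: it uses Hutchinson's estimator, where for Rademacher vectors v the expectation of vᵀHv equals tr(H). The names `model`, `x`, `y` and `loss_fn` are illustrative assumptions.

    import torch
    import torch.nn as nn

    def avg_hessian_trace(loss, weight, n_samples=32):
        """Hutchinson estimator: for Rademacher v, E[v^T H v] = tr(H).
        Returns tr(H) divided by the parameter count (the average trace)."""
        (grad,) = torch.autograd.grad(loss, weight, create_graph=True)
        trace = 0.0
        for _ in range(n_samples):
            v = torch.randint_like(weight, high=2) * 2 - 1   # entries are +1 or -1
            # Hessian-vector product via a second backward pass
            (hv,) = torch.autograd.grad(grad, weight, grad_outputs=v, retain_graph=True)
            trace += (hv * v).sum().item()                   # one sample of v^T H v
        return trace / (n_samples * weight.numel())

    def layer_sensitivities(model, x, y, loss_fn):
        """Average Hessian trace of each conv layer's weights, computed one
        layer at a time so only one second-order graph is alive at once."""
        sens = {}
        for name, m in model.named_modules():
            if isinstance(m, nn.Conv2d):
                loss = loss_fn(model(x), y)
                sens[name] = avg_hessian_trace(loss, m.weight)
        return sens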

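The layer-wise ascending-descending bit-width allocation can then be sketched as a paired raise/lower pass over layers ranked by sensitivity. This is one plausible reading of the allocation step described in the abstract, not the authors' exact algorithm; the 3-bit average budget and the 2-to-8-bit range are illustrative assumptions.

    def allocate_bits(sens, avg_budget=3, min_bits=2, max_bits=8):
        """Layer-wise up/down adjustment (illustrative): every layer starts at
        the budget; then, pairing the most and least sensitive remaining layers,
        raise one and lower the other so the average bit-width stays fixed."""
        bits = {name: avg_budget for name in sens}
        order = sorted(sens, key=sens.get)        # ascending sensitivity
        lo, hi = 0, len(order) - 1
        while lo < hi:
            down, up = order[lo], order[hi]
            if bits[down] > min_bits and bits[up] < max_bits:
                bits[down] -= 1                   # least sensitive layer: fewer bits
                bits[up] += 1                     # most sensitive layer: more bits
            lo += 1
            hi -= 1
        return bits

With the sensitivities from the previous sketch, `allocate_bits(layer_sensitivities(model, x, y, loss_fn))` yields a per-layer bit-width map whose mean stays at the 3-bit budget while the layers with the most curved loss surface receive the most bits.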
History
  • Received: 2025-01-10
  • Revised: 2025-04-03
  • Accepted: 2025-04-07
  • Online:
  • Published: