Cite this article: HE Hua, XIE Mingkun, HUANG Shengjun. Active learning method based on instability sampling[J]. Journal of National University of Defense Technology, 2022, 44(3): 50-56.
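Active learning method based on instability sampling
HE Hua, XIE Mingkun, HUANG Shengjun
(College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu 211106, China)

Abstract:
Traditional active learning methods often select examples based only on the current target model, ignoring the information that historical models carry about how stable the predictions on unlabeled examples are. An active learning method based on instability sampling is therefore proposed, which estimates the potential utility of each unlabeled example for improving model performance from the prediction differences among historical models. The method measures the instability of an unlabeled example by the differences between the posterior probabilities that the historical models predict for it, and selects the most unstable examples for querying. Extensive experimental results on multiple datasets validate the effectiveness of the method.
Keywords: active learning; labeling cost; instability; posterior probability; entropy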
DOI: 10.11887/j.cn.202203007
Received: 2021-06-04
Funding: New Generation Artificial Intelligence Major Project (2020AAA0107000); Natural Science Foundation of Jiangsu Province (BK20211517)

Active learning method based on instability sampling
HE Hua, XIE Mingkun, HUANG Shengjun
(College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China)

Abstract:
Traditional active learning methods select examples by considering only the predictions of the current model. These methods neglect the information in previously trained models, which reflects how stable the sequence of predictions for each unlabeled example is during the active learning stage. Thus, a novel active learning method with instability sampling was proposed, which estimates the potential utility of each unlabeled example for improving model performance from the differences among the predictions of the previous models. The proposed method measured the instability of each unlabeled example by the differences between the posterior probabilities predicted by the previous models, and the example with the largest instability was selected to be queried. Extensive experiments were conducted on multiple datasets with diverse classification models. The experimental results validate the effectiveness of the proposed method.
Keywords: active learning; labeling cost; instability; posterior probability; entropy
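
The selection rule described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact instability measure, so the code below assumes one plausible choice: the mean L1 distance between the posteriors predicted by consecutive historical models. The function names `instability_scores` and `select_most_unstable` and the toy `history` data are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Posterior = Sequence[float]  # class probabilities for one unlabeled example


def instability_scores(history: Sequence[Sequence[Posterior]]) -> List[float]:
    """Score each unlabeled example by how much its predicted posterior changes.

    history[t][i] is the posterior that the t-th historical model assigns to
    unlabeled example i. ASSUMPTION: instability is taken to be the mean L1
    distance between posteriors of consecutive models; the paper's actual
    measure may differ.
    """
    if len(history) < 2:
        raise ValueError("need predictions from at least two historical models")
    n_examples = len(history[0])
    scores = []
    for i in range(n_examples):
        total = 0.0
        # Compare each model's prediction with that of the model before it.
        for prev, curr in zip(history, history[1:]):
            total += sum(abs(p - q) for p, q in zip(prev[i], curr[i]))
        scores.append(total / (len(history) - 1))
    return scores


def select_most_unstable(history: Sequence[Sequence[Posterior]],
                         batch_size: int = 1) -> List[int]:
    """Return indices of the batch_size examples with the largest instability."""
    scores = instability_scores(history)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:batch_size]


# Toy data: three historical models, three unlabeled examples, two classes.
history = [
    [[0.9, 0.1], [0.6, 0.4], [0.5, 0.5]],
    [[0.9, 0.1], [0.3, 0.7], [0.5, 0.5]],
    [[0.8, 0.2], [0.7, 0.3], [0.5, 0.5]],
]
query_indices = select_most_unstable(history, batch_size=1)
```

In the toy data, the second example flips between classes across models, so it receives the largest score and is the one chosen for querying. The third example is maximally uncertain under every model yet perfectly stable, which shows how this criterion differs from selecting by the current model's uncertainty alone.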
|
|
|
|
|