Cite this article: | RU Chengsen, TANG Jintao, XIE Songxian, et al. Reducing wrong labels in distant supervision for relation extraction[J]. Journal of National University of Defense Technology, 2018, 40(3): 148-152. |
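Reducing wrong labels in distant supervision for relation extraction |
RU Chengsen, TANG Jintao, XIE Songxian, LI Shasha, WANG Ting |
(College of Computer, National University of Defense Technology, Changsha, Hunan 410073, China)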
|
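Abstract: |
Distant supervision is now widely used for relation extraction. However, it produces a large number of wrongly labeled instances, which substantially degrade what the model learns. This paper proposes a method for removing wrong labels that uses semantic Jaccard to measure the semantic similarity between relation phrases and dependency words. The training data with wrong labels removed is then used to train the model that performs relation extraction. Experimental results show that the method effectively removes wrong labels and improves relation extraction performance. |
Keywords: relation extraction; distant supervision; wrong labels; semantic similarity |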
DOI:10.11887/j.cn.201803023 |
Received: 2016-11-25 |
Funding: National Natural Science Foundation of China (61472436, 61532001, 61303190) |
|
Reducing wrong labels in distant supervision for relation extraction |
RU Chengsen, TANG Jintao, XIE Songxian, LI Shasha, WANG Ting |
(College of Computer, National University of Defense Technology, Changsha 410073, China)
|
Abstract: |
Distant supervision has been widely used for relation extraction in recent years. In distant supervision, however, many labels may be wrongly assigned, which harms relation extraction. A method to reduce wrong labels was introduced, using the semantic Jaccard to measure the semantic similarity between relation phrases and dependency terms. The training data obtained after removing wrong labels was then used to train the relation extractors. The experimental results show that the proposed method can effectively reduce wrong labels and improve relation extraction performance compared with state-of-the-art methods. |
Keywords: relation extraction; distant supervision; wrong labels; semantic similarity |
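The abstract names a semantic Jaccard measure between relation phrases and dependency terms but does not define it. The sketch below shows one plausible reading, offered as an assumption and not as the authors' formulation: a Jaccard coefficient in which two words also count as shared when the cosine similarity of their word vectors reaches a threshold. The function names, the threshold values, and the toy vectors are all hypothetical.

```python
import math

# Toy 2-d word vectors, invented for illustration only (assumption).
TOY_VECTORS = {
    "born": [1.0, 0.0],
    "birth": [0.9, 0.1],
    "visited": [0.0, 1.0],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def semantic_jaccard(words_a, words_b, vectors, threshold=0.8):
    """Soft Jaccard (assumed definition): matched pairs / (|A| + |B| - matched pairs).

    Identical words always match. Remaining words are paired greedily,
    one-to-one, when their vector cosine similarity is >= threshold.
    """
    set_a, set_b = set(words_a), set(words_b)
    if not set_a and not set_b:
        return 0.0
    exact = set_a & set_b
    matches = len(exact)
    unused_b = sorted(set_b - exact)  # sorted for deterministic pairing
    for a in sorted(set_a - exact):
        for b in unused_b:
            if (a in vectors and b in vectors
                    and cosine(vectors[a], vectors[b]) >= threshold):
                matches += 1
                unused_b.remove(b)  # each word is used in at most one pair
                break
    return matches / (len(set_a) + len(set_b) - matches)


def keep_instance(relation_words, dependency_words, vectors, min_score=0.5):
    """Keep a distantly labeled sentence only if its dependency words
    are semantically close enough to the relation phrase (hypothetical rule)."""
    return semantic_jaccard(relation_words, dependency_words, vectors) >= min_score
```

With the toy vectors, `keep_instance(["born", "in"], ["birth", "in"], TOY_VECTORS)` keeps the sentence, while `keep_instance(["born", "in"], ["visited"], TOY_VECTORS)` discards it as a likely wrong label. The greedy pairing is a simplification; an optimal one-to-one matching could give a slightly higher score in some cases.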
|
|
|
|
|