Research on a Learning-to-Rank Algorithm Based on User Relevance Feedback
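CAI Fei (蔡飞)
Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha, Hunan 410073, China, caifei@nudt.edu.cn
CHEN Honghui (陈洪辉)
Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha, Hunan 410073, China
SHU Zhen (舒振)
Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha, Hunan 410073, China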
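Abstract:

In information retrieval, a system must rank documents by their similarity to the user's query. This problem has attracted many researchers in information retrieval and machine learning and has produced numerous ranking models. However, existing models do not consider the homogeneity between query-document feature pairs and the user's relevance feedback. Building on machine learning algorithms, the proposed method extracts the principal features of the training samples to cluster them effectively, combines the clusters with user relevance feedback to obtain a confidence value for the relevance judgment in each cluster, and thereby forms a similarity decision model, which is then applied to rank test samples by relevance. The algorithm was tested on the LETOR dataset. The experiments show that its retrieval performance metrics improve on those of other ranking algorithms, and that it requires neither complex data preprocessing nor manual setting of algorithm parameters.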

Funding:

National Natural Science Foundation of China (61070216)

Learning to rank based on user relevance feedback
CAI Fei
Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, China, caifei@nudt.edu.cn
CHEN Honghui
Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, China
SHU Zhen
Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, China
Abstract:

Many information retrieval applications present their results as ranked lists, in which documents are sorted in descending order of relevance to a given query. This has drawn the interest of the information retrieval community to methods that automatically learn effective ranking models, and machine learning techniques have recently been applied to model construction as well. Most existing methods, however, do not take into account the significant homogeneity that exists between query-document pairs and the user's relevance feedback. This research proposes a novel method that clusters patterns in the training data together with their user-supplied relevance judgments, and then uses the discovered rules to rank documents at query time. A systematic evaluation of the proposed method on the LETOR benchmark dataset is presented. The experimental results show that the proposed method outperforms state-of-the-art methods without time-consuming and laborious pre-processing.
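
The pipeline outlined in the abstract (cluster the training pairs, attach a relevance confidence to each cluster from user feedback, then rank test pairs by that confidence) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation. The use of k-means with farthest-first initialization, the choice of the mean feedback label as the confidence value, and the names kmeans, fit, and rank are all assumptions made here, and the feature-extraction step is omitted.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def nearest(x, centroids):
    """Index of the centroid closest to feature vector x."""
    return min(range(len(centroids)), key=lambda i: dist(x, centroids[i]))


def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-first initialization (an assumption)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point farthest from all centroids chosen so far.
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid; keep the old one if its group is empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def fit(features, relevance, k=2):
    """Cluster query-document feature vectors and compute a per-cluster confidence.

    The confidence is the mean relevance label of the training pairs in the
    cluster (hypothetical choice; the abstract does not specify the formula).
    """
    centroids = kmeans(features, k)
    totals, counts = [0.0] * k, [0] * k
    for x, rel in zip(features, relevance):
        c = nearest(x, centroids)
        totals[c] += rel
        counts[c] += 1
    confidence = [t / n if n else 0.0 for t, n in zip(totals, counts)]
    return centroids, confidence


def rank(model, candidates):
    """Indices of candidates, sorted by descending confidence of their nearest cluster."""
    centroids, confidence = model
    scores = [confidence[nearest(x, centroids)] for x in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
```

Because every document in a cluster receives the same score here, ties are broken by input order. A finer-grained variant could, for example, weight the confidence by distance to the centroid, but that too would be an assumption beyond what the abstract states.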

