引用本文: | 敖富江,杜静,陈彬,等.一种基于混合搜索的高效Top-K最频繁模式挖掘算法.[J].国防科技大学学报,2009,31(2):90-93.[点击复制] |
AO FuJiang,DU Jing,CHEN Bin,et al.An Efficient Mixed-searching-based Algorithm for Mining Top-K Most-frequent Patterns[J].Journal of National University of Defense Technology,2009,31(2):90-93[点击复制] |
|
|
|
本文已被:浏览 6598次 下载 5501次 |
一种基于混合搜索的高效Top-K最频繁模式挖掘算法 |
敖富江1, 杜静2, 陈彬1, 黄柯棣1 |
(1.国防科技大学 机电工程与自动化学院,湖南 长沙 410073;2.国防科技大学 计算机学院,湖南 长沙 410073)
|
摘要: |
挖掘数据集中的Top-K最频繁模式具有重要意义。已有Top-K最频繁模式挖掘算法通常采用最频繁的k个项目作为初始项目,并将初始项目中频率最低的项目的支持度作为初始边界支持度。但实际组成Top-K最频繁模式的项目数目可能远少于k,从而制约了算法的效率。为此,提出了一种基于混合搜索方式的高效Top-K最频繁模式挖掘算法MTKFP。该算法首先利用宽度优先搜索获得少量的短项集,并利用短项集确定数目少于k的初始项目范围以及较高的初始边界支持度;然后利用深度优先搜索获得所有Top-K最频繁模式。实验表明,MTKFP算法所获得的初始项目数目至少低于已有算法70%,初始边界支持度高于已有算法;MTKFP算法的性能优于已有最好算 法。 |
关键词: Top-K最频繁模式 边界支持度 混合搜索 FP-Tree |
DOI: |
投稿日期:2008-09-18 |
基金项目:国家自然科学基金资助项目(60573057;60704038) |
|
An Efficient Mixed-searching-based Algorithm for Mining Top-K Most-frequent Patterns |
AO FuJiang1, DU Jing2, CHEN Bin1, HUANG KeDi1 |
(1.College of Mechatronics Engineering and Automation, National Univ. of Defense Technology, Changsha 410073, China;2.College of Computer, National Univ. of Defense Technology, Changsha 410073, China)
|
Abstract: |
It is significant to mine Top-K most-frequent patterns in dataset. The existing algorithms usually use the k-most frequent items as the initial items, and use the support of item with lowest frequency in initial items as the initial border support. In fact, since the number of items in Top-K most-frequent patterns is much less than k, the efficiency of the existing algorithms is restricted. To solve this problem, an efficient mixed-searching based algorithm for mining Top-K most-frequent patterns, MTKFP is presented. The algorithm firstly mines some short item sets by breadth-first searching, and uses short item sets to obtain the scope of the initial items (the number of initial items is less than k) and the higher initial border support; then it obtains all Top-K most-frequent patterns by depth-first searching. The experimental results show that the number of initial items of MTKFP is 70% lower than that of existing algorithms, and the initial border support of MTKFP is higher than that of existing algorithms. Hence the performance of MTKFP is superior to that of the best existing algorithm. |
Keywords: Top-K most-frequent patterns border support mixed searching FP-Tree |
|
|
|
|
|