Cite this article: | HE Ling, WU Lingda, CAI Yichao, et al. Research on Similarity Measurement in Multimedia Data Mining[J]. Journal of National University of Defense Technology, 2006, 28(1): 77-80. |
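Research on Similarity Measurement Between Data in Multimedia Data Mining (多媒体数据挖掘中数据间的相似性度量研究) |
HE Ling (贺玲), WU Lingda (吴玲达), CAI Yichao (蔡益朝), XIE Yuxiang (谢毓湘), LEI Zhen (雷震) |
(College of Information System and Management, National University of Defense Technology, Changsha 410073, Hunan, China)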
|
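Abstract: |
Clustering is one of the important tasks in multimedia data mining, and similarity measurement between data is the basis and prerequisite of clustering. The feature vectors of multimedia data usually have tens or even hundreds of dimensions, but traditional similarity measures are generally suitable only for low-dimensional data. Based on an analysis of the characteristics of high-dimensional data, a new measure is proposed. By partitioning the original data space into a grid with a specific strategy, the method largely avoids the influence of noisy data on similarity measurement for high-dimensional data. Experiments show that the method is effective. |
Keywords: multimedia data mining; curse of dimensionality; similarity measurement |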
DOI: |
Submission date: 2005-10-20 |
Funding: Supported by the National Natural Science Foundation of China (60473117) |
|
Research on Similarity Measurement in Multimedia Data Mining |
HE Ling, WU Lingda, CAI Yichao, XIE Yuxiang, LEI Zhen |
(College of Information System and Management, National Univ. of Defense Technology, Changsha 410073, China)
|
Abstract: |
Clustering is one of the key problems in multimedia data mining, and similarity measurement among data is fundamental to clustering. In multimedia data clustering, the feature vectors are typically high-dimensional. Most traditional similarity measures, however, are effective only for low-dimensional data. Based on an analysis of the general characteristics of data in high-dimensional spaces, this paper proposes a new similarity measure for multimedia data mining. It uses a special strategy to split the original data space into a grid before computing the similarity among data points, thereby largely avoiding the influence of noisy data in high-dimensional spaces. Experiments show that the new method is effective. |
Keywords: multimedia data mining; curse of dimensionality; similarity measurement |
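
The abstract does not state the formula of the proposed measure, so the following is only a minimal sketch of what a grid-based similarity could look like, not the authors' method. It assumes equal-width cells per dimension and a per-dimension score that counts only dimensions in which both points fall in the same cell; the names `grid_similarity`, `lows`, `highs`, and `bins` are hypothetical.

```python
def grid_similarity(x, y, lows, highs, bins):
    """Illustrative grid-based similarity in [0, 1] (an assumption, not the paper's formula).

    Each dimension i is split into `bins` equal-width cells over [lows[i], highs[i]].
    A dimension contributes only when x[i] and y[i] share a cell, so dimensions
    where the points are far apart (possibly due to noise) are ignored.
    """
    if not (len(x) == len(y) == len(lows) == len(highs)):
        raise ValueError("all inputs must have the same number of dimensions")
    if bins < 1:
        raise ValueError("bins must be at least 1")

    total = 0.0
    for xi, yi, lo, hi in zip(x, y, lows, highs):
        width = (hi - lo) / bins
        if width == 0:
            # Degenerate dimension: every value lies in the same single cell.
            total += 1.0
            continue
        # Cell index of each coordinate; the upper bound is clamped into the last cell.
        cx = min(int((xi - lo) / width), bins - 1)
        cy = min(int((yi - lo) / width), bins - 1)
        if cx == cy:
            # Same cell: score closeness relative to the cell width.
            total += 1.0 - abs(xi - yi) / width
    return total / len(x)


if __name__ == "__main__":
    a = [0.10, 0.15, 0.90]
    b = [0.12, 0.85, 0.95]
    print(grid_similarity(a, b, [0, 0, 0], [1, 1, 1], bins=4))
```

In this sketch the second dimension, where the two points lie in different cells, contributes nothing to the score, which is one way a grid partition might limit the effect of noisy dimensions. The choice of equal-width cells and of the number of cells is a free design decision here; the paper's "specific strategy" for partitioning may differ.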