Cite this article: SUN Yang, FENG Xiaosheng, ZHOU Cheng, et al. An Efficient Data Transformation Technique for Mixed Data Visualization[J]. Journal of National University of Defense Technology, 2010, 32(3): 82-88.
An Efficient Data Transformation Technique for Mixed Data Visualization
SUN Yang, FENG Xiaosheng, ZHOU Cheng, TANG Daquan, XIAO Weidong
(College of Information System and Management, National University of Defense Technology, Changsha 410073, Hunan, China)

Submitted: 2009-10-26
Funding: National Natural Science Foundation of China (60903225); Innovation Fund for Outstanding Graduate Students of National University of Defense Technology (B080503)

Abstract:
Application domains contain many mixed data sets whose attributes are of several data types. However, many effective multivariate visualization methods apply to a single data type only, and their results on mixed data sets are often unsatisfactory. For multivariate mixed data sets containing both numerical and categorical attributes, we present a data transformation technique for mixed data visualization. First, each numerical attribute is categorized by clustering; then, all categorical attributes are quantified by correspondence analysis; finally, the transformed data set is displayed with a classical numerical visualization method, Star Coordinates. Furthermore, for mixed data sets with many attributes or with high-cardinality categorical attributes, a set of cardinality reduction strategies is proposed for the data transformation step; these reduce the number of variables involved in the computation and improve computational efficiency. Experiments show that the resulting visualizations of mixed data sets are easy to understand and help users discover implicit knowledge in the data, and that the cardinality reduction strategies markedly improve memory and time efficiency.
Keywords: mixed data visualization; cardinality reduction; correspondence analysis; clustering; data transformation technique; star coordinates
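The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the clustering algorithm, the exact correspondence analysis formulation, or the cardinality reduction strategies. The sketch assumes 1-D k-means for categorization, correspondence analysis on the indicator matrix of all attributes with only the first axis kept as the numeric score, and evenly spaced Star Coordinates axes. All function names (`kmeans_1d`, `ca_quantify`, `star_coordinates`, `transform_mixed`) and the sample data are hypothetical, and cardinality reduction is omitted.

```python
import numpy as np

def kmeans_1d(values, k, iters=50):
    """Categorize one numerical attribute with 1-D k-means (assumed clustering method)."""
    values = np.asarray(values, dtype=float)
    # Deterministic initialization: centers at evenly spaced quantiles.
    centers = np.quantile(values, (np.arange(k) + 0.5) / k)
    for _ in range(iters):
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        new_centers = np.array([
            values[labels == j].mean() if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def ca_quantify(cat_columns):
    """Quantify categorical attributes by correspondence analysis on the
    indicator matrix; each category gets its first-axis standard coordinate."""
    blocks, codes_per_col = [], []
    for col in cat_columns:
        uniq, codes = np.unique(np.asarray(col), return_inverse=True)
        blocks.append(np.eye(len(uniq))[codes])   # one-hot block for this attribute
        codes_per_col.append(codes)
    Z = np.hstack(blocks)
    P = Z / Z.sum()                               # correspondence matrix
    r, c = P.sum(axis=1), P.sum(axis=0)           # row and column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))  # standardized residuals
    _, _, Vt = np.linalg.svd(S, full_matrices=False)
    scores = Vt[0] / np.sqrt(c)                   # one score per category
    out, start = [], 0
    for block, codes in zip(blocks, codes_per_col):
        width = block.shape[1]
        out.append(scores[start:start + width][codes])
        start += width
    return np.column_stack(out)

def star_coordinates(X):
    """Project rows of X to 2-D: each attribute is an axis at an evenly spaced angle."""
    X = np.asarray(X, dtype=float)
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0                         # avoid division by zero
    U = (X - X.min(axis=0)) / span                # scale each attribute to [0, 1]
    angles = 2 * np.pi * np.arange(X.shape[1]) / X.shape[1]
    axes = np.column_stack([np.cos(angles), np.sin(angles)])
    return U @ axes

def transform_mixed(numeric_cols, categorical_cols, k=3):
    """Categorize numerical attributes, quantify everything, then project."""
    categorized = [kmeans_1d(col, k) for col in numeric_cols]
    quantified = ca_quantify(categorized + list(categorical_cols))
    return star_coordinates(quantified)

# Hypothetical sample data: two numerical attributes and one categorical attribute.
heights = [150, 152, 155, 170, 172, 175, 190, 192, 195]
weights = [50, 52, 80, 55, 82, 85, 51, 83, 84]
colors = ["r", "r", "g", "g", "g", "g", "b", "b", "r"]
points = transform_mixed([heights, weights], [colors], k=3)
print(points.shape)
```

Each row of `points` is the 2-D position of one record; records that share every category after the clustering step land on the same point. Keeping only the first correspondence analysis axis is a simplification chosen for brevity, and its sign is arbitrary.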