Cite this article: | ZHANG Jianbo, XIA Dengcheng, ZHAO Jiaao, et al. Storage strategy of raster data under the distributed computing environment[J]. Journal of National University of Defense Technology, 2017, 39(6): 51-58. |
|
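Storage strategy of raster data under the distributed computing environment |
ZHANG Jianbo, XIA Dengcheng, ZHAO Jiaao, LI Xieqing, CUI Yongjian, YUAN Guobin |
(Faculty of Information Engineering, China University of Geosciences, Wuhan, Hubei 430074)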
|
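Abstract: |
Traditional raster data storage strategies cannot satisfy the coarse-grained data access required in a distributed computing environment and are inefficient for computations over massive raster data. To address this, a raster tile storage strategy based on the Hadoop Distributed File System is proposed. It combines the storage characteristics of distributed file systems with the fact that map algebra operators compute in units of raster tiles during the Map/Reduce stages. The strategy is implemented through raster data tile partitioning, the organization and storage of compressed tile data, and improved distributed file input/output interfaces, and it is verified with a distributed computing workflow for local map algebra operators built on the strategy. Theoretical analysis and experimental results show that the strategy significantly improves the computation speed of spatial analysis operators in a distributed computing environment. |
Keywords: distributed computing; raster data; storage strategy; map algebra |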
DOI:10.11887/j.cn.201706009 |
Received: 2016-07-13 |
Funding: National Natural Science Foundation of China (41001225, 41501584) |
|
Storage strategy of raster data under the distributed computing environment |
ZHANG Jianbo, XIA Dengcheng, ZHAO Jiaao, LI Xieqing, CUI Yongjian, YUAN Guobin |
(Faculty of Information Engineering, China University of Geosciences, Wuhan 430074, China)
|
Abstract: |
Traditional raster data storage strategies cannot meet the demands of coarse-grained data processing in a distributed computing environment and are inefficient when computing over massive raster data. A raster tile storage strategy is presented that builds on the storage characteristics of distributed file systems and takes into account that the spatial analysis operators of map algebra use the raster tile as the processing unit during the Map and Reduce stages. The strategy was implemented in three steps. First, raster data were divided into raster tiles. Then the tiles were compressed and organized in a specific sequence for transfer to the Hadoop Distributed File System. Finally, the input and output file interfaces were re-implemented to meet the data access requirements of the Map and Reduce stages. The strategy was tested and verified with the distributed calculation process of local map algebra operators. Theoretical analysis and experimental results show that the strategy can significantly improve the processing speed of spatial analysis operators. |
Keywords: distributed computing; raster data; storage strategy; map algebra |
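The tile-based workflow summarized in the abstract (split the raster into tiles, compress each tile, then apply a local map algebra operator tile by tile) can be illustrated with a small single-machine sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the `(tile_row, tile_col)` key scheme, the zlib codec, the tile header layout, and the use of an in-memory dictionary in place of files on the Hadoop Distributed File System. The per-tile loop stands in for what a Map task would do.

```python
import struct
import zlib


def split_into_tiles(raster, tile_size):
    """Split a 2-D raster (list of equal-length rows) into tiles.

    Tiles are keyed by (tile_row, tile_col); edge tiles may be smaller.
    The key scheme is a hypothetical choice, not the paper's ordering.
    """
    rows, cols = len(raster), len(raster[0])
    tiles = {}
    for r0 in range(0, rows, tile_size):
        for c0 in range(0, cols, tile_size):
            tile = [row[c0:c0 + tile_size] for row in raster[r0:r0 + tile_size]]
            tiles[(r0 // tile_size, c0 // tile_size)] = tile
    return tiles


def compress_tile(tile):
    """Pack a tile as height, width, and int32 cells, then zlib-compress it."""
    height, width = len(tile), len(tile[0])
    flat = [value for row in tile for value in row]
    payload = struct.pack(f"<II{height * width}i", height, width, *flat)
    return zlib.compress(payload)


def decompress_tile(blob):
    """Inverse of compress_tile: return the tile as a list of rows."""
    payload = zlib.decompress(blob)
    height, width = struct.unpack_from("<II", payload, 0)
    flat = struct.unpack_from(f"<{height * width}i", payload, 8)
    return [list(flat[r * width:(r + 1) * width]) for r in range(height)]


def local_add(blob_a, blob_b):
    """Local (cell-by-cell) operator on two compressed tiles of equal shape."""
    tile_a, tile_b = decompress_tile(blob_a), decompress_tile(blob_b)
    result = [[x + y for x, y in zip(row_a, row_b)]
              for row_a, row_b in zip(tile_a, tile_b)]
    return compress_tile(result)


def merge_tiles(blobs, tile_size, rows, cols):
    """Reassemble a full raster from compressed tiles keyed by tile index."""
    raster = [[0] * cols for _ in range(rows)]
    for (tile_row, tile_col), blob in blobs.items():
        tile = decompress_tile(blob)
        for i, row in enumerate(tile):
            start = tile_col * tile_size
            raster[tile_row * tile_size + i][start:start + len(row)] = row
    return raster


def tiled_local_add(raster_a, raster_b, tile_size):
    """Run the whole sketch: tile, compress, operate per tile, merge."""
    store_a = {k: compress_tile(t)
               for k, t in split_into_tiles(raster_a, tile_size).items()}
    store_b = {k: compress_tile(t)
               for k, t in split_into_tiles(raster_b, tile_size).items()}
    # Each key is processed independently, as a Map task would process a tile.
    out = {k: local_add(store_a[k], store_b[k]) for k in store_a}
    return merge_tiles(out, tile_size, len(raster_a), len(raster_a[0]))
```

Because a local operator needs only the cells at the same position in each input, every tile can be processed without reading its neighbors, which is what makes the tile a natural unit of coarse-grained access in this kind of design.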
|
|