Geographical Research (地理研究) ›› 2012, Vol. 31 ›› Issue (8): 1411-1421. doi: 10.11821/yj2012080006

• Economy and Regional Development •

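Dimension-reduction methods for spatial multidimensional economic statistical data: the case of economic statistical data for Sichuan Province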

董承玮1,2, 芮小平1,3, 邓羽4,5,6, 关兴良4,5   

  1. 中国科学院研究生院资源与环境学院, 北京 100049;
    2. 北京市测绘设计研究院, 北京 100038;
    3. 中国科学院生态环境研究中心, 北京 100085;
    4. 中国科学院地理科学与资源研究所, 北京 100101;
    5. 中国科学院研究生院, 北京 100039;
    6. 哈佛大学, 美国坎布里奇 02138
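  • Received: 2011-07-15 Revised: 2012-02-25 Publication date: 2012-08-20 Release date: 2012-08-20
  • Corresponding author: RUI Xiao-ping (芮小平, 1975- ), male, from Suzhou, Jiangsu; associate professor, working mainly on the theory and applications of geographic information systems. E-mail: ruixp@gucas.ac.cn
  • About the author: DONG Cheng-wei (董承玮, 1984- ), male, from Hengyang, Hunan; master's degree; research interests: 3D GIS and multidimensional spatial information visualization. E-mail: dongchengwei08@mails.gucas.ac.cn
  • Supported by:

    National Natural Science Foundation of China (40901191)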

Study on dimension-reduction of spatial economic statistics: A case study of economic statistical data of Sichuan

DONG Cheng-wei1,2, RUI Xiao-ping1,3, DENG Yu4,5,6, GUAN Xing-liang4,5   

  1. College of Resources and Environment, Graduate University of Chinese Academy of Sciences, Beijing 100049, China;
    2. Beijing Institute of Surveying and Mapping, Beijing 100038, China;
    3. Research Center for Eco-Environmental Sciences, CAS, Beijing 100085, China;
    4. Institute of Geographic Sciences and Natural Resources Research, CAS, Beijing 100101, China;
    5. Graduate University of Chinese Academy of Sciences, Beijing 100049, China;
    6. Harvard University, Cambridge 02138, USA
  • Received: 2011-07-15 Revised: 2012-02-25 Online: 2012-08-20 Published: 2012-08-20

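Abstract (from the Chinese): Economic statistical information often carries multidimensional attributes, and dimension-reduction methods are needed to map it into a space of three or fewer dimensions so that it can be visualized, which helps in studying its inherent spatial distribution patterns. Building on an evaluation of four dimension-reduction methods, namely a linear method (PCA), two nonlinear methods (NLM and SOFM), and a supervised classification method (SVM), this study takes the districts and counties of Sichuan Province in 2007 as research units, applies the different methods to cluster (classify) them by their current level of socio-economic development, and discusses the differences among the results in depth. The main conclusions are as follows. PCA reveals the overall trend of economic development, but its results differ considerably from the actual situation. NLM clearly shows the regional pattern and core areas of Sichuan's economic development and accurately reflects its current state. The SOFM classification agrees fairly well with the current state of development, but some local areas are misclassified, and targets within a class cannot be compared. SVM is a supervised classifier that needs known samples to train the classification process; sample selection is highly subjective, and the search for optimal parameters is relatively complex. The comparison of these dimension-reduction methods and their application to economic statistics can provide a reference for related research on dimension reduction of spatial multidimensional information.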

Keywords (from the Chinese): dimension reduction, multidimensional visualization, economic statistical data, Sichuan

Abstract: Economic statistical data generally have more than three attributes. To study the inherent structural characteristics of such data, such as clustering and distribution, researchers need to reduce the multidimensional information to a space of three or fewer dimensions so that it can be visualized. Many dimension-reduction methods exist; because they rest on different mathematical theories and have different ranges of application, their results and the resulting visualizations differ. Evaluating the methods can therefore provide an important reference for choosing among them in different fields. In this paper, the authors analyze county-level economic statistical data of Sichuan Province for 2007 with four commonly used algorithms: the linear method PCA, the nonlinear methods NLM and SOFM, and the supervised classification method SVM, and obtain a series of classification results. Considering the status of economic development in Sichuan, the authors analyze the differences among the results of these methods and draw the following conclusions. Although PCA can reveal the overall development trend, its result is not consistent with the real conditions in Sichuan. NLM shows the regional trend and core areas of economic development in Sichuan well and reflects the current state of development. SOFM also shows the state of development, but there are several classification errors in the northeastern part of the region, and targets within a cluster cannot be compared. As a supervised method, SVM needs a known sample set to train the classification process, which makes sample selection subjective, and the search for optimal parameters is complicated. The comparison of these methods and their application to economic statistics can provide a reference for future research on spatial dimension reduction.

Key words: dimension-reduction, multi-dimensional visualization, economic statistics data, Sichuan
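
As a rough illustration of the linear method named in the abstract, the sketch below projects a small indicator table onto its first two principal components. It is not the authors' implementation: the function names (`standardize`, `covariance`, `top_eigenvector`, `pca_project`), the choice of power iteration with deflation, and the five rows of indicator values are all assumptions made for illustration. The values are invented and are not Sichuan data.

```python
import math


def standardize(rows):
    """Scale each column to zero mean and unit (population) variance."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
            for j in range(m)]
    return [[(r[j] - means[j]) / stds[j] for j in range(m)] for r in rows]


def covariance(rows):
    """Covariance matrix of rows whose columns are already centered."""
    n, m = len(rows), len(rows[0])
    return [[sum(r[a] * r[b] for r in rows) / n for b in range(m)]
            for a in range(m)]


def top_eigenvector(mat, iters=500):
    """Power iteration for the dominant eigenpair of a symmetric PSD matrix."""
    m = len(mat)
    # Asymmetric start vector; assumed not orthogonal to the dominant direction.
    v = [float(j + 1) for j in range(m)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [sum(mat[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # numerically no variance left to extract
            break
        v = [x / norm for x in w]
    mv = [sum(mat[a][b] * v[b] for b in range(m)) for a in range(m)]
    return v, sum(v[a] * mv[a] for a in range(m))  # Rayleigh quotient


def pca_project(rows, k=2):
    """Return each row's scores on the first k principal components."""
    z = standardize(rows)
    m = len(z[0])
    cov = covariance(z)
    components = []
    for _ in range(k):
        vec, val = top_eigenvector(cov)
        components.append(vec)
        # Deflation: remove the component just found before the next pass.
        cov = [[cov[a][b] - val * vec[a] * vec[b] for b in range(m)]
               for a in range(m)]
    return [[sum(r[j] * c[j] for j in range(m)) for c in components]
            for r in z]


# Hypothetical table: 5 spatial units x 4 economic indicators (invented values).
indicators = [
    [12.0, 3.1, 45.0, 0.8],
    [30.5, 7.9, 60.2, 1.9],
    [8.2, 2.0, 38.5, 0.5],
    [55.0, 15.3, 82.0, 3.6],
    [20.1, 5.5, 51.3, 1.2],
]
coords = pca_project(indicators, k=2)
for unit, (x, y) in enumerate(coords):
    print(f"unit {unit}: PC1={x:.3f}, PC2={y:.3f}")
```

Each unit ends up with two coordinates that could be plotted or mapped. The nonlinear and supervised methods compared in the abstract (NLM, SOFM, SVM) would need different code and are not covered by this sketch.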