Geographical Research ›› 2008, Vol. 27 ›› Issue (3): 493-501. doi: 10.11821/yj2008030002

• Earth Information Science •

Feature selection for hyperspectral data based on the GA-SVM wrapper algorithm

卓 莉1, 郑 璟2, 王 芳1,3, 黎 夏1, 艾 彬1, 钱峻屏1   

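  1. School of Geography and Planning, Sun Yat-sen University, Guangzhou 510275, China;
    2. Guangdong Climate Center, Guangzhou 510080, China;
    3. School of Geographical Sciences, Guangzhou University, Guangzhou 510006, China
  • Received: 2007-09-03 Revised: 2008-02-17 Online: 2008-05-25 Published: 2008-05-25
  • About the author: ZHUO Li, female, Ph.D., lecturer. Her research focuses on remote sensing of resources and the environment and on geographic information systems. E-mail: zhuoli@mail.sysu.edu.cn *Corresponding author: ZHENG Jing. E-mail: jingzheng2005@gmail.com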
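  • Supported by:

    National Natural Science Foundation of China (40601010); China Postdoctoral Science Foundation (20060390208); "985 Project" science and technology innovation platform for geoscience applications of GIS and remote sensing (105203200400006); National Science Fund for Distinguished Young Scholars (40525002)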

A genetic algorithm based wrapper feature selection method for classification of hyperspectral data using support vector machine

ZHUO Li1, ZHENG Jing2, WANG Fang1,3, LI Xia1, AI Bin1, QIAN Jun-ping1   

  1. School of Geography and Planning, Sun Yat-sen University, Guangzhou 510275, China;
    2. Guangdong Climate Center, Guangzhou 510080, China;
    3. School of Geographical Sciences, Guangzhou University, Guangzhou 510006, China
  • Received: 2007-09-03 Revised: 2008-02-17 Online: 2008-05-25 Published: 2008-05-25

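Summary:

Compared with filter methods, wrapper feature selection methods do more to improve classification accuracy, and with the rapid development of computing technology and efficiency they are likely to become the prevailing approach. This paper builds a wrapper feature selection algorithm, GA-SVM, that uses a support vector machine (SVM) as the classifier and a genetic algorithm (GA) as the search algorithm over feature subsets. The algorithm was implemented in the ENVI/IDL language and applied to HYPERION hyperspectral data as an example. The results show that GA-SVM selected 13 bands out of 196, while classification accuracy improved by about 4% compared with no feature selection. GA-SVM therefore performs well at optimizing the feature subset and the SVM kernel function at the same time, and offers a sound algorithm for feature selection on hyperspectral data.

Keywords: feature selection, hyperspectral, genetic algorithm (GA), support vector machine (SVM)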

Abstract:

The high-dimensional feature vectors of hyperspectral data often impose a high computational cost as well as a risk of overfitting when classification is performed. It is therefore necessary to reduce the dimensionality, for example through feature selection. Currently, there are two kinds of feature selection methods: filter methods and wrapper methods. Filter methods require no feedback from the classifier and estimate classification performance indirectly. Wrapper methods evaluate the "goodness" of a selected feature subset directly from the classification accuracy. Many experimental results have shown that wrapper methods can yield better performance, although they have the disadvantage of high computational cost. In this paper, we present a Genetic Algorithm (GA) based wrapper method for classification of hyperspectral data using the Support Vector Machine (SVM), a state-of-the-art classifier that has been found to be successful in a variety of areas. The GA, which seeks to solve optimization problems using the mechanisms of evolution, specifically survival of the fittest, was used to optimize both the feature subset (i.e. the band subset) of the hyperspectral data and the SVM kernel parameters simultaneously. When the feature subset part of the chromosome was designed, a special strategy was adopted to reduce the computation cost caused by the high-dimensional feature vectors of hyperspectral data. The GA-SVM method was implemented in the ENVI/IDL language and then tested on a HYPERION hyperspectral image. Comparison of the optimized and un-optimized results showed that the GA-SVM method could significantly reduce the computation cost while improving the classification accuracy. The number of bands used for classification was reduced from 198 to 13, while the classification accuracy increased from 88.81% to 92.51%. The optimized values of the two SVM kernel parameters were 95.0297 and 0.2021, respectively, which differ from the default values used in the ENVI software. In conclusion, the proposed wrapper feature selection method GA-SVM can optimize the feature subset and the SVM kernel parameters at the same time, and can therefore be applied to feature selection for hyperspectral data.

Key words: feature selection, hyperspectral, genetic algorithm, support vector machine
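
The wrapper idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' ENVI/IDL implementation, which is not shown on this page. Several things here are assumptions made only for illustration: the data are synthetic, a simple RBF-similarity classifier stands in for the SVM so that the sketch needs no third-party library, a single width parameter `gamma` stands in for the two SVM kernel parameters, and all function names (`make_data`, `accuracy`, `ga_select`, and so on) are hypothetical. The paper's special chromosome strategy for reducing computation is not reproduced. What the sketch does keep is the structure: each chromosome holds one bit per band plus a classifier parameter, and its fitness is the classification accuracy obtained with that band subset.

```python
import math
import random

def make_data(rng, n_per_class=30, n_bands=12, informative=(2, 7)):
    """Synthetic two-class 'spectra'; only the informative bands differ by class."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_bands)]
            for b in informative:
                row[b] += 3.0 * label
            X.append(row)
            y.append(label)
    return X, y

def rbf_score(x, refs, bands, gamma):
    """Mean RBF similarity between x and reference samples on the chosen bands."""
    total = 0.0
    for r in refs:
        d2 = sum((x[b] - r[b]) ** 2 for b in bands)
        total += math.exp(-gamma * d2)
    return total / len(refs)

def accuracy(chrom, train, test, n_bands):
    """Wrapper fitness: hold-out accuracy of the stand-in classifier (not an SVM)."""
    bands = [b for b in range(n_bands) if chrom[b] == 1]
    if not bands:
        return 0.0  # an empty band subset cannot classify anything
    gamma = chrom[n_bands]  # last gene: the classifier parameter
    by_class = {0: [], 1: []}
    for x, label in zip(*train):
        by_class[label].append(x)
    hits = 0
    for x, label in zip(*test):
        pred = max(by_class, key=lambda c: rbf_score(x, by_class[c], bands, gamma))
        hits += pred == label
    return hits / len(test[1])

def random_chrom(rng, n_bands):
    """Chromosome layout: n_bands bits (band mask) followed by one real gene (gamma)."""
    return [rng.randint(0, 1) for _ in range(n_bands)] + [rng.uniform(0.01, 2.0)]

def crossover(rng, a, b):
    """Single-point crossover over the whole chromosome."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(rng, chrom, n_bands, rate=0.1):
    """Flip band bits and perturb gamma, each with probability `rate`."""
    child = list(chrom)
    for i in range(n_bands):
        if rng.random() < rate:
            child[i] = 1 - child[i]
    if rng.random() < rate:
        scaled = child[n_bands] * math.exp(rng.gauss(0.0, 0.3))
        child[n_bands] = min(2.0, max(0.01, scaled))
    return child

def ga_select(train, test, n_bands, pop_size=20, generations=10, seed=0):
    """Elitist GA: keep the better half, refill with mutated crossovers."""
    rng = random.Random(seed)
    fit = lambda c: accuracy(c, train, test, n_bands)
    pop = [[1] * n_bands + [1.0]]  # all-bands baseline is part of the start population
    pop += [random_chrom(rng, n_bands) for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=fit, reverse=True)
        elite = pop[: pop_size // 2]
        children = [
            mutate(rng, crossover(rng, rng.choice(elite), rng.choice(elite)), n_bands)
            for _ in range(pop_size - len(elite))
        ]
        pop = elite + children
    best = max(pop, key=fit)
    return best, fit(best)
```

Calling `ga_select(make_data(random.Random(1), 20), make_data(random.Random(2), 10), 12)` returns the best chromosome and its accuracy. Because the all-bands chromosome is in the starting population and the best individuals are always carried over, the result is never worse than using every band. Note that this sketch scores chromosomes on the same hold-out set it reports, which is optimistic; a real experiment would probably use cross-validation or a separate validation set.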