GEOGRAPHICAL RESEARCH, 2018, Vol. 37, Issue (3): 635-646. doi: 10.11821/dlyj201803014


Selection of environmental variables and their scales in multiple soil properties mapping: A case study in Heilongjiang Heshan Farm

Jingjing SHI1,2, Lin YANG1,3, Canying ZENG4, Axing ZHU1,4,5,6, Chengzhi QIN1,2, Peng LIANG1,2

    1. State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, CAS, Beijing 100101, China
    2. University of Chinese Academy of Sciences, Beijing 100049, China
    3. School of Geographic and Oceanographic Sciences, Nanjing University, Nanjing 210023, China
    4. School of Geography, Nanjing Normal University, Nanjing 210023, China
    5. Key Laboratory of Virtual Geographic Environment, Nanjing Normal University, Ministry of Education; State Key Laboratory Cultivation Base of Geographical Environment Evolution, Jiangsu Province; Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing 210023, China
    6. Department of Geography, University of Wisconsin-Madison, Madison, WI 53706, USA
  • Received: 2017-09-26 Revised: 2017-12-19 Online: 2018-03-15 Published: 2018-04-25
  • About author:

    Author: Shi Zhenqin (1988-), PhD, specialized in regional development and land space management in mountain areas. E-mail:

    *Corresponding author: Deng Wei (1957-), Professor, specialized in mountain environment and regional development.



Identifying the environmental variables relevant to different soil properties, and the scales at which they act, helps explain how soil properties form and develop, and is necessary for mapping and sampling multiple soil properties. This study explored the relevant environmental variables and their scales for multiple soil properties, and examined how different environmental variables and scales affect the prediction of each property. The study area is Heshan Farm, and the target soil properties are topsoil clay content, sand content, silt content, topsoil organic matter content (SOM), and soil depth. One hundred and seventy-three multi-scale terrain variables were generated by varying the neighborhood size used in their calculation. The single-scale and multi-scale variables were ranked by variable importance calculated with Random Forest. Subsets 1 and 2 were selected from the single-scale and multi-scale variables, respectively, based on variable importance and with multi-collinearity eliminated. Subset 3, selected based on expert knowledge, served as a reference. Subset 1 had little in common with subset 3, which indicates that environmental variables selected from expert knowledge may not be the most important ones for the soil properties. Subset 2 overlapped strongly with subset 3, although the scales differed among environmental variables and soil properties. The relevant variables and scales for sand and silt content were similar to each other but quite different from those for clay content, and SOM and soil depth had similar relevant variables. Mapping results based on the three subsets showed that subset 1 gave more accurate predictions than subset 3 for all soil properties except sand content, with mean RMSE improved by 1.8% to 13.1%. Subset 2 was more accurate than both subsets 1 and 3 for all five soil properties, with mean RMSE improved by 8.7% to 16.5% and 7.8% to 21.3%, respectively. These results suggest that using the reference variables at appropriate scales matters more for mapping than using top-ranked single-scale variables.
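
As an illustration only, the selection step described in the abstract (rank variables by importance, then drop collinear ones) and the RMSE comparison could be sketched as follows. The abstract does not state how multi-collinearity was eliminated, so the greedy correlation filter, the 0.8 threshold, the variable names, and the importance scores below are all assumptions made for this sketch. In practice the importance scores would come from a fitted Random Forest rather than being typed in by hand.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def select_variables(importance, values, max_abs_r=0.8):
    """Walk variables from most to least important and keep a variable only
    if its absolute correlation with every already-kept variable is below
    max_abs_r. The threshold is an assumed value, not taken from the paper."""
    ranked = sorted(importance, key=importance.get, reverse=True)
    selected = []
    for name in ranked:
        if all(abs(pearson(values[name], values[kept])) < max_abs_r
               for kept in selected):
            selected.append(name)
    return selected


def rmse_improvement(rmse_reference, rmse_new):
    """Relative RMSE improvement of a new variable subset over a reference
    subset, in percent (positive means the new subset is more accurate)."""
    return 100.0 * (rmse_reference - rmse_new) / rmse_reference


# Hypothetical terrain variables sampled at five locations; the suffix stands
# for the neighborhood size used to compute each variable.
values = {
    "slope_3x3": [1.0, 2.0, 3.0, 4.0, 5.0],
    "slope_9x9": [1.1, 2.1, 2.9, 4.2, 5.0],   # nearly collinear with slope_3x3
    "twi_15x15": [5.0, 1.0, 4.0, 2.0, 3.0],
}
# Hypothetical importance scores standing in for Random Forest output.
importance = {"slope_9x9": 0.40, "slope_3x3": 0.35, "twi_15x15": 0.25}

selected = select_variables(importance, values)
print(selected)
print(rmse_improvement(10.0, 9.0))
```

With these toy inputs the filter keeps the higher-ranked of the two slope variables and discards the other as redundant, which is the intended behavior of a collinearity filter applied after importance ranking.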

Key words: soil property mapping, environmental variables, random forest, multi-scale