Selection of environmental variables and their scales in multiple soil properties mapping: A case study in Heilongjiang Heshan Farm
SHI Jingjing1,2, YANG Lin1,3, ZENG Canying4, ZHU Axing1,4,5,6, QIN Chengzhi1,2, LIANG Peng1,2
1. State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, CAS, Beijing 100101, China 2. University of Chinese Academy of Sciences, Beijing 100049, China 3. School of Geographic and Oceanographic Sciences, Nanjing University, Nanjing 210023, China 4. School of Geography, Nanjing Normal University, Nanjing 210023, China 5. Key Laboratory of Virtual Geographic Environment, Nanjing Normal University, Ministry of Education; State Key Laboratory Cultivation Base of Geographical Environment Evolution, Jiangsu Province; Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing 210023, China 6. Department of Geography, University of Wisconsin-Madison, Madison, WI 53706, USA
Identifying the environmental variables relevant to different soil properties, and the scales at which they act, helps explain how soil properties form and develop, and is also necessary for mapping and sampling multiple soil properties. This study identified the relevant environmental variables and their scales for multiple soil properties, and examined how different environmental variables and scales affect the prediction of each property. The study area is Heshan Farm, and the target soil properties are topsoil clay content, sand content, silt content, topsoil organic matter content (SOM), and soil depth. One hundred and seventy-three multi-scale terrain variables were generated by varying the neighborhood size used in their calculation. The single-scale and multi-scale variables were ranked by their variable importance as calculated by Random Forest. Subsets 1 and 2 were selected from the single-scale and multi-scale variables, respectively, based on variable importance, with multicollinear variables eliminated. Subset 3 served as a reference subset and was selected based on expert knowledge. Subset 1 had little in common with subset 3, which indicates that variables selected based on expert knowledge may not be the most important ones for the soil properties. Subset 2 overlapped strongly with subset 3, although the scales differed among environmental variables and soil properties. The relevant variables and scales for sand and silt content were similar to each other but quite different from those for clay content, while SOM and soil depth had similar relevant variables. The mapping results based on the three subsets showed that subset 1 gave more accurate predictions than subset 3 for all soil properties except sand content, with mean RMSE improvements of 1.8% to 13.1%. Subset 2 gave more accurate predictions than subsets 1 and 3 for all five soil properties, with mean RMSE improvements of 8.7% to 16.5% and 7.8% to 21.3%, respectively. These results show that, for mapping, using the reference variables at proper scales matters more than using top-ranked single-scale variables.
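The selection step described in the abstract (rank variables by importance, then discard those that are collinear with variables already kept) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the importance scores are taken as given (for example from a Random Forest), the collinearity check is a pairwise Pearson correlation with a hypothetical threshold `max_abs_r`, and the function names and the definition of the percentage RMSE improvement are likewise assumptions.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # treat a constant variable as uncorrelated
    return cov / sqrt(var_a * var_b)


def select_variables(columns, importance, max_abs_r=0.8, max_vars=None):
    """Greedy importance-ranked selection with a pairwise correlation filter.

    columns:    dict of variable name -> values at the sample sites
    importance: dict of variable name -> importance score (assumed precomputed)
    max_abs_r:  hypothetical collinearity threshold, not taken from the paper
    max_vars:   optional cap on the number of variables kept
    """
    # Visit variables from most to least important.
    ranked = sorted(columns, key=lambda name: importance[name], reverse=True)
    selected = []
    for name in ranked:
        # Keep a variable only if it is not strongly correlated
        # with any variable that has already been kept.
        if all(abs(pearson(columns[name], columns[kept])) <= max_abs_r
               for kept in selected):
            selected.append(name)
        if max_vars is not None and len(selected) >= max_vars:
            break
    return selected


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def rmse_improvement_pct(rmse_reference, rmse_candidate):
    """Relative RMSE reduction of a candidate subset, in percent (assumed definition)."""
    return 100.0 * (rmse_reference - rmse_candidate) / rmse_reference
```

With this sketch, a variable computed at two neighborhood sizes that are almost perfectly correlated would contribute only its higher-ranked scale to the subset, while a weakly correlated variable with a lower rank would still be kept.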
SHI Jingjing, YANG Lin, ZENG Canying, et al. Selection of environmental variables and their scales in multiple soil properties mapping: A case study in Heilongjiang Heshan Farm[J]. Geographical Research, 2018, 37(3): 635-646.
McBratney AB, Odeh I OA, Bishop T FA, et al. An overview of pedometric techniques for use in soil survey. , 2000, 97: 293-327.http://linkinghub.elsevier.com/retrieve/pii/S0016706100000434
Quantitative techniques for spatial prediction in soil survey are developing apace. They generally derive from geostatistics and modern statistics. The recent developments in geostatistics are reviewed particularly with respect to non-linear methods and the use of all types of ancillary information. Additionally analysis based on non-stationarity of a variable and the use of ancillary information are demonstrated as encompassing modern regression techniques, including generalised linear models (GLM), generalised additive models (GAM), classification and regression trees (RT) and neural networks (NN). Three resolutions of interest are discussed. Case studies are used to illustrate different pedometric techniques, and a variety of ancillary data. The case studies focus on predicting different soil properties and classifying soil in an area into soil classes defined a priori. Different techniques produced different error of interpolation. Hybrid methods such as CLORPT with geostatistics offer powerful spatial prediction methods, especially up to the catchment and regional extent. It is shown that the use of each pedometric technique depends on the purpose of the survey and the accuracy required of the final product.
Miller BA, KoszinskiS, HieroldW, et al.Towards mapping soil carbon landscapes: Issues of sampling scale and transferability. , 2015, 156: 194-208.http://www.sciencedirect.com/science/article/pii/S016719871500152X
The conversion of point observations to a geographic field is a necessary step in soil mapping. For pursuing goals of mapping soil carbon at the landscape scale, the relationships between sampling scale, representation of spatial variation, and accuracy of estimated error need to be considered. This study examines the spatial patterns and accuracy of predictions made by different spatial modelling methods on sample sets taken at two different scales. These spatial models are then tested on independent validation sets taken at three different scales. Each spatial modelling method produced similar, but unique, maps of soil organic carbon content (SOC%). Kriging approaches excelled at internal spatial prediction with more densely spaced sample points. Because kriging depends on spatial autocorrelation, kriging performance was naturally poor in areas of spatial extrapolation. In contrast, the spatial regression approaches tested could continue to perform well in spatial extrapolation areas. However, the problem of induction allowed the potential for problems in some areas, which was less predictable. This problem also existed for the kriging approaches. Spatial phenomena occurring between sampling points could also be missed by kriging models. Use of covariates with kriging can help, but the requirement of capturing the full feature space in the map remains. Methods that utilize spatial association, such as spatial regression, can map soil properties for landscape scales at a high resolution, but are highly dependent on the inclusion of the full attribute space in the calibration of the model and the availability of transferable covariates.
[LiRunkui, PengMing, YasuyukiKono, et al.Soil depth mapping in lithoidal mountainous area using stratified strategy and fuzzy logic: A case study in Hushiha watershed, North China. , 2013, 32(5): 965-973.]
[ChenShi, GaoChao, XuBin, et al.Quantitative inversion of soil salinity and analysis of its spatial pattern in agricultural area in Shihezi of Xinjiang. , 2014, 33(11): 2135-2144.]
Odeha I OA, McBratney AB, Chittleborough DJ. Spatial prediction of soil properties from landform attributes derived from a digital elevation model. , 1994, 63(3-4): 197-214.http://linkinghub.elsevier.com/retrieve/pii/0016706194900639
ABSTRACT Digital elevation models (DEMs) provide a good way of deriving landform attributes that may be used for soil prediction. The geostatistical techniques of kriging and cokriging are increasingly being applied to predicting soil properties. Whereas ordinary kriging (and universal kriging) utilise spatial correlation to determine the coefficients of the linear predictor, cokriging involves both inter-variable correlation and spatial covariation among variables. Multi-linear regression modelling also offers an alternative to predicting a soil variable by means of covariation. The performance of predicting four soil variables by these methods and two regression-kriging models are compared. The precision and bias of prediction of the six methods were dependent on the soil variable predicted. The mean error of prediction indicates reasonably small bias of prediction for all the soil variables by almost all of the methods. With the exception of topsoil gravel, for which multi-linear regression performed best, the root mean square error showed the two regression-kriging procedures to be best. Further analysis based on the mean ranks of performance by the methods confirmed this. All the kriging methods involving covariables (landform attributes) have a more smoothing effect on the predicted values, thus minimising the influence of outliers on prediction performance. Both the methods of regression-kriging show promise for predicting sparsely located soil properties from dense observations of landform attributes derived from the DEM. Histograms of subsoil clay residuals show outliers in the data set. These outliers are more evident in multi-linear regression, ordinary kriging and universal kriging than regression-kriging. 
There was a clear advantage in using the regression-kriging methods on those variables which had a small correlation with the landform attributes: root mean square errors for all the soil variables are much smaller than those resulting from any of the multi-linear regression, ordinary kriging, universal kriging or cokriging methods.
Zhu AX, QiF, MooreA, et al.Prediction of soil properties using fuzzy membership values. , 2010b, 158: 199-206.http://linkinghub.elsevier.com/retrieve/pii/S0016706110001497
Detailed information on the spatial variation of soils is desirable for many agricultural and environmental applications. This research explores three approaches that use soil fuzzy membership values to predict detailed spatial variation of soil properties. The first two are weighted average models with which the soil property value at a location is the average of the typical soil property values of the soil types weighted by fuzzy membership values. We compared two options to determine the typical property values: one that uses the representative values from existing soil survey and the other that uses the property value of a field observation typical of a soil type. The third approach is a multiple linear regression in which the soil property value at a location is predicted using a regression between the soil property and fuzzy membership values. We compared this to multiple linear regression with environmental variables. In a case study in the Driftless Area of Wisconsin, the models were also compared with a predictive model based on existing soil survey. The results showed that regression with environmental variables works well for areas where the soil-搕errain relationship is relatively simple but regression with fuzzy membership values is an improvement for areas where soil-搕errain relationships are more complicated. From the perspectives of data requirement and model simplicity as well as accuracy of prediction the weighted average with maximum fuzzy membership option has obvious advantages.
[YangLin, ZhuAxing, QinChengzhi, et al.A purposive sampling design method based on typical points and its application in soil mapping. , 2010, 29(3): 279-286.]
PrioriS, FantappièM, BianconiN, et al.Field-scale mapping of soil carbon stock with limited sampling by coupling Gamma-Ray and Vis-NIR Spectroscopy. , 2016, 80: 954-964.https://dl.sciencesocieties.org/publications/sssaj/abstracts/80/4/954
Abstract High-precision mapping of important soil services, such as soil organic C stocks, is basic for monitoring the effects of different soil management regimes and the effectiveness of agricultural policies. Proximal soil sensing methods have been often used in the last decades to limit costs, field work, and time and to obtain reliable and accurate maps. We tested the combined use of two proximal sensors, visible-near-infrared (Vis-NIR) and passive γ-ray spectrometers, to obtain highly detailed maps of C stocks of the topsoil (CS30' 0-30 cm) of nine pairs of fields in western Sicily using a limited number of sampling sites per field for traditional laboratory analysis (about one sample per hectare). Laboratory Vis-NIR diffuse reflectance spectroscopy allowed the number of data points per field to be increased, at the same time reducing the costs for laboratory analysis. The predictive model had a coefficient of determination (R05) of 0.77 and an error (RMSE) of 0.67 kg m6305. Data points predicted by Vis-NIR on the fine earth (<2 mm) and corrected for gravel content (CS30pred) were interpolated within each field using geographically weighted multiple regression and two sets of covariates: (i) digital elevation model derivatives, such as elevation, slope, plan and profile curvature, and topographic wetness index; and (ii) elevation and γ-ray total counts maps. Validation of 36 independent data points showed that the second method provided greater accuracy than the first. In particular, residual prediction deviation (RPD) showed a mean value of 2.19; however, three pairs of fields showed high error and low RPD. This methodology provides a cost-effective tool to interpolate C stocks within arable fields, limiting laboratory analysis. The accuracy of the CS30pred maps allows monitoring of the effects of agricultural management and/or soil erosion on the soil C pool. 08 Soil Science Society of America, 5585 Guilford Rd., Madison WI 53711 USA. All Rights reserved.
Miller BA, KoszinskiS, WehrhanM, et al. Impact of multi-scale predictor selection for modeling soil properties. , 2015, 239-240: 97-106.http://linkinghub.elsevier.com/retrieve/pii/S0016706114003504
61Potentially useful predictors for digital soil mapping are often overlooked.61Different analysis scales should be treated as unique predictor variables.61The use of multi-scale predictor variables can greatly increase model performance.61Experimentation with subsets of predictor pools for data mining tools can be productive.
McKenzie NJ, Ryan PJ. Spatial prediction of soil properties using environmental correlation. , 1999, 89: 67-94.http://linkinghub.elsevier.com/retrieve/pii/S0016706198001372
Conventional survey methods have efficiencies in medium to low intensity survey because they use relationships between soil properties and more readily observable environmental features as a basis for mapping. However, the implicit predictive models are qualitative, complex and rarely communicated in a clear manner. The possibility of developing an explicit analogue of conventional survey practice suited to medium to low intensity surveys is considered. A key feature is the use of quantitative environmental variables from digital terrain analysis and airborne gamma radiometric remote sensing to predict the spatial distribution of soil properties. The use of these technologies for quantitative soil survey is illustrated using an example from the Bago and Maragle State Forests in southeastern Australia. A design-based, stratified, two-stage sampling scheme was adopted for the 50,000 ha area using digital geology, landform and climate as stratifying variables. The landform and climate variables were generated using a high resolution digital elevation model with a grid size of 25 m. Site and soil data were obtained from 165 sites. Regression trees and generalised linear models were then used to generate spatial predictions of soil properties using digital terrain and gamma radiometric survey data as explanatory variables. The resulting environmental correlation models generate spatial predictions with a fine grain unmatched by comparable conventional survey methods. Example models and spatial predictions are presented for soil profile depth, total phosphorus and total carbon. The models account for 42%, 78% and 54% of the variance present in the sample respectively. The role of spatial dependence, issues of scale and landscape complexity are discussed along with the capture of expert knowledge. It is suggested that environmental correlation models may form a useful trend model for various forms of kriging if spatial dependence is evident in the residuals of the model.
GeY, Wang JH, Heuvelink G B M, et al. Sampling design optimization of a wireless sensor network for monitoring ecohydrological processes in the Babao River Basin, China. , 2015, 29(1): 92-110.http://www.tandfonline.com/doi/abs/10.1080/13658816.2014.948446
Optimal selection of observation locations is an essential task in designing an effective ecohydrological process monitoring network, which provides information on ecohydrological variables by capturing their spatial variation and distribution. This article presents a geostatistical method for multivariate sampling design optimization, using a universal cokriging (UCK) model. The approach is illustrated by the design of a wireless sensor network (WSN) for monitoring three ecohydrological variables (land surface temperature, precipitation and soil moisture) in the Babao River basin of China. After removal of spatial trends in the target variables by multiple linear regression, variograms and cross-variograms of regression residuals are fit with the linear model of coregionalization. Using weighted mean UCK variance as the objective function, the optimal sampling design is obtained using a spatially simulated annealing algorithm. The results demonstrate that the UCK model-based sampling method can consider the relationship of target variables and environmental covariates, and spatial auto- and cross-correlation of regression residuals, to obtain the optimal design in geographic space and attribute space simultaneously. Compared with a sampling design without consideration of the multivariate (cross-)correlation and spatial trend, the proposed sampling method reduces prediction error variance. The optimized WSN design is efficient in capturing spatial variation of the target variables and for monitoring ecohydrological processes in the Babao River basin.
BehrensT, SchmidtK, Ramirez-LopezL, et al.Hyper-scale digital soil mapping and soil formation analysis. , 2014, 213: 578-588.http://linkinghub.elsevier.com/retrieve/pii/S0016706113002759
Landscape characteristics show local, regional and supra-regional components. As a result pedogenesis and the spatial distribution of soil properties are both influenced by features emerging at multiple scales. To account for this effect in a predictive model, descriptors of the geomorphic signature are required at multiple scales. In this study, we present a new hyper-scale terrain analysis approach, referred to as Contextual Statistical Mapping (ConStat), which is based on statistical neighborhood measures derived for growing sparse circular neighborhoods. The statistical measures tested comprise basic descriptors such as the minimum, maximum, mean, standard deviation, and skewness, as well as statistical terrain attributes and directional components. We propose a data mining framework to determine the relevant statistical measures at the relevant scales to analyze and interpret the influence of these statistical measures and to map the geomorphic structures influencing soil formation and the regions where a statistical measure shows influence. We introduce ConStat on two landscape-scale DSM examples with different soil genesis regimes where the ConStat terrain features serve as proxies for multi-scale variations of climate and parent material conditions. The results show that ConStat provides high predictive power. The cross-validated R2 values range from 0.63 for predicting topsoil clay content in the Piracicaba area (Brazil) to 0.68 for topsoil silt content in the Rhine-Hesse area (Germany). The results obtained from data mining analysis allow for interpretations beyond conventional concepts and approaches to explain soil formation. As such it overcomes the trade-off between accuracy and interpretability of soil property predictions.
Smith MP, Zhu AX, Burt JE, et al.The effects of DEM resolution and neighborhood size on digital soil survey. , 2006, 137: 58-69.http://linkinghub.elsevier.com/retrieve/pii/S001670610600231X
Terrain characteristics, such as slope gradient, slope aspect, profile curvature, contour curvature computed from digital elevation model (DEM), are among the key inputs to digital soil surveys based on geographic information systems (GIS). These terrain attributes are computed over a neighborhood (spatial extent). The objective of this research was to investigate the combined effect of DEM resolution and neighborhood size on digital soil surveys using the Soil– Landscape Inference Model (SoLIM) approach. The effect of neighborhood size and DEM resolution on digital soil survey was examined through computing the required terrain attributes using different neighborhood sizes (from 3 to 5402m) for 3, 6, 9, 12, 18, and 2702m resolution DEM. These attributes were then compiled and used to digitally map soils using the SoLIM approach. Field work completed on a hillslope in Dane County, WI in the summer of 2003 was used to validate each of the SoLIM derived soil surveys for accuracy. The results of the soil survey validations suggest that there is a range of neighborhood sizes that produces the most accurate results for a given resolution DEM. This range of neighborhood sizes, however, varies from landscape to landscape. When the soils on a gently rolling landscape were mapped, the neighborhood sizes that produced the most accurate results ranged from about 33–4802m. When soils on short, steep backslope positions were mapped, the neighborhood size values that produced the most accurate results range from about 24–3602m. This paper also shows that it is not always the highest resolution DEM that produces the highest accuracy. Knowing which DEM resolution and neighborhood size combinations produce the most accurate digital soil surveys for a particular landscape will be extremely useful to users of GIS-based soil-mapping applications.
BehrensT, Zhu AX, SchmidtK, et al.Multi-scale digital terrain analysis and feature selection for digital soil mapping. , 2010, 155: 175-185.http://linkinghub.elsevier.com/retrieve/pii/S0016706109002298
Terrain attributes are the most widely used predictors in digital soil mapping. Nevertheless, discussion of techniques for addressing scale issues and feature selection has been limited. Therefore, we provide a framework for incorporating multi-scale concepts into digital soil mapping and for evaluating these scale effects. Furthermore, soil formation and soil-forming factors vary and respond at different scales. The spatial data mining approach presented here helps to identify both the scale which is important for mapping soil classes and the predictive power of different terrain attributes at different scales. The multi-scale digital terrain analysis approach is based on multiple local average filters with filter sizes ranging from 3 脳 3 up to 31 脳 31 pixels. We used a 20-m DEM and a 1:50 000 soil map for this study. The feature space is extended to include the terrain conditions measured at different scales, which results in highly correlated features (terrain attributes). Techniques to condense the feature space are therefore used in order to extract the relevant soil forming features and scales. The prediction results, which are based on a robust classification tree (CRUISE) show that the spatial pattern of particular soil classes varies at characteristic scales in response to particular terrain attributes. It is shown that some soil classes are more prevalent at one scale than at other scales and more related to some terrain attributes than to others. Furthermore, the most computationally efficient ANOVA-based feature selection approach is competitive in terms of prediction accuracy and the interpretation of the condensed datasets. Finally, we conclude that multi-scale as well as feature selection approaches deserve more research so that digital soil mapping techniques are applied in a proper spatial context and better prediction accuracy can be achieved.
Roecker SM, Thompson JA.Scale effects on terrain attribute calculation and their use as environmental covariates for digital soil mapping. In: Boettinger J L, Howell D W, Moore A C, et al. , 2010: 55-66.http://link.springer.com/10.1007/978-90-481-8863-5_5
The digital representation of the Earth-檚 surface by terrain attributes is largely dependent on the scale at which they are computed. Typically the effects of scale on terrain attributes have only been investigated as a function of digital elevation model (DEM) grid size, rather than the neighborhood size over which they are computed. With high-resolution DEM now becoming more readily available, a multi-scale terrain analysis approach may be a more viable option to filter out the large amount short-range variation present within them, as opposed to coarsening the resolution of a DEM, and thereby more accurately represent soil-landscape processes. To evaluate this hypothesis, two examples are provided. The first study was designed to evaluate the systematic effects of varying both grid and neighborhood size on terrain attributes computed from LiDAR. In a second study, the objective was to examine how the correlations between soil and terrain attributes vary with neighborhood size, so as to provide an empirical measure of what neighborhood size may be most appropriate. Results suggest that the overall representation of the land surface by terrain attributes is specific to the land surface, but also that the terrain attributes vary independently in response to spatial extent over which they are computed. Results also indicate that finer grid sizes are more sensitive to the scale of terrain attribute calculation than larger grid sizes. For the soil properties examined in this study, slope curvatures produced the highest coefficients of correlation when calculated at neighborhood sizes between 117 and 189 m.
Maynard JJ, Johnson M G. Scale-dependency of LiDAR derived terrain attributes in quantitative soil-landscape modeling: Effects of grid resolution vs. neighborhood extent. , 2014, 230-231: 29-40.http://www.sciencedirect.com/science/article/pii/S0016706114001335
61Neighborhood extent is the main factor controlling soil-topography correlations.61Grid resolution affects the accuracy of terrain attributes at sampling locations.61Fine scale (1–5m) DEMs did not provide stronger predictors of soil properties.61LiDAR's high cost and computational requirements limit utility for soil modeling.
[HuXuemei, QinChengzhi.Analysis on the approach to determine an appropriate window size for grid-based digital terrain. , 2017, 42(10): 1365-1372.]
Zhu AX, YangL, Li BL, et al.Construction of membership functions for predictive soil mapping under fuzzy logic. , 2010, 155(3/4): 166-174.http://www.cabdirect.org/abstracts/20103238810.html;jsessionid=7DA9DE4B83BBCFFA150C44D1B93936F2;jsessionid=0C5D7305F90BDDF6658B50D687DB7601
Fuzzy membership function is an effective tool to represent relationship between soil and environment for predictive soil mapping. Usually construction of a fuzzy membership function requires knowledge on soil-landscape relationships obtained from local soil experts or from extensive field samples. For areas with no soil survey experts and no extensive soil field observations, a purposive sampling approach could provide the descriptive knowledge on the relationships. However, quantifying this descriptive knowledge in the form of fuzzy membership functions for predictive soil mapping is a challenge. This paper presents a method to construct fuzzy membership functions using descriptive knowledge. Construction of fuzzy membership functions is accomplished based on two types of knowledge: 1) knowledge on typical environmental conditions of each soil type and 2) knowledge on how each soil type corresponds to changes in environmental conditions. These two types of knowledge can be extracted from catenary sequences of soil types and the associated environment information collected at a few field samples through purposive sampling. The proposed method was tested in a watershed located in Heshan farm of Nenjiang County in Heilongjiang Province of China. A set of membership functions were constructed to represent the descriptive knowledge on soil-landscape relationships, which were derived from 22 field samples collected through a purposive sampling approach. A soil subgroup map and an A-horizon soil organic matter content map for the area were generated using these membership functions. Forty five field validation points were collected independently to evaluate the two soil maps. The soil subgroup map achieved 76% of accuracy. The A-horizon soil organic matter content map based on the derived fuzzy membership functions was compared with that derived from a multiple linear regression model. 
The comparison showed that the soil organic content map based on fuzzy membership functions performed better than the soil map based on the linear regression model. The proposed method could also be used to construction membership functions from descriptive knowledge obtained from other sources.
YangL, Zhu AX, QiF, et al.An integrative hierarchical stepwise sampling strategy for spatial sampling and its application in digital soil mapping. , 2013, 27(1): 1-23.http://www.tandfonline.com/doi/abs/10.1080/13658816.2012.658053
Sampling design plays an important role in spatial modeling. Existing methods often require a large amount of samples to achieve desired mapping accuracy, but imply considerable cost. When there are not enough resources for collecting a large set of samples at once, stepwise sampling approach is often the only option for collecting the needed large sample set, especially in the case of field surveying over large areas. This article proposes an integrative hierarchical stepwise sampling strategy which makes the samples collected at different stages an integrative one. The strategy is based on samples' representativeness of the geographic feature at different scales. The basic idea is to sample at locations that are representative of large-scale spatial patterns first and then add samples that represent more local patterns in a stepwise fashion. Based on the relationships between a geographic feature and its environmental covariates, the proposed sampling method approximates a hierarchy of spatial variations of the geographic feature under concern by delineating natural aggregates (clusters) of its relevant environmental covariates at different scales. The natural occurrence of such aggregates is modeled using a fuzzy c-means clustering method. We iterate through different numbers of clusters from only a few to many more to be able to reveal clusters at different spatial scales. At a particular iteration, locations that bear high similarity to the cluster prototypes are identified. If a location is consistently identified at multiple iterations, it is then considered to be more representative of the general or large-scale spatial patterns. Locations that are identified less during the iterations are representative of local patterns. The integrative stepwise sampling design then gives higher sampling priority to the locations that are more representative of the large-scale patterns than local ones. We applied this sampling design in a digital soil mapping case study. 
Different representative samples were obtained and used for soil inference. We started with samples that are the most representative of the large-scale patterns and then gradually included the samples representative of local patterns. Field evaluation indicated that the additions of more samples with lower representativeness lead to improvements of accuracy with a decreasing marginal gain. When cost-effectiveness is considered, the representative grade could provide essential information on the number and order of samples to be sampled for an effective sampling design.
[Qin Chengzhi, Lu Yanjun, Bao Lili, et al. Simple digital terrain analysis software (SimDTA 1.0) and its application in fuzzy classification of slope positions: A case of a small catchment in Nenjiang watershed, northeast China. , 2009, 11(6): 737-743.]
Zhu AX, Band LE, Vertessy R, et al. Derivation of soil properties using a soil land inference model (SoLIM). , 1997, 61: 523-533. https://www.soils.org/publications/sssaj/abstracts/61/2/SS0610020523
SoLIM (Soil Land Inference Model) is a fuzzy inference scheme for estimating and representing the spatial distribution of soil types in a landscape. This study developed the inference method a step further to derive continuous soil property maps through two case studies. The first case illustrates the derivation of soil A horizon depth in a mountainous area in western Montana. It was found that the inferred depths are a closer fit to observed depths than those derived from the conventional soil map at both spatial and attribute levels. The second case shows the derivation of soil transmissivity values across a small catchment with a gentle environmental variation in Tumut, NSW, Australia. This case shows that the derived soil transmissivity map is comparable to the results from systematic field survey over a small area. SoLIM works well in an area where there is a good understanding of the relationships between soils and their formative environment and where the soil formative environment can be characterised using current geographical information system techniques. However, we experienced difficulty with the methodology when it was applied in an area where the environmental gradient is gentle and the soil formative environment cannot be very well described using the primitive environmental indices currently employed in SoLIM.
Park SJ, van de Giesen N. Soil-landscape delineation to define spatial sampling domains for hillslope hydrology. , 2004, 295: 28-46. http://linkinghub.elsevier.com/retrieve/pii/S0022169404001155
Soil hydrological properties are highly variable in space. Field measurements of these properties are costly and error prone. As spatially distributed approaches become increasingly important in current hydrological and ecological modeling, an appropriate field sampling scheme to effectively capture spatial variability of hydrological processes becomes essential. A terrain-based slope classification system was applied to delineate the hillslope into representative hydrological domains. This model assumes that there are hydrological landscape units (LUs) along the hillslope in which distinct sets of hydrological and pedological processes occur. Possible water and material flows over the hillslope were first interpreted using a continuity equation of mass flow over the surface, and subsequently included in a terrain analysis. The developed terrain index is able to characterize the hydrological processes, accommodating both continuous and discrete concepts. The model was tested against the intensive soil moisture data at the Tarrawarra catchment, Australia [Water Resour. Res. 34 (1998) 2765]. The delineated soil-LUs explain up to 73% of the average soil moisture variation when combined with other terrain parameters (surface curvature, upslope contributing area and slope aspect). Soil moisture at each LU shows significantly different variance characteristics when compared with other units, and the delineation procedure reduces the spatial variation of soil moisture within each LU. Random permutation and bootstrapping techniques indicate that stratified random sampling based on the delineated hillslope units significantly reduces the number of samples needed to estimate the average soil moisture and the overall error of estimation.
Qin CZ, Zhu AX, Shi X, et al. Quantification of spatial gradation of slope positions. , 2009, 110: 152-161. http://linkinghub.elsevier.com/retrieve/pii/S0169555X0900155X
Transition between slope positions (e.g., ridge, shoulder slope, back slope, foot slope, and valley) is often gradual. Quantification of spatial transitions or spatial gradations between slope positions can increase the accuracy of terrain parameterization for geographical or ecological modeling, especially for digital soil mapping at a fine scale. Current models for characterizing the spatial gradation of slope positions based on a gridded DEM either focus solely on the parameter space or depend on too many rules defined by topographic attributes, which makes such approaches impractical. The typical locations of a slope position contain the characteristics of the slope position in both parameter space and spatial context. Thus, the spatial gradation of slope positions can be quantified by comparing terrain characteristics (spatial and parametrical) of given locations to those at typical locations. Based on this idea, this paper proposes an approach to quantifying the spatial gradation of slope positions by using typical locations as prototypes. This approach includes two parts: the first is to extract the typical locations of each slope position and treat them as the prototypes of this position; and the second is to compute the similarity between a given location and the prototypes based on both local topographic attributes and spatial context. The new approach characterizes slope position gradation in both the attribute domain (i.e., parameter space) and the spatial domain (i.e., geographic space) in an easy and practicable way. Applications show that the new approach can quantitatively describe spatial gradations among a set of slope positions. Comparison of spatial gradation of A-horizon sand percentages with the quantified spatial gradation of slope positions indicates that the latter reflects slope processes, confirming the effectiveness of the approach. 
The comparison of a soil subgroup map of the study area with the maximum similarity map derived from the approach also suggests that the quantified spatial gradation of slope position can be used to aid geographical modeling such as digital soil mapping.
Gharari S, Hrachowitz M, Fenicia F, et al. Hydrological landscape classification: investigating the performance of HAND based landscape classifications in a central European meso-scale catchment. , 2011, 15(11): 3275-3291. http://www.hydrol-earth-syst-sci.net/15/3275/2011/
This paper presents a detailed performance and sensitivity analysis of a recently developed hydrological landscape classification method based on dominant runoff mechanisms. Three landscape classes are distinguished: wetland, hillslope and plateau, corresponding to three dominant hydrological regimes: saturation excess overland flow, storage excess sub-surface flow, and deep percolation. Topography, geology and land use hold the key to identifying these landscapes. The height above the nearest drainage (HAND) and the surface slope, which can be easily obtained from a digital elevation model, appear to be the dominant topographical controls for hydrological classification. In this paper several indicators for classification are tested as well as their sensitivity to scale and resolution of observed points (sample size). The best results are obtained by the simple use of HAND and slope. The results obtained compared well with the topographical wetness index. The HAND based landscape classification appears to be an efficient method to "read the landscape" on the basis of which conceptual models can be developed.
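A classification driven only by HAND and slope, as this abstract describes, might look like the sketch below. The abstract does not state the decision rule or the thresholds, so both are assumptions here: low HAND is taken as wetland, high HAND with steep slope as hillslope, and high HAND with gentle slope as plateau. The function name, parameter names, and default threshold values are hypothetical.

```python
def classify_landscape(hand_m, slope_pct,
                       hand_threshold_m=5.0, slope_threshold_pct=10.0):
    """Assign one of three landscape classes from HAND and surface slope.

    hand_m: height above the nearest drainage, in metres.
    slope_pct: surface slope, in percent.
    The two thresholds are placeholder values, not calibrated ones.
    """
    if hand_m < hand_threshold_m:
        return "wetland"      # close to drainage level
    if slope_pct >= slope_threshold_pct:
        return "hillslope"    # well above drainage and steep
    return "plateau"          # well above drainage and gentle

# Example: classify a few hypothetical grid cells given as (HAND, slope) pairs.
cells = [(1.0, 3.0), (30.0, 25.0), (40.0, 2.0)]
classes = [classify_landscape(h, s) for h, s in cells]
```

In practice the thresholds would have to be calibrated for each catchment.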
Skidmore AK. Terrain positions mapped from a gridded digital elevation model. , 1990, 4(1): 33-49. https://www.tandfonline.com/doi/full/10.1080/02693799008941527
Terrain position (e.g., ridge, mid-slope, valley) is a potentially useful variable with which to model environmental parameters and processes using geographical information systems. Digital elevation data spaced on a regular 30 m grid were generated over an area of flat to moderate topography in south-east Australia. Streams and ridges were mapped from the digital elevation model using a new algorithm that utilizes basic geographical principles. Ridge and stream lines closely followed the original contour map and improved upon the results from three alternative algorithms. Mid-slope positions were successfully interpolated from the stream and ridge lines by a modified measure of Euclidean distance.
Grimm R, Behrens T, Märker M, et al. Soil organic carbon concentrations and stocks on Barro Colorado Island: Digital soil mapping using random forests analysis. , 2008, 146: 102-113. http://linkinghub.elsevier.com/retrieve/pii/S0016706108001262
Spatial estimates of tropical soil organic carbon (SOC) concentrations and stocks are crucial to understanding the role of tropical SOC in the global carbon cycle. They also allow for spatial variation of SOC in environmental process models. SOC is spatially highly variable. In traditional approaches, SOC concentrations and stocks have been derived from estimates for single or very few profiles and spatially linked to existing units of soil or vegetation maps. However, many existing soil profile data are incomplete and untested as to whether they are representative or unbiased. Also single means for soil or vegetation map units cannot characterize SOC spatial variability within these units. We here use the digital soil mapping approach to predict the spatial distribution of SOC. This relies on a soil inference model based on spatially referenced environmental layers of topographic attributes, soil units, parent material, and forest history. We sampled soils at 165 sites, stratified according to topography and lithology, on Barro Colorado Island (BCI), Panama, at depths of 0–10 cm, 10–20 cm, 20–30 cm, and 30–50 cm, and analyzed them for SOC by dry combustion. We applied Random Forest (RF) analysis as a modeling tool to the SOC data for each depth interval in order to compare vertical and lateral distribution patterns. RF has several advantages compared to other modeling approaches, for instance, the fact that it is neither sensitive to overfitting nor to noise features. The RF-based digital SOC mapping approach provided SOC estimates of high spatial resolution and estimates of error and predictor importance. The environmental variables that explained most of the variation in the topsoil (0–10 cm) were topographic attributes. In the subsoil (10–50 cm), SOC distribution was best explained by soil texture classes as derived from soil mapping units.
The estimates for SOC stocks in the upper 30 cm ranged between 38 and 116 Mg ha⁻¹, with lowest stocks on midslope and highest on toeslope positions. This digital soil mapping approach can be applied to similar landscapes to refine the spatial resolution of SOC estimates.
Genuer R, Poggi JM, Tuleau-Malot C. Variable selection using random forests. , 2010, 31: 2225-2236. http://linkinghub.elsevier.com/retrieve/pii/S0167865510000954
This paper proposes, focusing on random forests, the increasingly used statistical method for classification and regression problems introduced by Leo Breiman in 2001, to investigate two classical issues of variable selection. The first one is to find important variables for interpretation and the second one is more restrictive and tries to design a good parsimonious prediction model. The main contribution is twofold: to provide some experimental insights about the behavior of the variable importance index based on random forests and to propose a strategy involving a ranking of explanatory variables using the random forests score of importance and a stepwise ascending variable introduction strategy.
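The two ingredients named in this abstract, a ranking by importance score followed by stepwise ascending introduction of variables, can be sketched as follows. This is a simplified illustration, not the published procedure. The importance scores are assumed to be given (for example, from a random forest), `error_of` is a hypothetical callback returning a prediction error for a list of variables, and `min_gain` is a hypothetical threshold on the required error decrease.

```python
def rank_by_importance(importance):
    """Return variable names sorted from most to least important."""
    return sorted(importance, key=importance.get, reverse=True)

def stepwise_ascending_selection(ranked, error_of, min_gain):
    """Introduce variables in ranked order, keeping those that reduce the error.

    ranked: variable names, most important first.
    error_of: hypothetical callback mapping a list of names to an error value.
    min_gain: minimum error decrease required to keep a candidate variable.
    """
    selected = []
    best_error = float("inf")
    for var in ranked:
        err = error_of(selected + [var])
        if best_error - err > min_gain:  # the first variable always passes
            selected.append(var)
            best_error = err
    return selected

# Toy usage with made-up scores; slope_copy carries no information beyond slope.
importance = {"slope": 0.9, "slope_copy": 0.8, "twi": 0.4, "noise": 0.01}
gains = {"slope": 0.5, "twi": 0.3}

def toy_error(variables):
    # Stand-in for an out-of-bag error estimate; purely illustrative.
    return 1.0 - sum(gains.get(v, 0.0) for v in variables)

chosen = stepwise_ascending_selection(
    rank_by_importance(importance), toy_error, min_gain=0.05)
```

A variable that ranks highly but is redundant with one already selected is skipped, which keeps the final model parsimonious.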