• Research Article •

Comparison of the accuracy of spatial prediction for heavy metals in regional soils based on machine learning models

JIN Zhao, LV Jianshu

1. College of Geography and Environment, Shandong Normal University, Jinan 250358, China
• Received: 2021-06-21 Accepted: 2021-09-13 Published: 2022-06-10 Online: 2022-08-10
• Corresponding author: LV Jianshu (1986-), male, born in Laiwu, Shandong; PhD, associate professor; mainly engaged in research on the environmental geochemistry of heavy metals and geostatistics. E-mail: lvjianshu@126.com
• About the author: JIN Zhao (1999-), female, born in Jinan, Shandong; master's student; main research interest is spatial modeling of soil pollutants. E-mail: geostatistical@163.com
• Funding:
Shandong Provincial Natural Science Foundation, Outstanding Youth Fund (ZR2020YQ31); National Natural Science Foundation of China (41601549)

Abstract:

To identify the spatial variation of heavy metals in regional soils and clarify the relevant influencing factors, this work built nine machine learning models: multiple linear regression (MLR), elastic net regression (ENR), random forest (RF), stochastic gradient boosting (SGB), a stacking-based ensemble model, back-propagation artificial neural network (BP-ANN), a neural network ensemble based on model averaging (avNNet), support vector machine with a linear kernel (SVM-L), and support vector machine with a radial basis function kernel (SVM-R). The models were applied to a dataset of soil Cd, Cu, Hg, Pb, and Zn concentrations and environmental auxiliary variables from the central part of Shandong Province, and the spatial prediction accuracy of the nine models was compared. RF outperformed the other models, with R² values between 0.263 and 0.448, MAE and RMSE below 8.408 and 10.636, respectively, and P/O values close to 1. RF can therefore be regarded as the optimal model for spatial prediction of soil heavy metals. SVM-R also showed good predictive accuracy and can serve as an alternative model. The accuracy of the other seven models was clearly inferior to that of RF and SVM-R. According to the RF predictions, the soil heavy metals in the study area showed similar spatial patterns, with concentrations decreasing from northeast to southwest. Regions with high heavy metal contents were located in the northeastern, northern, and southern parts of the study area, consistent with the locations of industrial sites and road networks, indicating that human activities are a significant factor influencing the spatial distribution of heavy metals in soils. This work can provide an important reference for regional soil pollution management.
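As an illustration of the accuracy measures named in the abstract, the following minimal sketch computes R², MAE, RMSE, and P/O for paired observed and predicted concentrations. It rests on assumptions not stated in the abstract: R² is taken as the coefficient of determination, P/O is taken as the ratio of mean predicted to mean observed values, and the function name `accuracy_metrics` and the sample numbers are hypothetical. The paper's own definitions may differ.

```python
import math


def accuracy_metrics(observed, predicted):
    """Return R2, MAE, RMSE and P/O for paired observed/predicted values.

    Assumptions (not taken from the paper): R2 is the coefficient of
    determination, and P/O is mean(predicted) / mean(observed).
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n

    # Prediction errors, one per sample.
    errors = [p - o for o, p in zip(observed, predicted)]

    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # Total sum of squares around the observed mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0 or mean_obs == 0:
        raise ValueError("observed values must vary and have a non-zero mean")

    r2 = 1.0 - ss_res / ss_tot
    p_over_o = mean_pred / mean_obs
    return {"R2": r2, "MAE": mae, "RMSE": rmse, "P/O": p_over_o}


# Made-up illustrative values, not data from the study.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
metrics = accuracy_metrics(observed, predicted)
```

Under these assumed definitions, a model that tracks the observations well yields R² near 1, small MAE and RMSE, and P/O near 1. A P/O above or below 1 would indicate overall over- or under-prediction.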