Journal article

Improved vertical distribution prediction of soil VOCs contamination in site-scale utilizing ensemble machine learning approach integrated with molecular descriptors

Abstract

Identifying the environmental behaviors and distribution patterns of soil pollutants at the site scale facilitates the environmental monitoring and management of contaminated sites. Although machine learning models fit well and generalize robustly, data-driven approaches show relatively low prediction accuracy at industrial sites because soil sample sizes are limited. The present study enlarged the input data for predicting the distribution of volatile organic compounds (VOCs) in a pesticide factory by merging the datasets of individual VOCs and integrating molecular descriptors. Four widely used machine learning models were trained and exhibited higher prediction accuracy on the merged VOCs dataset than on the individual VOC datasets. Furthermore, a stacking ensemble model was constructed to enhance the prediction accuracy, achieving an R² value of 0.809. The Shapley interaction quantification (SHAP-IQ) analysis clarified the interaction effects between soil physicochemical properties, land use functions, and molecular descriptors and revealed the vertical distribution and migration patterns of VOCs. The hazardous material warehouse was the main contamination source in the pesticide factory due to the historical manufacture of intermediate products and the stacking of wastes. Non-aqueous phase liquids (NAPLs) and contaminant plumes resulted in heterogeneous vertical distribution patterns of VOCs in areas at different distances from the contamination source. Overall, the present research demonstrated the effectiveness of merging small datasets of individual pollutants at the site scale for data-driven models and provided new insights for the scientific management of contaminated sites.
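
The dataset-merging idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `merge_voc_datasets`, the column names, the choice of benzene and toluene, and the depth and concentration values are all assumptions made for illustration and are not taken from the study. The sketch only shows how small per-compound tables might be stacked into one table, with compound-level molecular descriptors attached to every sample row so that a single model can learn across compounds.

```python
from typing import Dict, List

Row = Dict[str, float]


def merge_voc_datasets(
    per_compound: Dict[str, List[Row]],
    descriptors: Dict[str, Row],
) -> List[dict]:
    """Stack per-compound sample rows into one table.

    Each output row keeps the sample's own features and gains the
    molecular descriptors of its compound (hypothetical layout).
    """
    merged = []
    for compound, rows in per_compound.items():
        compound_descriptors = descriptors[compound]
        for row in rows:
            merged.append({"compound": compound, **row, **compound_descriptors})
    return merged


# Placeholder samples: depths and concentrations are invented for illustration.
per_compound = {
    "benzene": [
        {"depth_m": 0.5, "conc_mg_kg": 1.2},
        {"depth_m": 2.0, "conc_mg_kg": 0.4},
    ],
    "toluene": [
        {"depth_m": 0.5, "conc_mg_kg": 0.9},
    ],
}

# Two example descriptors per compound (molecular weight in g/mol, log Kow).
descriptors = {
    "benzene": {"mol_weight": 78.11, "log_kow": 2.13},
    "toluene": {"mol_weight": 92.14, "log_kow": 2.73},
}

merged = merge_voc_datasets(per_compound, descriptors)
for row in merged:
    print(row)
```

In this layout the merged table has more rows than any single-compound table, and the descriptor columns are what distinguish one compound's rows from another's when a single model is trained on all of them.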

Authors

Cai Y-X; Chen H-Y; Qu Y-J; Zhao W-H; Wang M-Y; Chen Y; Ma J

Journal

Journal of Hazardous Materials, Vol. 496

Publisher

Elsevier

Publication Date

September 15, 2025

DOI

10.1016/j.jhazmat.2025.139452

ISSN

0304-3894
