A gradient boost approach for predicting near-road ultrafine particle concentrations using detailed traffic characterization
This study investigates the influence of meteorology, land use, built environment, and traffic characteristics on near-road ultrafine particle (UFP) concentrations. To achieve this objective, minute-level UFP concentrations were measured at various locations along a major arterial road in the Greater Toronto Area (GTA) between February and May 2019. Each location was visited five times, at least once each in the morning, at mid-day, and in the afternoon. Each visit lasted 30 min, resulting in 2.5 h of minute-level data collected at each location. Local traffic information, including vehicle class and turning movements, was processed using computer vision techniques. The numbers of fast-food restaurants, cafes, trees, and traffic signals, as well as building footprint, were positively associated with mean UFP concentrations, while distance to the closest major road was negatively associated with them. We employed the Extreme Gradient Boosting (XGBoost) method to develop prediction models for UFP concentrations. Shapley additive explanation (SHAP) values were used to capture the influence of each feature on model output. The model results demonstrated that minute-level counts of local traffic from different directions had significant impacts on near-road UFP concentrations. Model performance was robust under random cross-validation, with coefficients of determination (R²) ranging from 0.63 to 0.69, but it weakened when data from specific locations were excluded from the training dataset. This result indicates that proper cross-validation techniques should be developed to better evaluate machine learning models for air quality predictions.
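The contrast between random cross-validation and holding out whole locations can be illustrated with a minimal, model-agnostic sketch. The abstract does not specify the exact splitting scheme used in the study, so the function names, the toy site labels, and the fold count below are assumptions chosen for illustration only.

```python
import random
from collections import defaultdict


def random_kfold(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs after shuffling samples across k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        yield train, test


def leave_location_out(locations):
    """Yield (held_out_location, train_idx, test_idx), one location held out per split."""
    by_location = defaultdict(list)
    for idx, loc in enumerate(locations):
        by_location[loc].append(idx)
    for loc in sorted(by_location):
        test = by_location[loc]
        train = [i for i in range(len(locations)) if locations[i] != loc]
        yield loc, train, test


# Hypothetical toy data: three sites with four minute-level records each.
locations = ["site_A"] * 4 + ["site_B"] * 4 + ["site_C"] * 4

# Random folds usually mix records from the same site into train and test.
for train, test in random_kfold(len(locations), k=3):
    shared = {locations[i] for i in train} & {locations[i] for i in test}
    print("random fold, sites seen in both train and test:", sorted(shared))

# Location-based folds never let the model see the held-out site during training.
for loc, train, test in leave_location_out(locations):
    print("held out:", loc, "| train size:", len(train), "| test size:", len(test))
```

Under random splitting, records from one site typically appear on both sides of the split, so a model can score well by learning site-specific patterns. Holding out an entire location tests whether predictions transfer to unmeasured sites, which is the weakness the abstract reports.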