Development of Explainable Machine Learning Models to Identify Patients at Risk for 1-Year Mortality and New Distant Metastases Postendoprosthetic Reconstruction for Lower Extremity Bone Tumors: A Secondary Analysis of the PARITY Trial.
Journal Articles

abstract

  • BACKGROUND: Accurate prediction of postoperative metastasis and mortality risks in patients undergoing lower-limb oncological resection and endoprosthetic reconstruction is essential for guiding adjuvant therapies and managing patient expectations. Current prediction methods are limited by variability in patient-specific factors. This study aims to develop and internally validate explainable machine learning (ML) models to predict the 1-year risk of new distant metastases and mortality in these patients.
  • METHODS: We performed a secondary analysis of data from the Prophylactic Antibiotic Regimens in Tumor Surgery (PARITY) trial, which included 604 patients. Candidate features were selected based on availability and clinical relevance and then narrowed using Least Absolute Shrinkage and Selection Operator (LASSO) regression and Boruta algorithms. Six ML classification algorithms were tuned and calibrated: logistic regression, support vector machines, random forest, Light Gradient Boosting Machine (LightGBM), eXtreme Gradient Boosting (XGBoost), and neural networks. Because percent tumor necrosis had a high rate of missing data (>30%), models were developed both with and without it. Hyperparameters were tuned using Bayesian optimization. Internal validation was conducted using a 30% hold-out set. Model explainability was assessed using permutation-based feature importance and SHapley Additive exPlanations (SHAP).
  • RESULTS: LightGBM was identified as the best-performing algorithm for both outcomes. For 1-year mortality prediction without percent necrosis, LightGBM achieved an area under the receiver operating characteristic curve (AUC-ROC) of 0.78 (95% confidence interval [CI] 0.70-0.86) during cross-validation and 0.72 on internal validation. For distant metastasis prediction, the LightGBM model without percent necrosis achieved an AUC-ROC of 0.77 (95% CI 0.71-0.84) during cross-validation and 0.77 on internal validation. Including percent necrosis did not significantly improve model performance. The top predictors identified were patient age, largest tumor dimension, and tumor stage.
  • CONCLUSIONS: Explainable ML models can effectively predict the 1-year risk of mortality and new distant metastases in patients undergoing lower-limb oncological resection and endoprosthetic reconstruction. Further external validation and consideration of other data modalities are required before integrating these ML-driven risk assessments into routine clinical practice.
  • LEVEL OF EVIDENCE: Level II, Prognostic Study. See Instructions for Authors for a complete description of levels of evidence.
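
The evaluation steps named in the abstract (a 30% hold-out set, AUC-ROC, and permutation-based feature importance) can be illustrated with a minimal sketch in plain Python. This is not the authors' pipeline: the data are synthetic, the helper names (`auc_roc`, `stratified_holdout`, `permutation_importance`) are hypothetical, stratifying the split is an assumption, and a simple class-mean-difference scorer stands in for LightGBM.

```python
import random

def auc_roc(y_true, scores):
    """AUC as the chance a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_holdout(y, test_frac=0.30, seed=0):
    """Split indices so each class contributes about test_frac of its rows to the test set."""
    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(set(y)):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        k = round(len(idx) * test_frac)
        test += idx[:k]
        train += idx[k:]
    return sorted(train), sorted(test)

def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean drop in AUC when a single feature column is shuffled across rows."""
    rng = random.Random(seed)
    baseline = auc_roc(y, [predict(row) for row in X])
    drops = []
    for j in range(len(X[0])):
        total = 0.0
        for _ in range(n_repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            total += baseline - auc_roc(y, [predict(r) for r in X_perm])
        drops.append(total / n_repeats)
    return baseline, drops

# Synthetic data (assumption): feature 0 is informative, feature 1 is pure noise.
rng = random.Random(42)
y = [1 if rng.random() < 0.3 else 0 for _ in range(300)]
X = [[rng.gauss(2.0 * label, 1.0), rng.gauss(0.0, 1.0)] for label in y]

train_idx, test_idx = stratified_holdout(y, test_frac=0.30, seed=1)

def class_mean(j, label):
    """Mean of feature j over training rows with the given label."""
    vals = [X[i][j] for i in train_idx if y[i] == label]
    return sum(vals) / len(vals)

# Stand-in model (assumption): weight each feature by its class-mean difference.
weights = [class_mean(j, 1) - class_mean(j, 0) for j in range(2)]

def predict(row):
    return sum(w * v for w, v in zip(weights, row))

X_test = [X[i] for i in test_idx]
y_test = [y[i] for i in test_idx]
baseline, drops = permutation_importance(predict, X_test, y_test)
print("hold-out AUC:", round(baseline, 3))
print("mean AUC drop per feature:", [round(d, 3) for d in drops])
```

Shuffling a feature the model relies on should lower the hold-out AUC noticeably, while shuffling a noise feature should leave it roughly unchanged. That contrast is what a permutation-based ranking of predictors reflects.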

authors

  • Deng, Jiawen
  • Moskalyk, Myron
  • Nayan, Madhur
  • Aoude, Ahmed
  • Ghert, Michelle
  • Bhatnagar, Sahir
  • Bozzo, Anthony

publication date

  • 2025