Application of MALDI-TOF MS and machine learning for the detection of SARS-CoV-2 and non-SARS-CoV-2 respiratory infections

abstract

  • ABSTRACT Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) could aid the diagnosis of acute respiratory infections (ARIs) owing to its affordability and high-throughput capacity. MALDI-TOF MS has been proposed for use on commonly available respiratory samples without specialized sample preparation, which makes the technology especially attractive for low-resource regions. Here, we assessed the utility of MALDI-TOF MS in differentiating severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections from non-COVID acute respiratory infections (NCARIs) in a clinical lab setting in Kazakhstan. Nasopharyngeal swabs were collected in 2020–2022 from inpatients and outpatients with respiratory symptoms and from asymptomatic controls (ACs). PCR was used to differentiate SARS-CoV-2+ and NCARI cases. MALDI-TOF MS spectra were obtained for a total of 252 samples (115 SARS-CoV-2+, 98 NCARIs, and 39 ACs) without specialized sample preparation. In our first sub-analysis, we followed a published protocol for peak preprocessing and machine learning (ML), with models trained on publicly available spectra from South American SARS-CoV-2+ and NCARI samples. In our second sub-analysis, we trained ML models on a peak intensity matrix representative of both South American (SA) and Kazakhstan (Kaz) samples. With the established MALDI-TOF MS pipeline applied “as is”, the top-performing random forest model detected SARS-CoV-2+ samples at a high rate (91.0%) but showed low accuracy for NCARIs (48.0%) and ACs (67.0%). After re-training of the ML algorithms on the SA-Kaz peak intensity matrix, the top-performing model, a support vector machine with a radial basis function kernel, reached detection accuracies of 88.0%, 95.0%, and 78% for the Kazakhstan SARS-CoV-2+, NCARI, and AC subjects, respectively, with a SARS-CoV-2 vs rest receiver operating characteristic area under the curve of 0.983 [0.958, 0.987]; a high differentiation accuracy was maintained for the South American SARS-CoV-2 and NCARI samples. MALDI-TOF MS/ML is a feasible approach for the differentiation of ARIs without specialized sample preparation. Implementing MALDI-TOF MS/ML in a real clinical lab setting will require continuous optimization to keep up with the rapidly evolving landscape of ARIs.
  • IMPORTANCE In this proof-of-concept study, the authors used matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) and machine learning (ML) to identify and distinguish acute respiratory infections (ARIs) caused by SARS-CoV-2 versus other pathogens in low-resource clinical settings, without the need for specialized sample preparation. The ML models were trained on a varied collection of MALDI-TOF MS spectra from studies conducted in Kazakhstan and South America. Initially, the MALDI-TOF MS/ML pipeline, trained exclusively on South American samples, was less effective at recognizing non-SARS-CoV-2 infections from Kazakhstan. Incorporating spectral signatures from Kazakhstan substantially increased detection accuracy. These results underscore the potential of MALDI-TOF MS/ML in resource-constrained settings to augment current approaches for detecting and differentiating ARIs.
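
The abstract reports two kinds of metrics: a per-class detection accuracy (for SARS-CoV-2+, NCARI, and AC samples) and a "SARS-CoV-2 vs rest" ROC area under the curve. The following is a minimal sketch of how such metrics could be computed from a classifier's outputs. It is not the authors' pipeline; the function names, the toy labels, and the toy scores are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def per_class_accuracy(y_true, y_pred):
    """Fraction of samples of each true class that were predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {cls: hits[cls] / n for cls, n in totals.items()}


def one_vs_rest_auc(y_true, scores, positive):
    """ROC AUC for `positive` vs all other classes.

    Uses the rank (Mann-Whitney) formulation: the probability that a
    randomly chosen positive sample scores higher than a randomly chosen
    non-positive sample, counting ties as one half.
    scores[i] is the model's score that sample i belongs to `positive`.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT study data.
y_true = ["SARS-CoV-2", "SARS-CoV-2", "SARS-CoV-2", "NCARI", "NCARI", "AC", "AC"]
y_pred = ["SARS-CoV-2", "SARS-CoV-2", "NCARI", "NCARI", "NCARI", "AC", "SARS-CoV-2"]
covid_score = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1, 0.6]  # assumed classifier scores

acc = per_class_accuracy(y_true, y_pred)
auc = one_vs_rest_auc(y_true, covid_score, "SARS-CoV-2")
print(acc)
print(round(auc, 3))
```

In practice the predictions and scores would come from a trained model (for example, a random forest or an RBF-kernel support vector machine) evaluated on held-out spectra; the sketch covers only the scoring step.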

authors

  • Yegorov, Sergey
  • Kadyrova, Irina
  • Korshukov, Ilya
  • Sultanbekova, Aidana
  • Kolesnikova, Yevgeniya
  • Barkhanskaya, Valentina
  • Bashirova, Tatiana
  • Zhunusov, Yerzhan
  • Li, Yevgeniya
  • Parakhina, Viktoriya
  • Kolesnichenko, Svetlana
  • Baiken, Yeldar
  • Matkarimov, Bakhyt
  • Vazenmiller, Dmitriy
  • Miller, Matthew S
  • Hortelano, Gonzalo
  • Turmukhambetova, Anar
  • Chesca, Antonella E
  • Babenko, Dmitriy

publication date

  • May 2, 2024