This study aimed to develop machine learning (ML) classification models that differentiate, among patients with a prevascular mediastinal tumor (PMT), those who needed direct surgery from those who needed core needle biopsy. Patients with PMT who received a contrast-enhanced computed tomography (CECT) scan and initial management for PMT between January 2010 and December 2020 were included in this retrospective study. Fourteen ML algorithms were used to construct candidate classification models via the voting ensemble approach, based on preoperative clinical data and radiomic features extracted from the CECT images. The classification accuracy of clinical diagnosis was 86.1%. The first ensemble learning model was built by randomly choosing seven of the fourteen ML models and had a classification accuracy of 88.0% (95% CI = 85.8 to 90.3%). The second ensemble learning model combined five ML models (NeuralNetFastAI, NeuralNetTorch, RandomForest with Entropy, RandomForest with Gini, and XGBoost) and had a classification accuracy of 90.4% (95% CI = 87.9 to 93.0%), which significantly outperformed clinical diagnosis (p < 0.05). Given this superior performance, the voting ensemble learning clinical–radiomic classification model may be used as a clinical decision support system to facilitate the selection of the initial management of PMT.
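The voting ensemble idea described above can be illustrated with a minimal sketch. It assumes hard (majority) voting over five base classifiers; the abstract does not state whether hard or soft voting was used. The labels, the model outputs, and the function names `majority_vote`, `ensemble_predict`, and `accuracy` are hypothetical toy values chosen for illustration, not data or code from the study.

```python
from collections import Counter

# Hypothetical class labels for the two initial-management options.
SURGERY, BIOPSY = "surgery", "biopsy"


def majority_vote(votes):
    """Return the label predicted by most base models for one patient."""
    return Counter(votes).most_common(1)[0][0]


def ensemble_predict(per_model_predictions):
    """Combine predictions by hard voting.

    per_model_predictions holds one list per base model, and every list
    is aligned by patient. zip(*...) regroups the votes per patient.
    """
    return [majority_vote(votes) for votes in zip(*per_model_predictions)]


def accuracy(y_true, y_pred):
    """Fraction of patients whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy ground truth for five patients (assumed values, for illustration only).
y_true = [SURGERY, BIOPSY, SURGERY, SURGERY, BIOPSY]

# Toy outputs of five base models; each model makes at least one error.
model_outputs = [
    [SURGERY, BIOPSY, BIOPSY, SURGERY, BIOPSY],
    [SURGERY, SURGERY, SURGERY, SURGERY, BIOPSY],
    [BIOPSY, BIOPSY, SURGERY, SURGERY, SURGERY],
    [SURGERY, BIOPSY, SURGERY, BIOPSY, BIOPSY],
    [SURGERY, BIOPSY, SURGERY, SURGERY, SURGERY],
]

ensemble_pred = ensemble_predict(model_outputs)
ensemble_acc = accuracy(y_true, ensemble_pred)
single_accs = [accuracy(y_true, preds) for preds in model_outputs]
```

In this toy setup, every base model misclassifies at least one patient, yet the errors fall on different patients, so the majority vote recovers the correct label each time. Using an odd number of voters with two classes avoids ties.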