
Article:

Comparing Performances of Predictive Models of Toxicity after Radiotherapy for Breast Cancer Using Different Machine Learning Approaches

Original publication date: 25 February 2024

DOI: 10.3390/cancers16050934

Type: Article

Open access: Yes

 

Abstract:

Purpose. Different ML models were compared to predict toxicity in RT on a large cohort (n = 1314). Methods. The endpoint was RTOG G2/G3 acute toxicity, resulting in 204/1314 patients with the event. The dataset, including 25 clinical, anatomical, and dosimetric features, was split into 984 for training and 330 for internal tests. The dataset was standardized; features with a high p-value at univariate LR and with Spearman ρ > 0.8 were excluded; synthesized data of the minority were generated to compensate for class imbalance. Twelve ML methods were considered. Model optimization and sequential backward selection were run to choose the best models with a parsimonious feature number. Finally, feature importance was derived for every model. Results. The model's performance was compared on a training-test dataset over different metrics: the best performance model was LightGBM. Logistic regression with three variables (LR3) selected via bootstrapping showed performances similar to the best-performing models. The AUC of test data is slightly above 0.65 for the best models (highest value: 0.662 with LightGBM). Conclusions. No model performed the best for all metrics: more complex ML models had better performances; however, models with just three features showed performances comparable to the best models using many (n = 13–19) features.
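The correlation filter mentioned in the Methods (excluding features with Spearman ρ > 0.8) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which feature of a correlated pair was dropped, so the greedy "keep the first one seen" rule, the feature names, and the synthetic data below are all hypothetical.

```python
import random
from statistics import mean

def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def drop_correlated(features, threshold=0.8):
    """Assumed greedy rule: keep a feature only if |rho| <= threshold
    against every feature already kept."""
    kept = []
    for name in features:
        if all(abs(spearman(features[name], features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Synthetic, hypothetical features (not data from the study).
random.seed(0)
n = 200
dose = [random.gauss(50, 5) for _ in range(n)]
features = {
    "mean_dose": dose,
    "max_dose": [d ** 3 for d in dose],  # monotone transform of mean_dose
    "age": [random.gauss(60, 10) for _ in range(n)],
}
kept = drop_correlated(features)
```

Because `max_dose` is a monotone transform of `mean_dose`, the two share identical ranks and the second one is filtered out, while the independently drawn `age` is retained.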

 

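Summary:

Purpose: This study compares how well different machine learning models predict radiotherapy toxicity in a large cohort (n = 1314). Methods: The endpoint was RTOG grade 2/3 (G2/G3) acute toxicity, which occurred in 204 of 1314 patients. The dataset comprised 25 clinical, anatomical, and dosimetric features and was split into 984 cases for training and 330 for internal testing. After standardization, features with a high p-value in univariate logistic regression or with a Spearman correlation coefficient ρ > 0.8 were removed, and synthetic minority-class data were generated to balance the classes. Twelve machine learning methods were evaluated; model optimization and sequential backward selection were used to identify the best models with a parsimonious set of features, and feature importance was then derived for each model. Results: Performance on the training and test sets was compared across several metrics, and LightGBM performed best. A three-variable logistic regression model (LR3), with variables selected by bootstrapping, performed comparably to the best models. The test-set AUC of the best models was slightly above 0.65 (highest: 0.662, with LightGBM). Conclusions: No single model was best on every metric. More complex machine learning models performed better, but models with only three features performed comparably to the best models using 13 to 19 features.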
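To make the class imbalance concrete, the sketch below computes the cohort event rate from the reported counts and shows one way synthetic minority samples can be generated. The source does not name the oversampling method or the number of synthetic samples, so the SMOTE-style interpolation, the helper `smote_like`, and the toy points here are assumptions for illustration only.

```python
import random

def event_rate(events, total):
    """Fraction of patients with the endpoint."""
    return events / total

def smote_like(minority, n_new, k=3, rng=None):
    """Assumed SMOTE-style scheme: each synthetic point lies on the segment
    between a random minority sample and one of its k nearest minority
    neighbours (squared Euclidean distance)."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(base, other)))
    return synthetic

# Counts reported in the abstract: 204 events among 1314 patients.
rate = event_rate(204, 1314)

# Toy two-feature minority samples (hypothetical, already standardized).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like(minority, n_new=6)
```

With roughly 15.5% of patients experiencing the event, a classifier trained on the raw data would see far more non-event cases, which is the imbalance the synthetic samples are meant to offset.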

 

Original article:

Comparing Performances of Predictive Models of Toxicity after Radiotherapy for Breast Cancer Using Different Machine Learning Approaches
