Article:

Monte Carlo Gradient Boosted Trees for Cancer Staging: A Machine Learning Approach

Original publication date: 24 July 2025

DOI: 10.3390/cancers17152452

Type: Article

Open access: Yes

 

Abstract:

Machine learning algorithms are commonly employed for classification and interpretation of high-dimensional data. The classification task is often broken down into two separate procedures, each handled by a different method, to achieve accurate results and produce interpretable outcomes. First, an effective subset of the high-dimensional features is extracted; the selected subset is then used to train a classifier. Gradient Boosted Trees (GBT) are an ensemble model that, owing to their robustness, ability to model complex nonlinear interactions, and feature interpretability, are well suited to complex applications. XGBoost (eXtreme Gradient Boosting) is a high-performance implementation of GBT that incorporates regularization, parallel computation, and efficient tree pruning, making it an efficient, interpretable, and scalable classifier with potential applications in medical data analysis. In this study, a Monte Carlo Gradient Boosted Trees (MCGBT) model is proposed for both feature reduction and classification. The proposed MCGBT method was applied to a lung cancer dataset for feature identification and classification. The dataset contains 107 radiomic features, which are quantitative imaging biomarkers extracted from CT scans. A reduced set of 12 radiomic features was identified, and patients were classified into different cancer stages. A cancer staging accuracy of 90.3% across 100 independent runs was achieved, on par with that obtained using the full set of 107 radiomic features, enabling lean and deployable classifiers.
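The abstract does not spell out how MCGBT works internally, so the following is only a minimal sketch of one plausible reading: fit a small gradient-boosted ensemble on many random subsamples, average each feature's share of the split gain across runs, keep the top-ranked features, and train the final classifier on that reduced set. Everything here is an assumption made for illustration: the function names (`fit_stump`, `boost`, `monte_carlo_select`, `make_data`), the use of depth-1 trees and logistic loss, the binary label, and the synthetic data in which only features 0 and 3 carry signal. It uses the standard library only and is not XGBoost or the authors' implementation.

```python
import math
import random


def fit_stump(X, r, features):
    """Find the least-squares threshold split of residuals r over the given features."""
    n, total = len(X), sum(r)
    best = (0.0, None, 0.0, 0.0, 0.0)  # gain, feature, threshold, left value, right value
    for j in features:
        order = sorted(range(n), key=lambda i: X[i][j])
        left = 0.0
        for pos in range(1, n):
            left += r[order[pos - 1]]
            lo, hi = X[order[pos - 1]][j], X[order[pos]][j]
            if lo == hi:
                continue  # cannot split between identical values
            right = total - left
            # Reduction in squared error relative to predicting one constant.
            gain = left ** 2 / pos + right ** 2 / (n - pos) - total ** 2 / n
            if gain > best[0]:
                best = (gain, j, (lo + hi) / 2, left / pos, right / (n - pos))
    return best


def boost(X, y, features, rounds=20, lr=1.0):
    """Gradient boosting with stumps on logistic loss; returns stumps and gain per feature."""
    F = [0.0] * len(X)
    stumps, gains = [], {j: 0.0 for j in features}
    for _ in range(rounds):
        # Negative gradient of the logistic loss: label minus predicted probability.
        r = [yi - 1.0 / (1.0 + math.exp(-fi)) for yi, fi in zip(y, F)]
        gain, j, thr, vl, vr = fit_stump(X, r, features)
        if j is None:
            break  # no split improves the fit
        stumps.append((j, thr, lr * vl, lr * vr))
        gains[j] += gain
        F = [fi + (lr * vl if x[j] <= thr else lr * vr) for fi, x in zip(F, X)]
    return stumps, gains


def predict(stumps, x):
    """Classify one sample by the sign of the summed stump outputs."""
    score = sum(vl if x[j] <= thr else vr for j, thr, vl, vr in stumps)
    return 1 if score > 0 else 0


def monte_carlo_select(X, y, n_runs=10, keep=2, frac=0.8, seed=0):
    """Average normalized split gain over random subsamples and keep the top features."""
    rng = random.Random(seed)
    features = list(range(len(X[0])))
    totals = {j: 0.0 for j in features}
    for _ in range(n_runs):
        idx = rng.sample(range(len(X)), int(frac * len(X)))
        _, gains = boost([X[i] for i in idx], [y[i] for i in idx], features)
        run_total = sum(gains.values()) or 1.0
        for j in features:
            totals[j] += gains[j] / run_total
    ranked = sorted(features, key=lambda j: totals[j], reverse=True)
    return sorted(ranked[:keep])


def make_data(n, n_features=8, seed=1):
    """Synthetic stand-in for radiomic features: only features 0 and 3 are informative."""
    rng = random.Random(seed)
    X = [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(n)]
    y = [1 if x[0] + x[3] > 0 else 0 for x in X]
    return X, y


X, y = make_data(300)
selected = monte_carlo_select(X[:200], y[:200])  # feature reduction on the training part
model, _ = boost(X[:200], y[:200], selected)     # final classifier on the reduced set
accuracy = sum(predict(model, x) == t for x, t in zip(X[200:], y[200:])) / 100
print(selected, accuracy)
```

Averaging normalized gain over many subsamples is meant to damp the run-to-run variation that a single fit's importance ranking can show. For a real staging task the binary label would become multi-class, and a library implementation such as XGBoost would replace the hand-written stumps.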

 

Original article link:

Monte Carlo Gradient Boosted Trees for Cancer Staging: A Machine Learning Approach
