Hematopoietic stem cell transplantation (HSCT) is a life-saving therapy for hematologic malignancies, such as leukemia and lymphoma, and for other severe conditions, but it is associated with significant risks, including graft-versus-host disease (GVHD), relapse, and treatment-related mortality. The increasing complexity of clinical, genomic, and biomarker data has spurred interest in machine learning (ML), which has emerged as a transformative tool to enhance decision-making and optimize outcomes in HSCT. This review examines the applications of ML in HSCT, focusing on donor selection, conditioning regimens, and prediction of post-transplant outcomes. ML approaches, including decision trees, random forests, and neural networks, have demonstrated potential in improving donor compatibility algorithms, mortality and relapse prediction, and GVHD risk stratification. Integrating “omics” data with ML models has enabled the identification of novel biomarkers and the development of highly accurate predictive tools, supporting personalized treatment strategies. Despite these promising advances, challenges persist, including data standardization, algorithm interpretability, and ethical considerations regarding patient privacy. While ML holds promise for revolutionizing HSCT management, addressing these barriers through multicenter collaborations and regulatory frameworks remains essential for broader clinical adoption. In addition, ML itself has the potential to help address some of these challenges, such as data harmonization, patient data protection, and the availability of adequate infrastructure. Future research should prioritize larger datasets, multimodal data integration, and robust validation methods to fully realize ML’s transformative potential in HSCT.