Background: Early differentiation of benign from malignant lung tumors allows lesions to be diagnosed and appropriate interventions to begin sooner, which can substantially improve lung cancer patients' quality of life. Machine learning methods have performed well in classifying small lung nodules as benign or malignant, but further investigation is needed to realize their full potential for this task. Objective: This study aimed to develop and evaluate a ResNet50-Ensemble Voting model for classifying small pulmonary nodules (<20 mm) as benign or malignant from CT images. Methods: A total of 834 CT imaging samples from 396 patients with small pulmonary nodules were collected and randomly assigned to training and validation sets in an 8:2 ratio. ResNet50 and VGG16 were used to extract CT image features, and XGBoost, SVM, and Ensemble Voting were then used for classification, giving ten combined machine learning classifiers in total. The models were assessed using accuracy, sensitivity, and specificity. The extracted features were also visualized to examine the differences between them. Results: The proposed ResNet50-Ensemble Voting model performed best on the test set, with an accuracy of 0.943 (0.938, 0.948), a sensitivity of 0.964, and a specificity of 0.911. VGG16-Ensemble Voting achieved an accuracy of 0.887 (0.880, 0.894), a sensitivity of 0.952, and a specificity of 0.784. Conclusion: The ResNet50-Ensemble Voting model performed very well in distinguishing benign from malignant small pulmonary nodules (<20 mm) at various sites and may help clinicians determine the nature of early-stage lung nodules more accurately in clinical practice.
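The voting and evaluation steps described in the Methods can be illustrated with a minimal sketch. It assumes hard (majority) voting over three base classifiers and uses made-up labels; the abstract does not state the voting rule or the members of the ensemble, so the names `xgb_pred`, `svm_pred`, and `third_pred` and all values below are hypothetical, not the authors' implementation or data.

```python
from collections import Counter


def majority_vote(predictions):
    """Hard voting: pick the most common label per sample across classifiers.

    `predictions` is a list of equal-length label lists, one per classifier.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, and specificity from the confusion-matrix counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Avoid ZeroDivisionError when a class is absent from y_true.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate (malignant found)
        "specificity": ratio(tn, tn + fp),  # true negative rate (benign found)
    }


# Toy labels (1 = malignant, 0 = benign); purely illustrative values.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
xgb_pred = [1, 1, 0, 0, 0, 1, 1, 0]    # hypothetical XGBoost output
svm_pred = [1, 0, 1, 0, 1, 0, 1, 0]    # hypothetical SVM output
third_pred = [1, 1, 1, 0, 0, 1, 0, 0]  # hypothetical third classifier

ensemble = majority_vote([xgb_pred, svm_pred, third_pred])
metrics = binary_metrics(y_true, ensemble)
print(ensemble, metrics)
```

In the study's pipeline the classifier inputs would be the features extracted by ResNet50 or VGG16 rather than hand-written label lists. With an odd number of binary classifiers, hard voting cannot tie, which is why three voters are used in this sketch.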