Background/Objectives: Detecting lung nodules on computed tomography (CT) images is critical for diagnosing thoracic cancers. Deep learning models, particularly convolutional neural networks (CNNs), show promise in automating this process. This systematic review and meta-analysis aims to evaluate the diagnostic accuracy of these models, with lesion-wise sensitivity as the primary metric.
Methods: A comprehensive literature search identified 48 studies published up to 7 November 2023. Pooled diagnostic performance was assessed using a random-effects model, with lesion-wise sensitivity as the key outcome. Factors influencing model performance, including participant demographics, dataset privacy, and data splitting methods, were analyzed. Methodological rigor was maintained through the Checklist for Artificial Intelligence in Medical Imaging (CLAIM) and Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tools.
Trial Registration: This review is registered with PROSPERO under CRD42023479887.
Results: The meta-analysis revealed a pooled sensitivity of 79% (95% CI: 72–86%) for independent datasets and 85% (95% CI: 83–88%) across all datasets. Variability in performance was associated with dataset characteristics and study methodologies.
Conclusions: While deep learning models demonstrate significant potential in lung nodule detection, the findings highlight the need for more diverse datasets, standardized evaluation protocols, and interventional studies to enhance generalizability and clinical applicability. Further research is necessary to validate these models across broader patient populations.
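As an illustration of the random-effects pooling named in the Methods, the following is a minimal sketch and not the review's actual analysis code. It assumes a DerSimonian-Laird estimator applied to logit-transformed per-study sensitivities with a 0.5 continuity correction; the abstract does not state which estimator or transformation was used. The function name `pool_sensitivity_random_effects` and the example counts are hypothetical.

```python
import math


def pool_sensitivity_random_effects(studies):
    """Pool lesion-wise sensitivity across studies (illustrative sketch).

    studies: list of (true_positives, false_negatives) tuples, one per study.
    Returns (pooled_sensitivity, ci_lower, ci_upper, tau_squared).
    Assumption: DerSimonian-Laird estimator on the logit scale.
    """
    logits, variances = [], []
    for tp, fn in studies:
        # 0.5 continuity correction keeps the logit finite when fn == 0
        tp_c, fn_c = tp + 0.5, fn + 0.5
        logits.append(math.log(tp_c / fn_c))
        variances.append(1.0 / tp_c + 1.0 / fn_c)

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, logits)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, logits))

    # Between-study variance tau^2, truncated at zero
    k = len(logits)
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance
    w_re = [1.0 / (v + tau2) for v in variances]
    mu = sum(wi * yi for wi, yi in zip(w_re, logits)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    def inv_logit(x):
        return 1.0 / (1.0 + math.exp(-x))

    return inv_logit(mu), inv_logit(mu - 1.96 * se), inv_logit(mu + 1.96 * se), tau2


# Hypothetical (true positive, false negative) counts, not data from the review
example_studies = [(85, 15), (160, 40), (70, 30), (45, 5)]
pooled, ci_low, ci_high, tau2 = pool_sensitivity_random_effects(example_studies)
print(f"pooled sensitivity = {pooled:.3f} (95% CI {ci_low:.3f} to {ci_high:.3f}), tau^2 = {tau2:.3f}")
```

Pooling on the logit scale keeps the back-transformed confidence interval inside 0 to 1, which is one common reason to prefer it over pooling raw proportions.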