(1) Background: This study aimed to systematically review papers on the diagnostic performance of deep learning (DL) methods for hepatocellular carcinoma (HCC) based on medical images. (2) Materials: We searched Embase, IEEE, PubMed, Web of Science, and the Cochrane Library for related studies published before 3 July 2023. Studies were eligible if they developed or applied DL methods to diagnose HCC from medical images. Binary diagnostic accuracy data were extracted to derive the outcomes of interest: sensitivity, specificity, and area under the curve (AUC). (3) Results: Of the forty-eight eligible studies initially identified, thirty were included in the meta-analysis. The pooled sensitivity was 89% (95% CI: 87–91), the pooled specificity was 90% (95% CI: 87–92), and the AUC was 0.95 (95% CI: 0.93–0.97). Subgroup analyses by image type (contrast-enhanced and non-contrast-enhanced images), imaging modality (ultrasound, magnetic resonance imaging, and computed tomography), and comparison between DL methods and clinicians consistently showed acceptable diagnostic performance of DL models. However, the publication bias and high heterogeneity observed between studies and subgroups may have led to an overestimation of the diagnostic accuracy of DL methods in medical imaging. (4) Conclusions: Future studies would benefit from more rigorous reporting standards that specifically address the challenges of DL research in this field.
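As a minimal illustration of how sensitivity and specificity follow from binary (2×2) diagnostic accuracy data, the sketch below computes both per study and then forms a naive pooled estimate by summing counts across studies. The study counts, the `ConfusionCounts` type, and the function names are hypothetical. Summing counts is only a simplification for illustration and is not the pooling model used in the review, which the abstract does not specify. The Wilson score interval is likewise an assumed choice for the confidence interval.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class ConfusionCounts:
    """2x2 diagnostic table for one study (hypothetical container)."""
    tp: int  # HCC cases correctly flagged by the model
    fp: int  # non-HCC cases incorrectly flagged
    fn: int  # HCC cases missed
    tn: int  # non-HCC cases correctly cleared


def sensitivity(c: ConfusionCounts) -> float:
    # Proportion of true HCC cases that the model detects.
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    # Proportion of non-HCC cases that the model correctly rules out.
    return c.tn / (c.tn + c.fp)


def wilson_interval(successes: int, total: int, z: float = 1.96):
    # Wilson score interval for a binomial proportion (z = 1.96 for ~95%).
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


def naive_pool(studies):
    # Assumption: pool by summing counts, ignoring between-study heterogeneity.
    return ConfusionCounts(
        tp=sum(s.tp for s in studies),
        fp=sum(s.fp for s in studies),
        fn=sum(s.fn for s in studies),
        tn=sum(s.tn for s in studies),
    )


# Hypothetical counts, not taken from any study in the review.
studies = [
    ConfusionCounts(tp=90, fp=10, fn=10, tn=90),
    ConfusionCounts(tp=45, fp=8, fn=5, tn=42),
]

pooled = naive_pool(studies)
pooled_sens = sensitivity(pooled)
pooled_spec = specificity(pooled)
sens_low, sens_high = wilson_interval(pooled.tp, pooled.tp + pooled.fn)

print(f"pooled sensitivity = {pooled_sens:.2f}")
print(f"pooled specificity = {pooled_spec:.2f}")
print(f"sensitivity 95% CI = ({sens_low:.3f}, {sens_high:.3f})")
```

Because this naive approach ignores heterogeneity between studies, its intervals will typically be too narrow when heterogeneity is high, as the abstract reports it was here.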