Background/Objectives: Circulating tumor cells (CTCs) are important biomarkers for predicting prognosis and evaluating treatment efficacy in cancer. We developed the microfluidics-based “CTC-Chip” system, which enables highly sensitive CTC detection and prognostic assessment in lung cancer and malignant pleural mesothelioma. However, the final identification and enumeration of CTCs still require manual intervention, which is time-consuming, prone to human error, and dependent on experienced medical professionals. Machine learning-based medical image recognition can reduce this workload and increase automation, but CTCs are rare in clinical samples, which limits the training data available for building a robust CTC image recognition system. In this study, we established a highly accurate artificial intelligence-based CTC recognition system by pre-training convolutional neural networks on images of lung cancer cell lines. Methods: We applied transfer learning to convolutional neural networks. The models were first pre-trained on images obtained from lung cancer cell lines and then further trained on a limited number of clinical CTC images to improve accuracy. Results: Transfer learning significantly improved CTC classification accuracy to an average of 99.51%, compared with 96.96% for a model trained solely on cell line images (p < 0.05). The approach was particularly effective when clinical training images were limited, yielding statistically significant accuracy improvements with as few as 17 clinical CTC images (p < 0.05). Conclusions: Pre-training with cancer cell lines enables rapid and highly accurate automated CTC recognition even with limited clinical data, which enhances clinical applicability and suggests potential utility across diverse cancer diagnostic workflows.
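The two-stage procedure described in the Methods (pre-training on cell line images, followed by further training on a small set of clinical images) can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: a small logistic-regression classifier stands in for the convolutional neural networks, and two synthetic numeric features stand in for images. All names (`make_samples`, `train`, `accuracy`), the sample sizes other than 17, the `shift` parameter that mimics a gap between cell line and clinical data, and the learning settings are assumptions chosen only for illustration, so the printed accuracies bear no relation to the results reported above.

```python
import math
import random


def make_samples(n, shift, rng):
    """Return n synthetic (features, label) pairs with alternating labels.

    Label 1 stands in for "CTC", label 0 for "non-CTC". `shift` moves both
    class centers to mimic a gap between cell line and clinical images.
    """
    samples = []
    for i in range(n):
        label = i % 2
        center = (1.5 if label == 1 else -1.5) + shift
        features = [rng.gauss(center, 1.0), rng.gauss(center, 1.0)]
        samples.append((features, label))
    return samples


def predict_proba(weights, features):
    """Logistic model: weights[0] is the bias, the rest multiply the features."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train(weights, samples, epochs, lr, rng):
    """Run plain SGD on the log loss, starting from `weights`.

    Returns a new weight list; the input list is left unchanged.
    """
    weights = list(weights)
    order = list(samples)
    for _ in range(epochs):
        rng.shuffle(order)
        for features, label in order:
            error = predict_proba(weights, features) - label
            weights[0] -= lr * error
            for j, x in enumerate(features):
                weights[j + 1] -= lr * error * x
    return weights


def accuracy(weights, samples):
    """Fraction of samples classified correctly at a 0.5 threshold."""
    correct = sum(
        (predict_proba(weights, f) >= 0.5) == (y == 1) for f, y in samples
    )
    return correct / len(samples)


rng = random.Random(0)
cell_line_train = make_samples(400, shift=0.0, rng=rng)  # plentiful source data
clinical_train = make_samples(17, shift=1.5, rng=rng)    # scarce target data
clinical_test = make_samples(400, shift=1.5, rng=rng)

# Stage 1: pre-train from scratch on the synthetic "cell line" data.
pretrained = train([0.0, 0.0, 0.0], cell_line_train, epochs=20, lr=0.1, rng=rng)
# Stage 2: continue training from the pre-trained weights on 17 "clinical" samples.
fine_tuned = train(pretrained, clinical_train, epochs=200, lr=0.1, rng=rng)

acc_pretrained = accuracy(pretrained, clinical_test)
acc_fine_tuned = accuracy(fine_tuned, clinical_test)
print(f"cell-line model only: {acc_pretrained:.3f}")
print(f"after fine-tuning:    {acc_fine_tuned:.3f}")
```

The point of the sketch is the initialization: the second call to `train` starts from the pre-trained weights rather than from zeros, which is what allows a handful of target-domain samples to adjust the decision boundary instead of having to learn it from scratch.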