Background/Objectives: Lung and colon cancers remain among the most prevalent and fatal diseases worldwide, and their early detection is a serious challenge. The data used in this study was obtained from the Lung and Colon Cancer Histopathological Images Dataset, which comprises five different classes of image data, namely colon adenocarcinoma, colon normal, lung adenocarcinoma, lung normal, and lung squamous cell carcinoma, split into training (80%), validation (10%), and test (10%) subsets. In this study, we propose the ViT-DCNN (Vision Transformer with Deformable CNN) model, with the aim of improving cancer detection and classification using medical images.

Methods: The combination of the ViT’s self-attention capabilities with deformable convolutions allows for improved feature extraction, while also enabling the model to learn both holistic contextual information and fine-grained localized spatial details.

Results: On the test set, the model performed remarkably well, with an accuracy of 94.24%, an F1 score of 94.23%, recall of 94.24%, and precision of 94.37%, confirming its robustness in detecting cancerous tissues. Furthermore, our proposed ViT-DCNN model outperforms several state-of-the-art models, including ResNet-152, EfficientNet-B7, SwinTransformer, DenseNet-201, ConvNext, TransUNet, CNN-LSTM, MobileNetV3, and NASNet-A, across all major performance metrics.

Conclusions: By using deep learning and advanced image analysis, this model enhances the efficiency of cancer detection, thus representing a valuable tool for radiologists and clinicians. This study demonstrates that the proposed ViT-DCNN model can reduce diagnostic inaccuracies and improve detection efficiency. Future work will focus on dataset enrichment and enhancing the model’s interpretability to evaluate its clinical applicability. This paper demonstrates the promise of artificial-intelligence-driven diagnostic models in transforming lung and colon cancer detection and improving patient diagnosis.
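The four test-set metrics reported in the Results can be illustrated with a small sketch. The abstract does not say which multi-class averaging scheme was used; the code below assumes support-weighted averaging, under which recall always equals accuracy, consistent with both being reported as 94.24%. The function name `weighted_metrics`, the short class labels, and the toy predictions are hypothetical and are not taken from the paper.

```python
from collections import Counter

def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1.

    Assumption: support-weighted averaging over classes; the abstract
    does not state which averaging scheme the authors used.
    """
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)    # number of true samples per class
    predicted = Counter(y_pred)  # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    n = len(y_true)

    precision = recall = f1 = 0.0
    for c in labels:
        p_c = correct[c] / predicted[c] if predicted[c] else 0.0
        r_c = correct[c] / support[c] if support[c] else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        w = support[c] / n       # class weight = share of true samples
        precision += w * p_c
        recall += w * r_c
        f1 += w * f_c

    accuracy = sum(correct.values()) / n
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Toy labels for the five classes (hypothetical short names and predictions).
y_true = ["colon_aca", "colon_aca", "colon_n", "colon_n", "lung_aca",
          "lung_aca", "lung_n", "lung_n", "lung_scc", "lung_scc"]
y_pred = ["colon_aca", "colon_n", "colon_n", "colon_n", "lung_aca",
          "lung_scc", "lung_n", "lung_n", "lung_scc", "lung_scc"]

m = weighted_metrics(y_true, y_pred)
print(m)
```

With weighted averaging, each class's recall is multiplied by its share of the true samples, so the sum reduces to the number of correct predictions divided by the total, which is the accuracy. Precision and F1 do not simplify this way, which is why they can differ slightly from accuracy, as they do in the reported figures.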


Original article:

ViT-DCNN: Vision Transformer with Deformable CNN Model for Lung and Colon Cancer Detection

Published: 15 September 2025

DOI: 10.3390/cancers17183005

Type: Article

Open access: Yes
