Background: Gross tumor volume (GTV) segmentation of nasopharyngeal carcinoma (NPC) largely determines the precision of image-guided radiation therapy (IGRT) for NPC. Compared with other cancers, clinical delineation of NPC is especially challenging because the tumor infiltrates the surrounding soft tissues and bones unpredictably, and clinicians routinely need multimodal information from CT and MRI series to identify its ambiguous boundary. However, conventional deep learning-based multimodal segmentation methods offer limited prediction accuracy and frequently perform no better than, or even worse than, single-modality segmentation models. This limited multimodal performance indicates that information is extracted from the input channels and integrated across them ineffectively. This study aims to develop a 3D Gaussian-Prompted Diffusion Model (3DGS-PDM) for more clinically targeted information extraction and more effective multimodal information integration, thereby enabling more accurate and clinically interpretable GTV segmentation for NPC.
Methods: The proposed 3DGS-PDM contours NPC tumors from multimodal clinical priors through a guided, stepwise process. The model contains two modules: a Gaussian Initialization Module, which uses 3D Gaussian Splatting to distill 3D Gaussian representations based on clinical priors from CT, MRI-t2, and MRI-t1-contrast-enhanced-fat-suppression (MRI-t1-cefs), respectively, and a Diffusion Segmentation Module, which generates the tumor segmentation step by step from the fused 3D Gaussian prompts.
We retrospectively collected paired CT and MRI series with clinical GTV annotations from 600 NPC patients at four hospitals, and divided the dataset into 480 training volumes and 120 testing volumes.
Results: The proposed method achieves a mean Dice similarity coefficient (DSC) of 84.29 ± 7.33, a mean average symmetric surface distance (ASSD) of 1.31 ± 0.63, and a 95th-percentile Hausdorff distance (HD95) of 4.76 ± 1.98 on primary NPC tumor (GTVp) segmentation, and a DSC of 79.25 ± 10.01, an ASSD of 1.19 ± 0.72, and an HD95 of 4.76 ± 1.71 on metastatic NPC tumor (GTVnd) segmentation. Comparative experiments show that the method significantly improves multimodal segmentation performance on NPC tumors and outperforms five other state-of-the-art methods. Visual evaluation of the segmentation prediction process and a three-step ablation study on the input channels further demonstrate the interpretability of the method.
Conclusions: This study proposes a high-performing and interpretable multimodal segmentation method for the GTV of NPC, which can help improve the precision of NPC treatment.
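For readers less familiar with the three metrics reported in the Results, the following is a minimal sketch of how DSC, ASSD, and HD95 can be computed for a pair of binary masks. It is illustrative only and is not the evaluation code used in this study. Several choices are assumptions made for the example: masks are represented as sets of voxel coordinates, surfaces are extracted with 6-connectivity, HD95 is taken as the larger of the two directed 95th-percentile distances (other definitions exist), and the helper names (`dice`, `surface`, `surface_metrics`, `cube`) are hypothetical.

```python
import math
from itertools import product

# 6-connected neighbor offsets used to decide whether a voxel lies on the surface.
_NEIGHBORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def dice(pred, ref):
    """Dice similarity coefficient between two voxel sets, in [0, 1]."""
    if not pred and not ref:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return 2.0 * len(pred & ref) / (len(pred) + len(ref))


def surface(mask):
    """Voxels of `mask` that have at least one 6-connected neighbor outside it."""
    return {
        v for v in mask
        if any((v[0] + dx, v[1] + dy, v[2] + dz) not in mask
               for dx, dy, dz in _NEIGHBORS)
    }


def directed_distances(src, dst, spacing):
    """Distance from every voxel in `src` to its nearest voxel in `dst` (brute force)."""
    def dist(a, b):
        return math.sqrt(sum(((a[i] - b[i]) * spacing[i]) ** 2 for i in range(3)))
    return [min(dist(a, b) for b in dst) for a in src]


def percentile(values, q):
    """Percentile with linear interpolation between order statistics, q in [0, 100]."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def surface_metrics(pred, ref, spacing=(1.0, 1.0, 1.0)):
    """Return (ASSD, HD95) for two non-empty voxel sets; units follow `spacing`."""
    sp, sr = surface(pred), surface(ref)
    d_pr = directed_distances(sp, sr, spacing)
    d_rp = directed_distances(sr, sp, spacing)
    assd = (sum(d_pr) + sum(d_rp)) / (len(d_pr) + len(d_rp))
    hd95 = max(percentile(d_pr, 95), percentile(d_rp, 95))
    return assd, hd95


def cube(origin, size):
    """Toy mask: an axis-aligned cube of `size` voxels per side."""
    ox, oy, oz = origin
    return {(ox + i, oy + j, oz + k) for i, j, k in product(range(size), repeat=3)}


# Toy example: a 4x4x4 reference cube and a prediction shifted by one voxel in x.
ref = cube((0, 0, 0), 4)
pred = cube((1, 0, 0), 4)
dsc = dice(pred, ref)
assd, hd95 = surface_metrics(pred, ref)
print(f"DSC={dsc:.3f}  ASSD={assd:.3f}  HD95={hd95:.3f}")
```

In this toy case the two cubes share 48 of their 64 voxels each, so the DSC is 0.75. The brute-force nearest-neighbor search is quadratic in the number of surface voxels, so it is only suitable for small illustrations; evaluation on clinical volumes would normally rely on distance transforms from an array library and on the physical voxel spacing of the scans.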