Background/Objectives: The prediction of cancer types relies primarily on driver genes and their specific mutations. Advances in omics technologies have made additional genetic data available. Integrated with artificial intelligence models, these data have considerable potential to improve the accuracy of cancer diagnosis. Because mutational signatures provide insight into malfunctioning repair mechanisms, they too could support more accurate cancer diagnosis.

Methods: First, we compared unsupervised and supervised machine learning approaches for predicting cancer types. We employed deep and artificial neural network architectures with an explainable component, such as layer-wise relevance propagation, to extract the features most relevant to cancer-type prediction. Ten-fold cross-validation and an extensive grid search were used to optimize the neural network architecture, with driver gene mutations, mutational signatures, and topological mutation information as input. The PCAWG dataset was used to discriminate between 17 primary sites and 24 cancer types.

Results: Overall, our approach showed that the most relevant mutation information for discriminating between cancer types increased by >10% when using the whole genome, or intergenic and intronic genome regions, instead of exome information. Furthermore, for most cancer types, with two exceptions, the most relevant features lie in the mutational signatures rather than in the topological mutation information.

Conclusions: Informative mutational signatures outperformed driver gene mutations in predicting cancer types and added a new layer of diagnostic information. Because the information content of mutational signatures does not depend solely on frequency of occurrence, cancer types from the same primary site can even be separated by their different relevant mutations. Furthermore, comparing informative mutational signatures allowed specific impaired repair mechanisms to be assigned to cancer types.
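To make the explainability step in the Methods more concrete, the following is a minimal sketch of layer-wise relevance propagation on a tiny dense ReLU network. It assumes the epsilon rule, which the abstract does not specify, and the feature names, weights, and layer sizes are hypothetical placeholders rather than values from the study.

```python
# Illustrative sketch only: LRP (epsilon rule, an assumption) on a toy network.
# Feature names, weights, and layer sizes are hypothetical, not from the paper.

def dense(x, W, b):
    """Compute z_j = sum_i x_i * W[i][j] + b_j for one dense layer."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))]


def relu(z):
    return [max(0.0, v) for v in z]


def lrp_epsilon(x, W, b, relevance_out, eps=1e-9):
    """Redistribute relevance from a layer's outputs back to its inputs."""
    z = dense(x, W, b)
    # Stabilise the denominator away from zero while keeping its sign.
    s = [relevance_out[j] / (z[j] + (eps if z[j] >= 0 else -eps))
         for j in range(len(z))]
    return [x[i] * sum(W[i][j] * s[j] for j in range(len(s)))
            for i in range(len(x))]


def explain(x, layers, target):
    """Run a forward pass, then propagate the target score back to the inputs."""
    activations = [x]
    for k, (W, b) in enumerate(layers):
        z = dense(activations[-1], W, b)
        # ReLU on hidden layers, raw scores on the final layer.
        activations.append(z if k == len(layers) - 1 else relu(z))
    scores = activations[-1]
    # Start with all relevance on the class being explained.
    relevance = [scores[j] if j == target else 0.0 for j in range(len(scores))]
    for k in range(len(layers) - 1, -1, -1):
        W, b = layers[k]
        relevance = lrp_epsilon(activations[k], W, b, relevance)
    return scores, relevance


# Hypothetical input: three mutation-derived features for one tumour sample.
features = ["signature_A", "signature_B", "driver_gene_X"]
x = [0.6, 0.1, 0.3]
layers = [
    ([[1.0, -0.5], [0.2, 0.8], [-0.3, 0.4]], [0.0, 0.0]),  # 3 inputs -> 2 hidden
    ([[1.2, -0.7], [0.5, 0.9]], [0.0, 0.0]),               # 2 hidden -> 2 classes
]
scores, relevance = explain(x, layers, target=0)
ranked = sorted(zip(features, relevance), key=lambda p: abs(p[1]), reverse=True)
for name, r in ranked:
    print(f"{name}: {r:+.3f}")
```

Because the biases are zero in this toy setup, the input relevances sum to approximately the score of the explained class, which is the conservation property that makes the per-feature values readable as contributions. Ranking features by absolute relevance is one plausible way to arrive at a list of "most relevant features" per cancer type.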

Original article:

Explainable AI Model Reveals Informative Mutational Signatures for Cancer-Type Classification

Published: 22 May 2025

DOI: 10.3390/cancers17111731

Type: Article

Open access: Yes
