B-cell acute lymphoblastic leukaemia (B-ALL) is characterised by diverse genomic alterations, the most frequent being gene fusions, which are detected via transcriptomic analysis (mRNA-seq). Because the Immunoglobulin Heavy Chain (IGH) locus is hypervariable, gene fusions involving it can be difficult to detect with standard gene fusion calling algorithms, and detection requires significant computational resources and analysis time. We aimed to optimize a gene fusion calling workflow to achieve the highest attainable sensitivity for IGH gene fusion detection. Using Nextflow, we developed a simplified workflow containing the algorithms FusionCatcher, Arriba, and STAR-Fusion. We analysed samples from 35 patients harbouring IGH fusions (IGH::CRLF2, n = 17; IGH::DUX4, n = 15; IGH::EPOR, n = 3) and assessed the detection rate of each caller, before optimizing the parameters to enhance sensitivity for IGH fusions. Initial results showed that FusionCatcher and Arriba outperformed STAR-Fusion (85–89% vs. 29% of IGH fusions reported). We found that extensive filtering in STAR-Fusion hindered IGH reporting. By adjusting specific filtering steps (e.g., read support, fusion fragments per million total reads), we achieved a 94% reporting rate for IGH fusions with STAR-Fusion. This analysis highlights the importance of filtering optimization for IGH gene fusion events and offers alternative workflows for difficult-to-detect high-risk B-ALL subtypes.
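To illustrate why read-support and fusion-fragments-per-million (FFPM) filters can suppress low-evidence IGH calls, the following is a minimal sketch of such a filter. It is not the STAR-Fusion implementation: the `FusionCall` class, the function names, the example read counts, and the threshold values are all hypothetical and are not taken from the study. FFPM is computed here as supporting fragments divided by total reads in millions.

```python
from dataclasses import dataclass


@dataclass
class FusionCall:
    """Hypothetical container for one candidate fusion call."""
    name: str
    junction_reads: int   # reads crossing the fusion breakpoint
    spanning_frags: int   # read pairs flanking the breakpoint


def ffpm(call: FusionCall, total_reads: int) -> float:
    """Fusion fragments per million total reads."""
    support = call.junction_reads + call.spanning_frags
    return support / (total_reads / 1_000_000)


def passes_filters(call: FusionCall, total_reads: int,
                   min_support: int, min_ffpm: float) -> bool:
    """Keep a call only if it meets both evidence thresholds."""
    support = call.junction_reads + call.spanning_frags
    return support >= min_support and ffpm(call, total_reads) >= min_ffpm


# Illustrative (made-up) candidate calls and library size.
calls = [
    FusionCall("IGH::CRLF2", junction_reads=1, spanning_frags=1),
    FusionCall("IGH::DUX4", junction_reads=3, spanning_frags=2),
]
total_reads = 40_000_000

# Assumed thresholds: a stricter setting and a relaxed one.
strict = [c.name for c in calls if passes_filters(c, total_reads, 3, 0.1)]
relaxed = [c.name for c in calls if passes_filters(c, total_reads, 1, 0.0)]

print(strict)   # ['IGH::DUX4']
print(relaxed)  # ['IGH::CRLF2', 'IGH::DUX4']
```

In this toy example, the low-evidence IGH::CRLF2 call (2 supporting fragments, FFPM 0.05) is dropped under the stricter thresholds and reported once they are relaxed. Relaxing filters in this way would also be expected to admit more false positives, so any such thresholds would need to be validated against known fusion-positive and fusion-negative samples.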