Background/Objectives: Proper confidence interval estimation of the area under the receiver operating characteristic curve (AUC), the net reclassification index (NRI), and the integrated discrimination improvement (IDI) is an area of ongoing research. The most common confidence interval estimation methods employ asymptotic theory. However, developments in this area show that the normal distribution assumption degenerates under the null hypothesis for measures such as the change in AUC (ΔAUC) and IDI, so confidence intervals estimated under that assumption may be invalid. We aim to study the performance of confidence intervals derived from asymptotic theory and those derived with non-parametric bootstrapping methods.

Methods: We examine the performance of ΔAUC, NRI, and IDI in both the logistic and survival regression contexts. Through simulation, we explore empirical distributions and compare coverage probabilities of asymptotic confidence intervals with those produced by bootstrapping methods.

Results: The primary finding in both the logistic framework and the survival analysis framework is that the percentile confidence intervals (CIs) performed well regarding coverage, without compromise to their width; this finding was robust in most scenarios.

Conclusions: Our results suggest that the asymptotic intervals are only appropriate when the added parameter has a strong effect size, and that the percentile bootstrap interval exhibits at least reasonable coverage while maintaining the shortest width in nearly all simulated scenarios, making this interval the most reliable choice. We intend these recommendations to improve the accuracy of estimation and the overall assessment of discrimination improvement.
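As an illustration of the percentile bootstrap interval discussed above, the following is a minimal sketch for ΔAUC in the binary-outcome setting. It is not the authors' simulation code: the function names `auc` and `percentile_bootstrap_delta_auc`, the default of 1000 resamples, and the choice to resample fixed predicted scores (rather than refitting both models in every resample) are all assumptions made for brevity.

```python
import numpy as np


def auc(scores, labels):
    """AUC as the Mann-Whitney probability that a case outscores a control.

    Ties between a case and a control count as 1/2.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    cases = scores[labels]
    controls = scores[~labels]
    # Pairwise differences between every case and every control.
    diff = cases[:, None] - controls[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size


def percentile_bootstrap_delta_auc(base_scores, new_scores, labels,
                                   n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for delta AUC = AUC(new) - AUC(base).

    Assumption: the predicted scores are treated as fixed; a fuller
    analysis might refit both models within each bootstrap resample.
    """
    rng = np.random.default_rng(seed)
    base_scores = np.asarray(base_scores, dtype=float)
    new_scores = np.asarray(new_scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    n = labels.size

    deltas = []
    while len(deltas) < n_boot:
        # Resample subjects with replacement, keeping pairs together.
        idx = rng.integers(0, n, size=n)
        y = labels[idx]
        if y.all() or not y.any():
            continue  # AUC is undefined without both cases and controls
        deltas.append(auc(new_scores[idx], y) - auc(base_scores[idx], y))

    # Percentile interval: empirical quantiles of the bootstrap estimates.
    lower, upper = np.quantile(deltas, [alpha / 2, 1 - alpha / 2])
    point = auc(new_scores, labels) - auc(base_scores, labels)
    return point, lower, upper
```

Because the interval is read directly from the empirical quantiles of the resampled ΔAUC values, it does not rely on the normal approximation that the asymptotic interval requires.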
Improving the Estimation of Prediction Increment Measures in Logistic and Survival Analysis