Purpose: This study aimed to develop a retrained large language model (LLM) tailored to the needs of head and neck (HN) cancer patients treated with radiotherapy, with emphasis on symptom management and survivorship care. Methods: A comprehensive external database was curated for training ChatGPT-4, integrating expert-identified consensus guidelines on supportive care for HN patients and correspondence from physicians and nurses in our institution's electronic medical records for 90 HN patients. Model performance was evaluated on 20 post-treatment patient inquiries, with the model's responses assessed by three board-certified radiation oncologists (RadOncs). Each response was rated on a 5-point Likert scale from 1 (strongly disagree) to 5 (strongly agree) for accuracy, clarity, completeness, and relevance. Results: The average scores across the 20 tested questions were 4.25 for accuracy, 4.35 for clarity, 4.22 for completeness, and 4.32 for relevance. Overall, 91.67% (220 of 240) of assessments received scores of 3 or higher, and 83.33% (200 of 240) received scores of 4 or higher. Conclusion: The custom-trained model demonstrates high accuracy in supporting HN patients, offering evidence-based information and guidance on symptom management and survivorship care.
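The assessment counts reported in the Results can be checked with a short calculation. The sketch below assumes that the 240 assessments arise from 20 questions, each rated by 3 radiation oncologists on 4 criteria; the abstract reports the total but does not state this breakdown explicitly, and the variable and function names are illustrative only.

```python
# Assumed breakdown (not stated explicitly in the abstract):
# 20 questions x 3 raters x 4 criteria = 240 individual assessments.
N_QUESTIONS = 20
N_RATERS = 3
N_CRITERIA = 4  # accuracy, clarity, completeness, relevance

total = N_QUESTIONS * N_RATERS * N_CRITERIA


def share(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


# Counts taken from the Results section of the abstract.
at_least_3 = share(220, total)
at_least_4 = share(200, total)

print(total, at_least_3, at_least_4)
```

Under this assumption the reported percentages follow directly from the reported counts.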