Improving mortality prediction after radiotherapy with large language model structuring of large-scale unstructured electronic health records

Authors
 Park, Sangjoon  ;  Wee, Chan Woo  ;  Choi, Seo Hee  ;  Kim, Kyung Hwan  ;  Chang, Jee Suk  ;  Yoon, Hong In  ;  Lee, Ik Jae  ;  Kim, Yong Bae  ;  Cho, Jaeho  ;  Keum, Ki Chang  ;  Lee, Chang Geol  ;  Byun, Hwa Kyung  ;  Koom, Woong Sub 
Citation
 RADIOTHERAPY AND ONCOLOGY, Vol.211, 2025-10 
Journal Title
RADIOTHERAPY AND ONCOLOGY
ISSN
 0167-8140 
Issue Date
2025-10
MeSH
Aged ; Deep Learning ; Electronic Health Records* ; Female ; Humans ; Large Language Models ; Machine Learning ; Male ; Middle Aged ; Neoplasms* / mortality ; Neoplasms* / radiotherapy
Keywords
Large language models ; Electronic health records ; Data structurization ; Radiotherapy ; Survival prediction
Abstract
Background and purpose: Avoiding unnecessary radiotherapy (RT) in patients with limited life expectancy requires accurate selection. Traditional survival models based on structured data often lack precision. Large language models (LLMs) offer a novel approach to structuring unstructured electronic health record (EHR) data, potentially improving survival predictions by integrating comprehensive clinical information.
Materials and methods: We analyzed structured and unstructured data from 34,276 RT-treated patients at Yonsei Cancer Center. An open-source LLM structured unstructured EHR data using single-shot learning. External validation included 852 patients from Yongin Severance Hospital. We compared the LLM's performance against a domain-specific medical LLM and a smaller variant. Survival prediction models using statistical, machine-learning, and deep-learning approaches incorporated both structured and LLM-structured data.
Results: The open-source LLM structured unstructured EHR data with 87.5 % accuracy, outperforming the domain-specific medical LLM (35.8 %). Larger LLMs were more effective in structuring clinically relevant features, such as general condition and disease extent, which correlated with survival. Incorporating LLM-structured features improved the deep learning model's C-index from 0.737 to 0.820 (internal validation) and from 0.779 to 0.842 (external validation). Risk stratification was also enhanced, with clearer differentiation among low-, intermediate-, and high-risk groups (p < 0.001). Additionally, models became more interpretable, as key LLM-structured features aligned with statistically significant predictors traditionally identified from structured data.
Conclusion: General-domain LLMs, despite not being fine-tuned for medical data, can effectively structure large-scale unstructured EHRs, significantly improving survival prediction accuracy and model interpretability. The RT-Surv framework highlights the potential of LLMs to enhance clinical decision-making and optimize RT treatment.
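Note on the reported metric: the abstract summarizes model quality with the C-index (concordance index). The sketch below is a minimal, illustrative implementation of Harrell's C-index for right-censored data, not the authors' evaluation code. The function name `concordance_index`, its argument names, and the toy values are assumptions made for illustration, and pairs with tied follow-up times are simply skipped here, which is one of several possible conventions.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    times        observed follow-up time per patient
    events       1 if death was observed, 0 if the patient was censored
    risk_scores  model output, where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: tied times are skipped for simplicity
        # "a" is the patient with the shorter follow-up time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # earlier time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier death received the higher risk score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up, not from the study): four patients, one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.6, 0.7, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk, which is the scale on which the reported change from 0.737 to 0.820 should be read.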
Full Text
https://www.sciencedirect.com/science/article/pii/S0167814025045566
Article Number
 111052 
DOI
10.1016/j.radonc.2025.111052
Appears in Collections:
1. College of Medicine (의과대학) > Dept. of Radiation Oncology (방사선종양학교실) > 1. Journal Papers
Yonsei Authors
Keum, Ki Chang(금기창) https://orcid.org/0000-0003-4123-7998
Koom, Woong Sub(금웅섭) https://orcid.org/0000-0002-9435-7750
Kim, Kyung Hwan(김경환)
Kim, Yong Bae(김용배) https://orcid.org/0000-0001-7573-6862
Park, Sang Joon(박상준)
Byun, Hwa Kyung(변화경) https://orcid.org/0000-0002-8964-6275
Wee, Chan Woo(위찬우)
Yoon, Hong In(윤홍인) https://orcid.org/0000-0002-2106-6856
Lee, Ik Jae(이익재) https://orcid.org/0000-0001-7165-3373
Lee, Chang Geol(이창걸) https://orcid.org/0000-0002-8702-881X
Chang, Jee Suk(장지석) https://orcid.org/0000-0001-7685-3382
Cho, Jae Ho(조재호) https://orcid.org/0000-0001-9966-5157
Choi, Seo Hee(최서희) https://orcid.org/0000-0002-4083-6414
URI
https://ir.ymlib.yonsei.ac.kr/handle/22282913/207996