Evaluation of Context-Aware Prompting Techniques for Classification of Tumor Response Categories in Radiology Reports Using Large Language Model

Authors
 Park, Jiwoo  ;  Sim, Woo Seob  ;  Yu, Jae Yong  ;  Park, Yu Rang  ;  Lee, Young Han 
Citation
 JOURNAL OF IMAGING INFORMATICS IN MEDICINE, 2025-09 
Journal Title
JOURNAL OF IMAGING INFORMATICS IN MEDICINE
ISSN
 2948-2925 
Issue Date
2025-09
Keywords
Large language model ; Natural language processing ; Radiologic report ; Artificial intelligence ; Disease progression
Abstract
Radiology reports are essential for medical decision-making, providing crucial data for diagnosing diseases, devising treatment plans, and monitoring disease progression. While large language models (LLMs) have shown promise in processing free-text reports, research on effective prompting techniques for radiologic applications remains limited. This study evaluated LLM-driven classification of tumor response categories (TRCs) from radiology reports and compared four prompt engineering techniques to optimize the model for this task in clinical applications. We included 3062 whole-spine contrast-enhanced magnetic resonance imaging (MRI) radiology reports for prompt engineering and validation. TRCs were labeled by two radiologists based on criteria modified from the Response Evaluation Criteria in Solid Tumors (RECIST) guidelines. The Llama3 instruct model was used to classify TRCs with four different prompts: General, In-Context Learning (ICL), Chain-of-Thought (CoT), and ICL with CoT. Area under the receiver operating characteristic curve (AUROC), accuracy, precision, recall, and F1-score were calculated for each prompt and model size (8B, 70B) on the test report dataset. The average AUROC for the ICL prompt (0.96 internally, 0.93 externally) and the ICL with CoT prompt (0.97 internally, 0.94 externally) exceeded that of the other prompts. Errors increased with prompt complexity, including 0.8% incomplete sentence errors and 11.3% probability-classification inconsistencies. This study demonstrates that context-aware LLM prompts substantially improved the efficiency and effectiveness of classifying TRCs from radiology reports, despite potential intrinsic hallucinations. While further improvements are required for real-world application, our findings suggest that context-aware prompts have significant potential for segmenting complex radiology reports and enhancing oncology clinical workflows.
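For readers unfamiliar with the evaluation metrics named in the abstract, the following is a minimal sketch of how accuracy and per-class precision, recall, and F1-score can be computed from reference labels and model predictions. It is an illustration only and not the authors' code: the function names and the category labels ("progression", "stable", "response") are hypothetical, and AUROC is left out because it requires per-class probabilities rather than hard labels.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of reports whose predicted category matches the reference label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen in either list."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # class p was predicted but is wrong
            fn[t] += 1          # class t was missed
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Hypothetical labels, for illustration only (not the study's categories or data).
reference = ["progression", "stable", "progression", "response"]
predicted = ["progression", "progression", "progression", "response"]

print(accuracy(reference, predicted))
for label, (p, r, f1) in per_class_scores(reference, predicted).items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Averaging the per-class values gives macro-averaged scores; whether the study used macro, micro, or weighted averaging is not stated in the abstract.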
Full Text
https://link.springer.com/article/10.1007/s10278-025-01685-2
DOI
10.1007/s10278-025-01685-2
Appears in Collections:
1. College of Medicine (의과대학) > Dept. of Radiology (영상의학교실) > 1. Journal Papers
1. College of Medicine (의과대학) > Dept. of Biomedical Systems Informatics (의생명시스템정보학교실) > 1. Journal Papers
Yonsei Authors
Park, Yu Rang(박유랑) ORCID: https://orcid.org/0000-0002-4210-2094
Park, Jiwoo(박지우)
Yu, Jae Yong(유재용)
Lee, Young Han(이영한) ORCID: https://orcid.org/0000-0002-5602-391X
URI
https://ir.ymlib.yonsei.ac.kr/handle/22282913/209798
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.