Insufficient reporting quality in large language model studies in the field of radiology

Authors
 Suh, Pae Sun  ;  Jeong, So Yeong  ;  Ueda, Daiju  ;  Shim, Woo Hyun  ;  Heo, Hwon  ;  Woo, Chang-Yun  ;  Park, Hyungjun  ;  Suh, Chong Hyun 
Citation
 INSIGHTS INTO IMAGING, Vol.17(1), 2026-03 
Article Number
 71 
Journal Title
INSIGHTS INTO IMAGING
ISSN
 1869-4101 
Issue Date
2026-03
Keywords
Large language model ; Radiology ; Reporting quality ; Systematic review
Abstract
Objectives: Our systematic review aimed to evaluate the quality of reporting in research articles involving large language models (LLMs) in the field of radiology.

Materials and methods: After a search of the PubMed-MEDLINE and EMBASE databases, a total of 246 eligible studies published between November 30, 2022, and December 31, 2024, were included. The analysis assessed the percentage of studies adhering to the key reporting elements required for LLM research, based on the MInimum reporting items for CLear Evaluation of Accuracy Reports of Large Language Models in healthcare (MI-CLEAR-LLM) and the Transparent Reporting of a Multivariable Model for Individual Prognosis Or Diagnosis-large language models (TRIPOD-LLM) checklists. Studies published before and after July 25, 2024, were compared using a chi-square test.

Results: The most common topic was performance evaluation of LLMs on radiologic cases (44.3%, 109/246), followed by radiology reporting (37.8%, 93/246). Although all studies reported the LLM's name, only 27.6% (68/246) specified the model version, 35.8% (88/246) mentioned the access date, and 25.2% (62/246) mentioned application programming interface usage. Full prompts were provided in 41.1% (101/246) of studies. Output-probability-related items, including the number of attempts (22.8%, 56/246) and settings such as temperature (16.7%, 41/246), were under-reported. These reporting insufficiencies persisted in studies published both before and after July 25, 2024.

Conclusion: Most studies assessing large language models in radiology lacked sufficient reporting of the key elements required for LLM research. We recommend that authors adhere to these elements to ensure transparency and improve the reproducibility of future studies.

Critical relevance statement: Our study highlights the need for improved reporting quality and adherence to key elements to ensure transparent reporting and improve the reproducibility of future studies using large language models.

Key points:
- Numerous studies on large language models (LLMs) in radiology lack standardized methodologies, leading to high variability and inconsistent reporting.
- Our review demonstrated insufficient reporting of key elements for LLM research, particularly model details and output probability.
- Better reporting and adherence to key elements are essential for enhancing transparency and reproducibility in future LLM research.
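The before/after comparison described in the methods (a chi-square test on the proportion of studies reporting a given element before versus after July 25, 2024) can be sketched as follows. This is a minimal illustration only; the counts in the table are hypothetical and are not taken from the study's data.

```python
# Hypothetical sketch of the chi-square comparison described in the abstract:
# does the rate of reporting a key element (e.g., model version) differ
# between studies published before vs. after the cutoff date?
from scipy.stats import chi2_contingency

# 2x2 contingency table (all counts hypothetical, not from the paper):
# rows: published before / after July 25, 2024
# cols: reported the element / did not report it
table = [
    [30, 90],  # hypothetical: 30 of 120 earlier studies reported it
    [38, 88],  # hypothetical: 38 of 126 later studies reported it
]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2={chi2:.2f}, p={p:.3f}, dof={dof}")
```

A p-value above the chosen significance level would indicate no detectable change in reporting rates across the cutoff, which is the pattern the review reports for most elements.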
Files in This Item:
92181.pdf
DOI
10.1186/s13244-026-02236-1
Appears in Collections:
1. College of Medicine (의과대학) > Dept. of Radiology (영상의학교실) > 1. Journal Papers
Yonsei Authors
Suh, Pae Sun (서배선), ORCID: https://orcid.org/0000-0002-8618-9558
URI
https://ir.ymlib.yonsei.ac.kr/handle/22282913/211663
