Comparative evaluation of generative artificial intelligence models for synthetic knee radiograph augmentation in clinical research

Authors
 Chung, Kwangho; Nam, Ji-Hoon; Dosset, Arailym; Koh, Yong-Gon; Kim, Jae Min; Kim, Paul Shinil; Lee, Jin Woo; Park, Kyoung-Mi; Kwon, Hyuck Min; Kang, Kyoung-Tak
Citation
 BMC MEDICAL IMAGING, Vol.26(1), 2026-03 
Article Number
 182 
Journal Title
BMC MEDICAL IMAGING
ISSN
 1471-2342 
Issue Date
2026-03
Keywords
X-ray ; DCGAN ; StyleGAN3 ; CycleGAN ; Knee
Abstract
Background: This study evaluated the capability of state-of-the-art generative models to synthesize realistic knee radiographs, with the aim of addressing dataset scarcity in osteoarthritis (OA) research.
Methods: Three generative frameworks were trained on 10,042 real knee X-rays: Style Generative Adversarial Network 3 (StyleGAN3), a Stable Diffusion + Cycle-consistent Generative Adversarial Network (CycleGAN) pipeline, and Deep Convolutional Generative Adversarial Network (DCGAN). Image quality was assessed using the Fréchet Inception Distance (FID), while visual fidelity was evaluated via a Visual Turing Test conducted by two orthopedic surgeons and a musculoskeletal radiologist. To assess anatomical fidelity, the Joint Line Convergence Angle (JLCA) was compared between real and synthetic images. Inter- and intra-observer reliability for JLCA was measured using intraclass correlation coefficients (ICC).
Results: StyleGAN3 achieved the best performance (FID 10.84), showing high visual and anatomical fidelity. Integrating Stable Diffusion with CycleGAN yielded a moderate FID of 39.79, suggesting that adversarial enhancements improved the diffusion-based synthesis. DCGAN showed lower quality, with an FID of 74.15. Expert accuracy in distinguishing real from synthetic images ranged from 36% to 88%, indicating that visual differentiation was difficult. JLCA measurements showed no significant difference between real (4.19 ± 3.07°) and synthetic (3.36 ± 2.19°) images generated by DCGAN (p = 0.12). Similarly, Diffusion + CycleGAN (3.91 ± 2.59° vs. 3.72 ± 2.52°, p = 1.00) and StyleGAN3 (4.27 ± 3.01° vs. 3.60 ± 2.37°, p = 0.25) showed no statistically significant differences. These results indicate that all evaluated generative models maintained high anatomical fidelity relative to real radiographs. Inter-observer agreement was strong, with ICC values ranging from 0.83 to 0.97. Intra-observer reliability was also excellent.
Conclusion: StyleGAN3 generated the most realistic knee radiographs. Diffusion-based pipelines showed promising results when enhanced with adversarial networks. These findings underscore the potential of generative AI to mitigate data limitations in orthopedic research.
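For readers unfamiliar with the FID values quoted in the abstract, the following is a minimal sketch of the Fréchet distance between two Gaussians fitted to feature vectors, which is the quantity FID is based on (lower means the two feature distributions are closer). It is not the authors' evaluation code. The function name `frechet_distance` and the small random arrays are illustrative assumptions; in a real FID computation the inputs would be Inception-network embeddings of the real and synthetic images, which this record does not describe.

```python
import numpy as np


def frechet_distance(feats_real, feats_synth):
    """Frechet distance between Gaussians fitted to two feature sets.

    Both inputs are arrays of shape (n_samples, n_features). In FID the
    features would be Inception embeddings (assumed here, not computed).
    """
    mu_r = feats_real.mean(axis=0)
    mu_s = feats_synth.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_s = np.cov(feats_synth, rowvar=False)

    # Tr((cov_r cov_s)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_s, which are real and non-negative for
    # covariance matrices. Clip tiny negative values from rounding error.
    eigvals = np.linalg.eigvals(cov_r @ cov_s)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_s
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_s) - 2.0 * tr_sqrt)


# Illustrative data only: random vectors standing in for image features.
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 4))
shifted = real + np.array([1.0, 2.0, 0.0, 0.0])

print(frechet_distance(real, real))     # close to 0 for identical sets
print(frechet_distance(real, shifted))  # close to 1^2 + 2^2 = 5
```

Because the second set is only a shifted copy of the first, the covariance terms cancel and the distance reduces to the squared difference of the means.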
Files in This Item:
92533.pdf
DOI
10.1186/s12880-026-02244-z
Appears in Collections:
1. College of Medicine (의과대학) > Dept. of Orthopedic Surgery (정형외과학교실) > 1. Journal Papers
Yonsei Authors
Kwon, Hyuck Min (권혁민) https://orcid.org/0000-0002-2924-280X
Lee, Jin Woo (이진우) https://orcid.org/0000-0002-0293-9017
Chung, Kwangho (정광호) https://orcid.org/0000-0003-3097-3332
URI
https://ir.ymlib.yonsei.ac.kr/handle/22282913/212007
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.