Controllable Text-to-Image Synthesis for Multi-Modality MR Images
Authors
Kyuri Kim ; Yoonho Na ; Sung-Joon Ye ; Jimin Lee ; Sung Soo Ahn ; Ji Eun Park
Citation
2024 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION, WACV 2024, pp. 7921-7930, 2024-04
Journal Title
2024 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION, WACV 2024
Issue Date
2024-04
Keywords
Applications ; Biomedical / healthcare / medicine ; Algorithms ; Vision + language and/or other modalities
Abstract
Generative modeling has seen significant advances in recent years, especially in text-to-image synthesis. Despite this progress, the medical field has yet to fully leverage large-scale foundation models for synthetic data generation. This paper introduces a framework for text-conditional magnetic resonance (MR) image generation that addresses the complexities of handling multiple MR modalities. The framework comprises a pre-trained large language model, a diffusion-based prompt-conditional image generation architecture, and an additional denoising network for input structural binary masks. Experimental results demonstrate that the proposed framework generates realistic, high-resolution, high-fidelity multi-modal MR images that align with medical text prompts. Furthermore, the study interprets the cross-attention maps of the generated results with respect to the conditioning text. These contributions lay a robust foundation for future work on text-conditional medical image generation and hold significant promise for accelerating medical imaging research.
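The paper does not publish code, so the following PyTorch sketch is only a rough illustration of the kind of pipeline the abstract describes: a diffusion denoiser conditioned on both a prompt embedding (as would come from a frozen large language model) and a binary structural mask, driven by a standard DDPM ancestral sampling loop. Every name, shape, and schedule here (MaskConditionalDenoiser, sample, the 64x64 toy resolution) is a hypothetical stand-in, not the authors' implementation.

```python
import torch
import torch.nn as nn

# Minimal sketch only: all names and hyperparameters are illustrative,
# not taken from the paper.

class MaskConditionalDenoiser(nn.Module):
    """Toy noise predictor conditioned on a prompt embedding and a binary mask."""

    def __init__(self, channels: int = 1, text_dim: int = 64):
        super().__init__()
        # The binary structural mask enters as an extra input channel.
        self.conv_in = nn.Conv2d(channels + 1, 32, kernel_size=3, padding=1)
        self.text_proj = nn.Linear(text_dim, 32)  # prompt embedding -> feature bias
        self.act = nn.SiLU()
        self.conv_out = nn.Conv2d(32, channels, kernel_size=3, padding=1)

    def forward(self, x_t, t, text_emb, mask):
        h = self.conv_in(torch.cat([x_t, mask], dim=1))
        h = h + self.text_proj(text_emb)[:, :, None, None]  # inject the prompt
        # Timestep conditioning is omitted here for brevity.
        return self.conv_out(self.act(h))


@torch.no_grad()
def sample(model, text_emb, mask, steps=50, shape=(1, 1, 64, 64)):
    """Plain DDPM ancestral sampling with a fixed linear beta schedule."""
    betas = torch.linspace(1e-4, 0.02, steps)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)
    x = torch.randn(shape)  # start from pure Gaussian noise
    for t in reversed(range(steps)):
        eps = model(x, t, text_emb, mask)  # predicted noise at step t
        mean = (x - betas[t] / torch.sqrt(1.0 - alpha_bars[t]) * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise
    return x


model = MaskConditionalDenoiser()
text_emb = torch.randn(1, 64)      # stand-in for a frozen LLM embedding of the prompt
mask = torch.zeros(1, 1, 64, 64)   # binary structural mask (e.g., an anatomy outline)
image = sample(model, text_emb, mask)
```

In the actual framework the mask is handled by a separate denoising network and the image model is a far larger prompt-conditional diffusion architecture; the sketch collapses those components into a single toy module to show only the conditioning pattern.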