
Uncover This Tech Term: Large Vision-Language Models in Radiology

Authors
Faghani, Shahriar; Park, Yae Won; Park, Ji Eun
Citation
KOREAN JOURNAL OF RADIOLOGY, Vol. 27(4): 375-378, 2026-04
Journal Title
KOREAN JOURNAL OF RADIOLOGY
ISSN
1229-6929
Issue Date
2026-04
Keywords
Large vision-language model ; Vision-language model ; Large multimodal model ; Large language model ; Artificial intelligence ; Transformer
Abstract
Large multimodal models are typically transformer-based foundation models that can process and generate multiple types of data (modalities), including text, images, audio, and video [1,2]. Large vision-language models (LVLMs) are a subset of large multimodal models that specifically focus on aligning and integrating visual and linguistic modalities. Conventional, task-specific models are trained to perform well-defined narrow tasks and have limited adaptability. By contrast, LVLMs generalize across diverse tasks and support flexible downstream applications without requiring task-specific retraining.
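
As a minimal illustration of the vision-language alignment the abstract describes, the following Python sketch uses the openly available CLIP model through the Hugging Face transformers library. This is an assumed toolchain not named in the article, and CLIP is a compact contrastively trained vision-language model rather than a generative LVLM; the image path and label strings are hypothetical placeholders. The sketch scores a radiograph against candidate text findings without any task-specific retraining.

# Minimal sketch of vision-language alignment with CLIP via Hugging Face
# transformers (assumed toolchain; not prescribed by the article).
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Load a pretrained vision-language model and its paired processor.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Hypothetical inputs: a local radiograph and candidate textual findings.
image = Image.open("chest_xray.png").convert("RGB")
labels = ["a chest radiograph with pneumonia",
          "a normal chest radiograph"]

# Encode both modalities into a shared embedding space and compare them.
inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)

# logits_per_image holds image-text similarity scores; softmax turns them
# into zero-shot classification probabilities, with no retraining needed.
probs = outputs.logits_per_image.softmax(dim=1)
for label, p in zip(labels, probs[0].tolist()):
    print(f"{label}: {p:.3f}")

Because the image and text encoders project into a shared embedding space, the same weights can score arbitrary label sets, which illustrates the "no task-specific retraining" property the abstract attributes to LVLMs.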
Files in This Item:
92270.pdf
DOI
10.3348/kjr.2025.1813
Appears in Collections:
1. College of Medicine (의과대학) > Dept. of Radiology (영상의학교실) > 1. Journal Papers
Yonsei Authors
Park, Yae Won (박예원) ORCID: https://orcid.org/0000-0001-8907-5401
URI
https://ir.ymlib.yonsei.ac.kr/handle/22282913/211787
