
Self-supervised multi-modal training from uncurated images and reports enables monitoring AI in radiology

DC Field: Value
dc.contributor.author: 박상준
dc.date.accessioned: 2024-05-23T03:05:39Z
dc.date.available: 2024-05-23T03:05:39Z
dc.date.issued: 2024-01
dc.identifier.issn: 1361-8415
dc.identifier.uri: https://ir.ymlib.yonsei.ac.kr/handle/22282913/199156
dc.description.abstract: The escalating demand for artificial intelligence (AI) systems that can monitor and supervise human errors and abnormalities in healthcare presents unique challenges. Recent advances in vision-language models suggest that monitoring AI could be realized by understanding both visual and textual concepts and their semantic correspondences. However, there has been limited success in applying vision-language models to the medical domain. Current vision-language models and learning strategies for photographic images and captions require a web-scale corpus of image-text pairs, which is often not feasible in the medical domain. To address this, we present the Medical Cross-attention Vision-Language model (Medical X-VL), which leverages key components tailored to the medical domain: self-supervised unimodal models in the medical domain and a fusion encoder to bridge them, momentum distillation, sentence-wise contrastive learning for medical reports, and sentence similarity-adjusted hard negative mining. We experimentally demonstrate that our model enables various zero-shot tasks for monitoring AI, ranging from zero-shot classification to zero-shot error correction. Our model outperformed current state-of-the-art models on two medical image datasets, suggesting a novel clinical application of our monitoring AI model to alleviate human errors. Our method demonstrates a specialized capacity for fine-grained understanding, a distinct advantage particularly applicable to the medical domain.
dc.description.statementOfResponsibility: restriction
dc.language: English
dc.publisher: Elsevier
dc.relation.isPartOf: MEDICAL IMAGE ANALYSIS
dc.rights: CC BY-NC-ND 2.0 KR
dc.subject.MESH: Artificial Intelligence*
dc.subject.MESH: Humans
dc.subject.MESH: Language
dc.subject.MESH: Learning
dc.subject.MESH: Radiography
dc.subject.MESH: Radiology*
dc.title: Self-supervised multi-modal training from uncurated images and reports enables monitoring AI in radiology
dc.type: Article
dc.contributor.college: College of Medicine (의과대학)
dc.contributor.department: Dept. of Radiation Oncology (방사선종양학교실)
dc.contributor.googleauthor: Sangjoon Park
dc.contributor.googleauthor: Eun Sun Lee
dc.contributor.googleauthor: Kyung Sook Shin
dc.contributor.googleauthor: Jeong Eun Lee
dc.contributor.googleauthor: Jong Chul Ye
dc.identifier.doi: 10.1016/j.media.2023.103021
dc.contributor.localId: A06513
dc.relation.journalcode: J02201
dc.identifier.eissn: 1361-8423
dc.identifier.pmid: 37952385
dc.identifier.url: https://www.sciencedirect.com/science/article/pii/S1361841523002815
dc.subject.keyword: Error detection
dc.subject.keyword: Monitoring AI
dc.subject.keyword: Radiograph
dc.subject.keyword: Vision-language model
dc.contributor.alternativeName: Park, Sang Joon
dc.contributor.affiliatedAuthor: 박상준
dc.citation.volume: 91
dc.citation.startPage: 103021
dc.identifier.bibliographicCitation: MEDICAL IMAGE ANALYSIS, Vol.91 : 103021, 2024-01
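The abstract mentions sentence-wise contrastive learning between radiographs and report sentences. As a rough illustration only (not the paper's exact formulation), a symmetric InfoNCE-style loss over matched image and sentence embeddings can be sketched as below; the function name, embedding shapes, and temperature value are all assumptions for illustration:

```python
import numpy as np

def sentencewise_contrastive_loss(img_emb, sent_emb, temperature=0.07):
    """Hypothetical sketch of a symmetric InfoNCE-style contrastive loss.

    img_emb:  (N, D) image embeddings
    sent_emb: (N, D) embeddings of the matching report sentences
    Matched pairs sit on the diagonal of the similarity matrix.
    """
    def normalize(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    img_emb, sent_emb = normalize(img_emb), normalize(sent_emb)
    # (N, N) cosine-similarity matrix, sharpened by the temperature
    logits = img_emb @ sent_emb.T / temperature

    def xent(l):
        # cross-entropy with the diagonal (matched pair) as the target class
        l = l - l.max(axis=1, keepdims=True)          # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(logp))

    # average the image-to-sentence and sentence-to-image directions
    return 0.5 * (xent(logits) + xent(logits.T))
```

The paper additionally adjusts hard negatives by sentence similarity, so near-duplicate report sentences are not treated as pure negatives; that weighting is omitted here for brevity.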
Appears in Collections:
1. College of Medicine (의과대학) > Dept. of Radiation Oncology (방사선종양학교실) > 1. Journal Papers


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.