Key Measures for Evaluating Diagnostic Accuracy in Multi-Class Classification: An Overview and Simulation-Based Comparison

DC Field | Value | Language
dc.contributor.author | Ryu, Leeha | -
dc.contributor.author | Han, Kyunghwa | -
dc.contributor.author | Jung, Inkyung | -
dc.contributor.author | Park, Yae Won | -
dc.contributor.author | Ahn, Sung Soo | -
dc.contributor.author | Hwang, Dosik | -
dc.contributor.author | 박예원 | -
dc.date.accessioned | 2026-04-07T02:08:18Z | -
dc.date.available | 2026-04-07T02:08:18Z | -
dc.date.created | 2026-04-01 | -
dc.date.issued | 2026-04 | -
dc.identifier.issn | 1229-6929 | -
dc.identifier.uri | https://ir.ymlib.yonsei.ac.kr/handle/22282913/211784 | -
dc.description.abstract | Recent advancements in artificial intelligence have led to increased interest in predictive modeling across various domains, including medicine. Although numerous metrics have been established for binary classification, the growing adoption of multi-class systems necessitates robust evaluation measures. However, comprehensive simulation studies comparing the performance of existing multi-class metrics under diverse data conditions remain limited. In this study, we first provide a concise overview of commonly used accuracy metrics for multi-class classification. Then, we report a simulation study that systematically evaluates several diagnostic accuracy measures under a wide range of scenarios, including three- and five-class settings, balanced and imbalanced sample sizes, and different distributional assumptions for predictors. We assessed each metric's performance in terms of bias and 95% confidence interval coverage. Under balanced conditions, most metrics demonstrated stable and unbiased performance, closely approximating the true values. However, under imbalanced conditions, greater bias was observed, with the M-index and polytomous discrimination index exhibiting comparatively more stable performance across various scenarios. The micro-averaged receiver operating characteristic curve area consistently showed higher bias under class imbalance. Finally, we applied these metrics to a glioma tumor grading task using external datasets. This study provides a systematic comparison of commonly used metrics and offers practical guidance for selecting appropriate measures in multi-class classification tasks. | -
dc.language | English | -
dc.publisher | Korean Society of Radiology | -
dc.relation.isPartOf | KOREAN JOURNAL OF RADIOLOGY | -
dc.subject.MESH | Artificial Intelligence* | -
dc.subject.MESH | Brain Neoplasms* / diagnostic imaging | -
dc.subject.MESH | Brain Neoplasms* / pathology | -
dc.subject.MESH | Computer Simulation | -
dc.subject.MESH | Glioma / diagnostic imaging | -
dc.subject.MESH | Glioma / pathology | -
dc.subject.MESH | Humans | -
dc.subject.MESH | ROC Curve | -
dc.title | Key Measures for Evaluating Diagnostic Accuracy in Multi-Class Classification: An Overview and Simulation-Based Comparison | -
dc.type | Article | -
dc.contributor.googleauthor | Ryu, Leeha | -
dc.contributor.googleauthor | Han, Kyunghwa | -
dc.contributor.googleauthor | Jung, Inkyung | -
dc.contributor.googleauthor | Park, Yae Won | -
dc.contributor.googleauthor | Ahn, Sung Soo | -
dc.contributor.googleauthor | Hwang, Dosik | -
dc.identifier.doi | 10.3348/kjr.2025.1447 | -
dc.relation.journalcode | J02884 | -
dc.identifier.eissn | 2005-8330 | -
dc.identifier.pmid | 41914484 | -
dc.subject.keyword | Multiclass classification | -
dc.subject.keyword | Polytomous outcome prediction | -
dc.subject.keyword | Accuracy | -
dc.subject.keyword | Performance | -
dc.subject.keyword | Metrics | -
dc.subject.keyword | Measure | -
dc.subject.keyword | Index | -
dc.contributor.affiliatedAuthor | Han, Kyunghwa | -
dc.contributor.affiliatedAuthor | Jung, Inkyung | -
dc.contributor.affiliatedAuthor | Park, Yae Won | -
dc.contributor.affiliatedAuthor | Ahn, Sung Soo | -
dc.contributor.affiliatedAuthor | Hwang, Dosik | -
dc.identifier.wosid | 001724504800006 | -
dc.citation.volume | 27 | -
dc.citation.number | 4 | -
dc.citation.startPage | 344 | -
dc.citation.endPage | 355 | -
dc.identifier.bibliographicCitation | KOREAN JOURNAL OF RADIOLOGY, Vol.27(4) : 344-355, 2026-04 | -
dc.identifier.rimsid | 92273 | -
dc.type.rims | ART | -
dc.description.journalClass | 1 | -
dc.subject.keywordAuthor | Multiclass classification | -
dc.subject.keywordAuthor | Polytomous outcome prediction | -
dc.subject.keywordAuthor | Accuracy | -
dc.subject.keywordAuthor | Performance | -
dc.subject.keywordAuthor | Metrics | -
dc.subject.keywordAuthor | Measure | -
dc.subject.keywordAuthor | Index | -
dc.subject.keywordPlus | PERFORMANCE | -
dc.subject.keywordPlus | PREDICTION | -
dc.subject.keywordPlus | CANCER | -
dc.subject.keywordPlus | MODEL | -
dc.type.docType | Article | -
dc.identifier.kciid | ART003316228 | -
dc.description.isOpenAccess | Y | -
dc.description.journalRegisteredClass | scie | -
dc.description.journalRegisteredClass | scopus | -
dc.description.journalRegisteredClass | kci | -
dc.relation.journalWebOfScienceCategory | Radiology, Nuclear Medicine & Medical Imaging | -
dc.relation.journalResearchArea | Radiology, Nuclear Medicine & Medical Imaging | -
Appears in Collections:
1. College of Medicine (의과대학) > Dept. of Radiology (영상의학교실) > 1. Journal Papers
1. College of Medicine (의과대학) > Dept. of Biomedical Systems Informatics (의생명시스템정보학교실) > 1. Journal Papers

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.