Key Measures for Evaluating Diagnostic Accuracy in Multi-Class Classification: An Overview and Simulation-Based Comparison
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Ryu, Leeha | - |
| dc.contributor.author | Han, Kyunghwa | - |
| dc.contributor.author | Jung, Inkyung | - |
| dc.contributor.author | Park, Yae Won | - |
| dc.contributor.author | Ahn, Sung Soo | - |
| dc.contributor.author | Hwang, Dosik | - |
| dc.contributor.author | 박예원 | - |
| dc.date.accessioned | 2026-04-07T02:08:18Z | - |
| dc.date.available | 2026-04-07T02:08:18Z | - |
| dc.date.created | 2026-04-01 | - |
| dc.date.issued | 2026-04 | - |
| dc.identifier.issn | 1229-6929 | - |
| dc.identifier.uri | https://ir.ymlib.yonsei.ac.kr/handle/22282913/211784 | - |
| dc.description.abstract | Recent advancements in artificial intelligence have led to increased interest in predictive modeling across various domains, including medicine. Although numerous metrics have been established for binary classification, the growing adoption of multi-class systems necessitates robust evaluation measures. However, comprehensive simulation studies comparing the performance of existing multi-class metrics under diverse data conditions remain limited. In this study, we first provide a concise overview of commonly used accuracy metrics for multi-class classification. Then, we report a simulation study that systematically evaluates several diagnostic accuracy measures under a wide range of scenarios, including three- and five-class settings, balanced and imbalanced sample sizes, and different distributional assumptions for predictors. We assessed each metric's performance in terms of bias and 95% confidence interval coverage. Under balanced conditions, most metrics demonstrated stable and unbiased performance, closely approximating the true values. However, under imbalanced conditions, greater bias was observed, with the M-index and polytomous discrimination index exhibiting comparatively more stable performance across various scenarios. The micro-averaged receiver operating characteristic curve area consistently showed higher bias under class imbalance. Finally, we applied these metrics to a glioma tumor grading task using external datasets. This study provides a systematic comparison of commonly used metrics and offers practical guidance for selecting appropriate measures in multi-class classification tasks. | - |
| dc.language | English | - |
| dc.publisher | Korean Society of Radiology | - |
| dc.relation.isPartOf | KOREAN JOURNAL OF RADIOLOGY | - |
| dc.subject.MESH | Artificial Intelligence* | - |
| dc.subject.MESH | Brain Neoplasms* / diagnostic imaging | - |
| dc.subject.MESH | Brain Neoplasms* / pathology | - |
| dc.subject.MESH | Computer Simulation | - |
| dc.subject.MESH | Glioma / diagnostic imaging | - |
| dc.subject.MESH | Glioma / pathology | - |
| dc.subject.MESH | Humans | - |
| dc.subject.MESH | ROC Curve | - |
| dc.title | Key Measures for Evaluating Diagnostic Accuracy in Multi-Class Classification: An Overview and Simulation-Based Comparison | - |
| dc.type | Article | - |
| dc.contributor.googleauthor | Ryu, Leeha | - |
| dc.contributor.googleauthor | Han, Kyunghwa | - |
| dc.contributor.googleauthor | Jung, Inkyung | - |
| dc.contributor.googleauthor | Park, Yae Won | - |
| dc.contributor.googleauthor | Ahn, Sung Soo | - |
| dc.contributor.googleauthor | Hwang, Dosik | - |
| dc.identifier.doi | 10.3348/kjr.2025.1447 | - |
| dc.relation.journalcode | J02884 | - |
| dc.identifier.eissn | 2005-8330 | - |
| dc.identifier.pmid | 41914484 | - |
| dc.subject.keyword | Multiclass classification | - |
| dc.subject.keyword | Polytomous outcome prediction | - |
| dc.subject.keyword | Accuracy | - |
| dc.subject.keyword | Performance | - |
| dc.subject.keyword | Metrics | - |
| dc.subject.keyword | Measure | - |
| dc.subject.keyword | Index | - |
| dc.contributor.affiliatedAuthor | Han, Kyunghwa | - |
| dc.contributor.affiliatedAuthor | Jung, Inkyung | - |
| dc.contributor.affiliatedAuthor | Park, Yae Won | - |
| dc.contributor.affiliatedAuthor | Ahn, Sung Soo | - |
| dc.contributor.affiliatedAuthor | Hwang, Dosik | - |
| dc.identifier.wosid | 001724504800006 | - |
| dc.citation.volume | 27 | - |
| dc.citation.number | 4 | - |
| dc.citation.startPage | 344 | - |
| dc.citation.endPage | 355 | - |
| dc.identifier.bibliographicCitation | KOREAN JOURNAL OF RADIOLOGY, Vol.27(4) : 344-355, 2026-04 | - |
| dc.identifier.rimsid | 92273 | - |
| dc.type.rims | ART | - |
| dc.description.journalClass | 1 | - |
| dc.subject.keywordAuthor | Multiclass classification | - |
| dc.subject.keywordAuthor | Polytomous outcome prediction | - |
| dc.subject.keywordAuthor | Accuracy | - |
| dc.subject.keywordAuthor | Performance | - |
| dc.subject.keywordAuthor | Metrics | - |
| dc.subject.keywordAuthor | Measure | - |
| dc.subject.keywordAuthor | Index | - |
| dc.subject.keywordPlus | PERFORMANCE | - |
| dc.subject.keywordPlus | PREDICTION | - |
| dc.subject.keywordPlus | CANCER | - |
| dc.subject.keywordPlus | MODEL | - |
| dc.type.docType | Article | - |
| dc.identifier.kciid | ART003316228 | - |
| dc.description.isOpenAccess | Y | - |
| dc.description.journalRegisteredClass | scie | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.description.journalRegisteredClass | kci | - |
| dc.relation.journalWebOfScienceCategory | Radiology, Nuclear Medicine & Medical Imaging | - |
| dc.relation.journalResearchArea | Radiology, Nuclear Medicine & Medical Imaging | - |