Comparison of resampling methods for dealing with imbalanced data in binary classification problem

Other Titles
 이분형 자료의 분류문제에서 불균형을 다루기 위한 표본재추출 방법 비교
Authors
 박근우 ; 정인경
Citation
 Korean Journal of Applied Statistics (응용통계연구), Vol.32(3) : 349-374, 2019
Journal Title
 Korean Journal of Applied Statistics (응용통계연구) 
Issue Date
 2019
Abstract
 A class imbalance problem arises in binary data when one class greatly outnumbers the other. Previous studies have addressed this problem by, among other approaches, transforming the training data. In this study, we compared resampling methods, one family of techniques for handling imbalance in classification, with the aim of detecting the minority class more effectively. Through simulation, we compared a total of 20 methods spanning over-sampling, under-sampling, and combinations of the two. Logistic regression, support vector machine, and random forest models, which are commonly used in classification problems, served as classifiers. In the simulations, random under-sampling (RUS) had the highest sensitivity, with accuracy above 0.5. The next most sensitive method was adaptive synthetic sampling (ADASYN), an over-sampling approach. These results suggest that RUS is well suited to detecting minority-class observations. Results on several real data sets were similar to those of the simulation.
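The record does not include the authors' code, so the following is only a minimal sketch of the two ideas the abstract relies on: random under-sampling and sensitivity as the evaluation metric. The function names (`random_under_sample`, `sensitivity`, `accuracy`), the toy data, and the choice to balance the classes to a 1:1 ratio are all assumptions made for illustration, not details taken from the paper.

```python
import random
from collections import Counter


def random_under_sample(X, y, seed=0):
    """Randomly drop majority-class rows until both classes have equal size.

    Assumption: binary labels and a 1:1 target ratio (hypothetical choice).
    """
    rng = random.Random(seed)
    counts = Counter(y)
    minority_label = min(counts, key=counts.get)
    n_minority = counts[minority_label]
    minority_idx = [i for i, label in enumerate(y) if label == minority_label]
    majority_idx = [i for i, label in enumerate(y) if label != minority_label]
    # Keep a random subset of the majority class, the same size as the minority.
    kept_majority = rng.sample(majority_idx, n_minority)
    keep = sorted(minority_idx + kept_majority)
    return [X[i] for i in keep], [y[i] for i in keep]


def sensitivity(y_true, y_pred, positive=1):
    """Fraction of actual positives (here, the minority class) that were detected."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


def accuracy(y_true, y_pred):
    """Fraction of all observations classified correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Toy data (illustrative only): 95 majority (0) and 5 minority (1) observations.
X = [[float(i)] for i in range(100)]
y = [0] * 95 + [1] * 5

# Balanced training set produced by random under-sampling.
X_bal, y_bal = random_under_sample(X, y, seed=42)

# A classifier that always predicts the majority class looks accurate
# on the imbalanced data but never detects the minority class.
always_majority = [0] * len(y)
naive_accuracy = accuracy(y, always_majority)
naive_sensitivity = sensitivity(y, always_majority)
```

On this toy data, the always-majority predictor reaches high accuracy while its sensitivity is zero, which is why a study of this kind judges resampling methods by sensitivity rather than accuracy alone.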
Appears in Collections:
1. College of Medicine (의과대학) > Dept. of Biomedical Systems Informatics (의생명시스템정보학교실) > 1. Journal Papers
Yonsei Authors
Jung, Inkyung(정인경) ORCID: https://orcid.org/0000-0003-3780-3213