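Comparison of resampling methods for dealing with imbalanced data in binary classification problem (이분형 자료의 분류문제에서 불균형을 다루기 위한 표본재추출 방법 비교)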

박근우; 정인경

doi:10.5351/KJAS.2019.32.3.349

YUHSpace

DC Field	Value	Language
dc.contributor.author	정인경	-
dc.date.accessioned	2019-09-20T07:53:54Z	-
dc.date.available	2019-09-20T07:53:54Z	-
dc.date.issued	2019	-
dc.identifier.issn	1225-066X	-
dc.identifier.uri	https://ir.ymlib.yonsei.ac.kr/handle/22282913/171105	-
dc.description.abstract	A class imbalance problem arises in binary data when one class outnumbers the other by a large proportion. Approaches such as transforming the training data have been studied to address this problem. Among the methods for dealing with imbalance in classification, this study compares resampling methods, with the aim of detecting the minority class more effectively. A total of 20 methods, covering over-sampling, under-sampling, and combined over- and under-sampling, were compared through simulation. Logistic regression, support vector machine, and random forest models, which are commonly used in classification problems, served as classifiers. The simulation results showed that the random under-sampling (RUS) method had the highest sensitivity with an accuracy above 0.5. The next most sensitive method was the over-sampling method adaptive synthetic sampling (ADASYN). These results indicate that RUS is suitable for finding minority class values. Results from applying the methods to several real data sets were similar to those of the simulation.	-
dc.description.statementOfResponsibility	restriction	-
dc.format	application/pdf	-
dc.language	Korean	-
dc.publisher	The Korean Statistical Society (한국통계학회)	-
dc.relation.isPartOf	Korean Journal of Applied Statistics (응용통계연구)	-
dc.rights	CC BY-NC-ND 2.0 KR	-
dc.title	이분형 자료의 분류문제에서 불균형을 다루기 위한 표본재추출 방법 비교	-
dc.title.alternative	Comparison of resampling methods for dealing with imbalanced data in binary classification problem	-
dc.type	Article	-
dc.contributor.college	College of Medicine (의과대학)	-
dc.contributor.department	Dept. of Biomedical Systems Informatics (의생명시스템정보학교실)	-
dc.contributor.googleauthor	박근우	-
dc.contributor.googleauthor	정인경	-
dc.identifier.doi	10.5351/KJAS.2019.32.3.349	-
dc.contributor.localId	A03693	-
dc.relation.journalcode	J01964	-
dc.identifier.url	http://kiss.kstudy.com/thesis/thesis-view.asp?key=3687102	-
dc.contributor.alternativeName	Jung, In Kyung	-
dc.contributor.affiliatedAuthor	정인경	-
dc.citation.volume	32	-
dc.citation.number	3	-
dc.citation.startPage	349	-
dc.citation.endPage	374	-
dc.identifier.bibliographicCitation	Korean Journal of Applied Statistics (응용통계연구), Vol.32(3) : 349-374, 2019	-
dc.identifier.rimsid	64528	-
dc.type.rims	ART	-

Appears in Collections: 1. College of Medicine (의과대학) > Dept. of Biomedical Systems Informatics (의생명시스템정보학교실) > 1. Journal Papers
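
The abstract singles out random under-sampling (RUS) as the most sensitive method. As a rough illustration of the idea, and not the authors' implementation, the sketch below drops majority-class rows at random until both classes have the same size and defines sensitivity as the true-positive rate. The function names `random_under_sample` and `sensitivity`, the 1:1 target ratio, and the toy data are all assumptions made for this example.

```python
import numpy as np

def random_under_sample(X, y, seed=0):
    """Randomly drop rows so every class has as many rows as the smallest one.

    Assumption: a 1:1 target ratio; the paper may use a different ratio.
    """
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    # For each class, pick n_min row indices without replacement.
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
        for c in classes
    ])
    keep.sort()  # preserve the original row order
    return X[keep], y[keep]

def sensitivity(y_true, y_pred, positive=1):
    """True-positive rate: share of actual positives predicted as positive."""
    pos = y_true == positive
    return float(np.mean(y_pred[pos] == positive))

# Toy data (hypothetical): 95 majority rows (0) and 5 minority rows (1).
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 2))
y = np.array([0] * 95 + [1] * 5)

X_bal, y_bal = random_under_sample(X, y)
print(np.bincount(y_bal))  # [5 5]
```

Only the training data would be resampled in this way; sensitivity and accuracy are then measured on untouched test data, so that the reported figures reflect the original class proportions.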
