Modified k-NN approaches for multi class classification

Authors: 이현영

Issue Date: 2015

Description: College of Medicine / Ph.D.

Abstract: Multi-class classification suffers from several problems that are difficult to isolate and that reduce performance, and many researchers have tried to address them. In terms of computation speed, high dimensionality and applicability to data, a non-parametric approach is more suitable for multi-class classification. In this study, the k-nearest neighbor (k-NN) learning algorithm was used, and we tried to further improve its performance on problems with a high tie probability, small sample size and unequal class distribution. Furthermore, we attempted to clarify disease susceptibility with multi-labeling. We therefore propose two modifications: a weighted similarity, which accounts for each predictor's strength (PS) using the mutual information between the predictor and the true class, and a distance-weighted voting system, which accounts for the individual distance (ID) of each of the k nearest neighbors and thereby allows for a distance ratio. Regarding disease susceptibility, we introduce a pending region for multiple labeling. Gower's distance was applied to k-NN. The proposed methods were compared with support vector machine (SVM) and linear discriminant analysis (LDA).
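The abstract does not give the formulas behind PS and ID, so the following is only a minimal sketch of how the three ingredients could fit together, under stated assumptions: predictor weights are taken as the empirical mutual information between each (discretized) predictor and the class, the distance is a weighted Gower distance, and neighbors vote with inverse-distance weights. The function names, the quantile binning and the inverse-distance rule are illustrative choices, not the dissertation's definitions.

```python
import numpy as np

def mutual_information(x, y):
    """Empirical mutual information (in nats) between two discrete arrays."""
    x, y = np.asarray(x), np.asarray(y)
    mi = 0.0
    for xv in np.unique(x):
        px = np.mean(x == xv)
        for yv in np.unique(y):
            py = np.mean(y == yv)
            pxy = np.mean((x == xv) & (y == yv))
            if pxy > 0:
                mi += pxy * np.log(pxy / (px * py))
    return mi

def discretize(x, bins=2):
    """Assumption: continuous predictors are cut into quantile bins before MI."""
    edges = np.quantile(x, np.linspace(0, 1, bins + 1)[1:-1])
    return np.digitize(x, edges)

def predictor_strength(X, y, is_cat, bins=2):
    """One weight per predictor; falls back to equal weights if all MI are zero."""
    w = np.array([
        mutual_information(col if cat else discretize(col, bins), y)
        for col, cat in zip(X.T, is_cat)
    ])
    return w if w.sum() > 0 else np.ones_like(w)

def gower_distance(a, b, is_cat, ranges, weights):
    """Weighted Gower distance: mismatch for categorical columns,
    range-scaled absolute difference for numeric columns."""
    d = np.where(is_cat, (a != b).astype(float), np.abs(a - b) / ranges)
    return float(np.sum(weights * d) / np.sum(weights))

def knn_predict(X, y, query, k, is_cat, weights):
    """k-NN with inverse-distance voting (an assumed stand-in for ID voting)."""
    ranges = np.where(is_cat, 1.0, X.max(axis=0) - X.min(axis=0))
    ranges[ranges == 0] = 1.0  # guard against constant numeric columns
    dists = np.array([gower_distance(row, query, is_cat, ranges, weights)
                      for row in X])
    votes = {}
    for i in np.argsort(dists)[:k]:
        # Closer neighbors count more, which makes exact vote ties unlikely.
        votes[y[i]] = votes.get(y[i], 0.0) + 1.0 / (dists[i] + 1e-9)
    return max(votes, key=votes.get)

# Toy data: one numeric and one categorical (0/1 coded) predictor.
X = np.array([[0.0, 0], [0.1, 0], [0.2, 0], [1.0, 1], [0.9, 1], [0.8, 1]])
y = np.array([0, 0, 0, 1, 1, 1])
is_cat = np.array([False, True])
w = predictor_strength(X, y, is_cat)
print(knn_predict(X, y, np.array([0.15, 0.0]), k=3, is_cat=is_cat, weights=w))
```

Because each neighbor contributes a real-valued weight rather than a single count, two classes rarely receive exactly the same total, which is the intuition behind using distance-weighted voting to reduce ties.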

Sixty-four simulation sets were constructed by varying sample size, combinations of coefficients, correlation strength, inequality of class distribution and number of predictors. The CREDOS (Clinical Research Center for Dementia of South Korea) study data set was used to evaluate the pending region for clarifying disease susceptibility. The proposed methods, i.e., PS and ID, improved k-NN performance and obtained better results than SVM and LDA. Furthermore, ID markedly reduced the probability of tie instances, narrowing the gap between accuracy and recall. In the CREDOS study data set, k-NN with PS+ID outperformed SVM and LDA. With pending regions of 0.0%, 2.5%, 5.0%, 7.5% and 10.0%, recall showed a marked elevation that did not exceed 0.40. As a result, we obtained five labeling sets, namely AD, MCI+AD, MCI, SMI+(MCI or AD) and SMI, that reflected disease susceptibility to AD. Disease susceptibility showed a significant association with true disease status and with other clinical assessments that were not included in the classification model.
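The abstract does not define the pending region precisely. One possible reading, sketched here purely as an assumption, is that a case receives two labels whenever the vote shares of its two leading classes differ by no more than the pending width. The function name, the vote-share input and the label ordering are hypothetical and only illustrate the idea.

```python
def label_with_pending_region(vote_shares, pending, order):
    """Assign one label, or two joined by '+', when the two leading classes
    fall within the pending width of each other.

    vote_shares: dict mapping class name to its share of the (weighted) vote
    pending:     width of the pending region, e.g. 0.05 for 5.0%
    order:       class names in display order, e.g. ["SMI", "MCI", "AD"]
    """
    ranked = sorted(vote_shares, key=vote_shares.get, reverse=True)
    top, second = ranked[0], ranked[1]
    if vote_shares[top] - vote_shares[second] <= pending:
        # Too close to call: keep both classes as a multi-label.
        return "+".join(sorted([top, second], key=order.index))
    return top

order = ["SMI", "MCI", "AD"]
shares = {"SMI": 0.10, "MCI": 0.44, "AD": 0.46}
print(label_with_pending_region(shares, 0.05, order))  # MCI+AD
print(label_with_pending_region(shares, 0.0, order))   # AD
```

Under this reading, widening the pending region moves more borderline cases into combined labels such as MCI+AD, which is how a graded set of labels could arise from a three-class problem.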

The modified k-NN, i.e., weighted similarity combined with a distance-weighted voting system, can improve multi-class classification ability, and it showed results comparable to those of LDA and SVM. Introducing pending regions may help in detecting disease susceptibility and may offer clues about disease progression.

Appears in Collections: 1. College of Medicine (의과대학) > Others (기타) > 3. Dissertation

URI: https://ir.ymlib.yonsei.ac.kr/handle/22282913/148736
