A Prediction Model for Cataract Risk Groups Using Random Forests on Health Screening Data

Other Titles: Screening test data analysis for cataract happening prediction model using Random forests

Authors: 한은정

Issue Date: 2005

Description: Interdisciplinary Program in Medical Computational Statistics, Medical Computational Statistics major / Master's degree

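Abstract: [Korean]

Cataract is emerging as a serious social and economic problem as the elderly population grows with the development of the national economy and culture, and its incidence could be reduced greatly if early diagnosis were carried out using health screening data. This thesis studies the construction of a prediction model for the cataract risk group, and the identification of risk factors, based on the screening test data of 62,555 people who received health screening at Severance Hospital from 1994 to 2001. The cataract risk group could be analyzed from two perspectives. First, examinees whose screening data indicated a risk of developing cataract later were defined as the cataract risk group, and a prediction model was fitted and risk factors were identified for this group. Second, among these, examinees who visited Severance again and actually had cataract were defined as the cataract disease group, and a cataract disease prediction model was fitted and its risk factors were identified. The Random Forests technique was used to fit the prediction models, and its performance was compared with existing data mining techniques: logistic regression, discriminant analysis, Decision Tree, NaiveBayes, Bagging, and Arcing.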

The cataract risk group prediction model using Random Forests had an accuracy of 64.7% and a sensitivity of 53.31%. The risk factors were age, albumin, AST, creatinine, Ca, Cl, and others, so that every body-related item could be a risk factor.

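The cataract disease prediction model had an accuracy of 67.16% and a sensitivity of 72.28%, and the risk factors were age, glucose, WBC, platelet (platelet count), triglyceride, and BMI. This result shows that the prediction model can predict the presence or absence of cataract disease at about 70% from health screening data alone, without a physician's diagnosis.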

[English]

Cataract is becoming a serious social and economic problem as the elderly population grows with national economic and cultural development. Its incidence, however, can be reduced sharply through early diagnosis using health screening data. This thesis builds a prediction model for the cataract risk group and identifies risk factors, based on screening data collected from people who underwent screening tests from 1994 to 2001.

The cataract risk group was analyzed in two ways. First, a possible cataract risk group was defined in order to fit a cataract risk prediction model from the screening test data. Second, from this group, a cataract disease group was defined as those who actually had cataract, in order to fit a cataract disease prediction model and to identify risk factors.

The prediction models were fitted with the random forests technique, and their performance was compared with existing data mining methods: logistic regression, discriminant analysis, Decision Tree, NaiveBayes, Bagging, and Arcing.

The random forests model for the cataract risk group had an accuracy of 64.7% and a sensitivity of 53.31%. The risk factors were age, albumin, AST, creatinine, Ca, Cl, and so on; that is, any body-related item could be a risk factor.

The cataract disease prediction model had an accuracy of 67.16% and a sensitivity of 72.68%, and the risk factors were age, albumin, WBC, platelet, triglyceride, and BMI.

This result shows that the presence of cataract disease can be predicted at about 70% from screening test data.
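The abstract reports each model by its accuracy and sensitivity. As a reminder of what those two figures measure, here is a minimal Python sketch that computes them from the four cells of a binary confusion matrix. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the thesis, which does not give its confusion matrices in this record.

```python
def accuracy_and_sensitivity(tp, fn, fp, tn):
    """Return (accuracy, sensitivity) from binary confusion-matrix counts.

    tp: cases with the disease that the model flagged
    fn: cases with the disease that the model missed
    fp: cases without the disease that the model flagged
    tn: cases without the disease that the model cleared
    """
    total = tp + fn + fp + tn
    # Accuracy: share of all cases classified correctly.
    accuracy = (tp + tn) / total
    # Sensitivity (recall): share of true disease cases that were flagged.
    sensitivity = tp / (tp + fn)
    return accuracy, sensitivity


# Hypothetical counts for illustration only (not from the thesis).
acc, sens = accuracy_and_sensitivity(tp=30, fn=10, fp=20, tn=40)
print(f"accuracy={acc:.2%}, sensitivity={sens:.2%}")
```

The two measures can diverge: a model can have moderate accuracy overall while missing many true cases, which is why a screening model is usually judged on sensitivity as well as accuracy.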

Files in This Item: T008582.pdf

Appears in Collections: 1. College of Medicine > Others > 2. Thesis

URI: https://ir.ymlib.yonsei.ac.kr/handle/22282913/136847
