Application and Evaluation of a Common Data Model to Multicenter Workers' Health Examination Data

Other Titles: 다기관 특수건강검진 데이터의 공통데이터 모델(Common Data Model) 적용 및 평가

Authors: 심주호

College: College of Medicine (의과대학)

Department: Others

Degree: Doctoral (박사)

Issue Date: 2023-02

Abstract:

Introduction: As the volume of data in the medical and pharmaceutical fields in Korea has grown, studies that use real-world data (RWD), known as real-world evidence (RWE) research, have become increasingly common. The US uses the Observational Medical Outcomes Partnership Common Data Model (OMOP-CDM) for research on medical data, and in Korea the OMOP-CDM is used in fields such as medical devices, colorectal cancer, and drug research. Big-data research on workers' health examination data is also needed, but it has been difficult because each institution stores its data in a different form. This study therefore examined whether workers' health examination data can be converted to a CDM and used for research.

Methods: Data from January 2015 to December 2017 were collected from three university hospitals. A data dictionary was prepared according to the electronic medical record (EMR) status of each institution, and the data were mapped to it. During mapping, the values entered at each institution were converted to a common value range defined by the data dictionary, and the variable names were unified. Empirical studies were then conducted in two groups: studies using questionnaires and studies using exposure to harmful factors. Regression analysis, survival analysis, and meta-analysis were performed, as appropriate to the analytical method of each empirical study.

Results: For the questionnaire section, a mapping table was prepared for each institution in three parts: general questionnaire, night questionnaire, and special examination. The general questionnaire was divided into five areas with 85 variables in total, the night questionnaire into five areas with 64 variables, and the special examination into 11 areas with 40 variables. For the examination results section, the mapping table was divided into physical measurements (99 variables) and clinical tests (231 variables). The empirical studies using questionnaires examined consecutive night work days and insomnia, shift interval and insomnia, and insomnia and constipation. The empirical studies using exposure to harmful factors examined noise exposure and fasting blood sugar, noise exposure and hypertension, and dust exposure and diabetes. Six papers from these empirical studies were published in journals.

Discussion: Because there is no official standard code for the workers' health examination questionnaire, a comprehensive data dictionary covering all variables was created for this study, and newly generated codes were used. For questionnaire variables, 79.4% were mapped at Severance Hospital, 68.2% at Ulsan University Hospital, and 66.1% at Wonju Severance Hospital. For harmful-exposure variables, 76.1% were mapped at Severance Hospital, 76.1% at Ulsan University Hospital, and 37.0% at Wonju Severance Hospital. Wonju Severance Hospital had a low mapping rate because it provided only a small number of variables for this study; its mapping rate is expected to rise if the variables are expanded. Although some information was lacking in the studies using questionnaires and exposure to harmful factors, the workers' health examination data converted to a CDM were confirmed to have high validity and to be applicable to research. Because a CDM standardizes additionally loaded data in the same way, accurate results on a variety of topics can be obtained if more institutions participate and the completeness of the survey responses improves.

Conclusion: This study showed that workers' health examination data can be studied using a CDM. However, parts of the current workers' health examination data need to be supplemented. If these gaps are addressed and distributed big data are built, better research will be possible. In addition, if representative medical institutions in each region of Korea conduct studies with this CDM, it will be possible to characterize everyone who undergoes workers' health examinations in Korea.
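The mapping step described in the Methods (recoding each institution's values to a common range and unifying variable names through a data dictionary) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the institution keys, variable names, codes, and the `map_record` and `mapping_rate` helpers are hypothetical and are not taken from the thesis, and the abstract does not state how the mapping rate was calculated, so the definition used here (share of standard variables covered by an institution's dictionary) is only one plausible reading.

```python
from typing import Any

# Hypothetical data dictionary (illustrative only, not the study's dictionary).
# For each institution: local variable name -> (standard name, value recoding table).
DATA_DICTIONARY: dict[str, dict[str, tuple[str, dict[Any, Any]]]] = {
    "hospital_a": {
        "smk_yn": ("smoking_status", {"Y": 1, "N": 0}),
        "night_days": ("consecutive_night_shifts", {}),
    },
    "hospital_b": {
        "smoking": ("smoking_status", {1: 1, 2: 0}),
    },
}


def map_record(institution: str, record: dict[str, Any]):
    """Return (harmonized record, list of local variable names with no mapping)."""
    rules = DATA_DICTIONARY[institution]
    mapped: dict[str, Any] = {}
    unmapped: list[str] = []
    for local_name, value in record.items():
        if local_name not in rules:
            unmapped.append(local_name)
            continue
        standard_name, recode = rules[local_name]
        # Values absent from the recoding table are passed through unchanged.
        mapped[standard_name] = recode.get(value, value)
    return mapped, unmapped


def mapping_rate(institution: str, standard_variables: list[str]) -> float:
    """Assumed definition: percentage of standard variables the institution covers."""
    covered = {standard for standard, _ in DATA_DICTIONARY[institution].values()}
    return 100.0 * len(covered & set(standard_variables)) / len(standard_variables)
```

With this layout, adding an institution only requires a new dictionary entry; the harmonized records from all sites then share the same variable names and value codes.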

Files in This Item: T015743.pdf

Appears in Collections: 1. College of Medicine (의과대학) > Others (기타) > 3. Dissertation

URI: https://ir.ymlib.yonsei.ac.kr/handle/22282913/197017
