Nonalcoholic fatty liver disease and early prediction of gestational diabetes mellitus using machine learning methods
Authors
Seung Mi Lee ; Suhyun Hwangbo ; Errol R Norwitz ; Ja Nam Koo ; Ig Hwan Oh ; Eun Saem Choi ; Young Mi Jung ; Sun Min Kim ; Byoung Jae Kim ; Sang Youn Kim ; Gyoung Min Kim ; Won Kim ; Sae Kyung Joo ; Sue Shin ; Chan-Wook Park ; Taesung Park ; Joong Shin Park
Citation
CLINICAL AND MOLECULAR HEPATOLOGY, Vol.28(1) : 105-116, 2022-01
Background/aims: To develop an early prediction model for gestational diabetes mellitus (GDM) using machine learning and to evaluate whether the inclusion of nonalcoholic fatty liver disease (NAFLD)-associated variables increases the performance of the model.
Methods: This prospective cohort study evaluated pregnant women for NAFLD using ultrasound at 10-14 weeks and screened them for GDM at 24-28 weeks of gestation. Clinical variables obtained before 14 weeks were used to develop prediction models for GDM (setting 1, conventional risk factors; setting 2, addition of new risk factors in recent guidelines; setting 3, addition of routine clinical variables; setting 4, addition of NAFLD-associated variables, including the presence of NAFLD and laboratory results; and setting 5, the top 11 variables identified from a stepwise variable selection method). The predictive models were constructed using machine learning methods, including logistic regression, random forest, support vector machine, and deep neural networks.
Results: Among 1,443 women, 86 (6.0%) were diagnosed with GDM. The highest performing prediction model among settings 1-4 was setting 4, which included both clinical and NAFLD-associated variables (area under the receiver operating characteristic curve [AUC] 0.563-0.697 in settings 1-3 vs. 0.740-0.781 in setting 4). Setting 5, with the top 11 variables (which included NAFLD and hepatic steatosis index), showed predictive power similar to that of setting 4 (AUC 0.719-0.819 in setting 5; the difference between settings 4 and 5 was not statistically significant).
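For readers less familiar with the AUC values reported above, the following is a minimal sketch of how an AUC can be computed from predicted risk scores. It is not the authors' code: the function name `roc_auc` and the toy labels and scores are assumptions made purely for illustration and do not come from the study data. The sketch uses the rank interpretation of AUC, that is, the probability that a randomly chosen GDM case receives a higher predicted score than a randomly chosen non-case, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """Return the AUC for binary labels (1 = case, 0 = non-case).

    Uses the pairwise (Mann-Whitney) definition: the fraction of
    case/non-case pairs in which the case has the higher score,
    counting tied scores as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 3 cases among 8 women.
labels = [0, 0, 1, 0, 1, 1, 0, 0]
scores = [0.10, 0.35, 0.40, 0.20, 0.80, 0.30, 0.05, 0.50]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation of cases from non-cases, which is the scale on which the ranges reported for settings 1-5 should be read. The pairwise loop is quadratic in the number of women; for a cohort of this size that is acceptable, though a rank-based implementation would be the usual choice for larger data.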
Conclusion: We developed an early prediction model for GDM using machine learning. The inclusion of NAFLD-associated variables significantly improved the performance of GDM prediction. (ClinicalTrials.gov Identifier: NCT02276144).