Using gut microbiome metagenomic hypervariable features for diabetes screening and typing through supervised machine learning

Authors
 Xavier Chavarria  ;  Hyun Seo Park  ;  Singeun Oh  ;  Dongjun Kang  ;  Jun Ho Choi  ;  Myungjun Kim  ;  Yoon Hee Cho  ;  Myung-Hee Yi  ;  Ju Yeong Kim 
Citation
 MICROBIAL GENOMICS, Vol.11(3) : 001365, 2025-03 
Journal Title
MICROBIAL GENOMICS
Issue Date
2025-03
MeSH
Adult ; Bacteria* / classification ; Bacteria* / genetics ; Diabetes Mellitus, Type 1* / diagnosis ; Diabetes Mellitus, Type 1* / microbiology ; Diabetes Mellitus, Type 2* / diagnosis ; Diabetes Mellitus, Type 2* / microbiology ; Female ; Gastrointestinal Microbiome* / genetics ; Humans ; Machine Learning ; Male ; Metagenome ; Metagenomics* / methods ; Middle Aged ; RNA, Ribosomal, 16S / genetics ; Supervised Machine Learning* ; Support Vector Machine
Keywords
diabetes mellitus ; gut microbiome ; metabarcoding ; microbial markers ; random forest ; supervised machine learning
Abstract
Diabetes mellitus is a complex metabolic disorder and one of the fastest-growing global public health concerns. The gut microbiota is implicated in the pathophysiology of various diseases, including diabetes. This study utilized 16S rRNA metagenomic data from a volunteer citizen science initiative to investigate microbial markers associated with diabetes status (positive or negative) and type (type 1 or type 2 diabetes mellitus) using supervised machine learning (ML) models. The diversity of the microbiome varied according to diabetes status and type. Differential microbial signatures between the diabetes types and the negative group revealed an increased presence of Brucellaceae, Ruminococcaceae, Clostridiaceae, Micrococcaceae, Barnesiellaceae and Fusobacteriaceae in subjects with type 1 diabetes, and Veillonellaceae, Streptococcaceae and the class Gammaproteobacteria in subjects with type 2 diabetes. The decision tree, elastic net, random forest (RF) and support vector machine with radial kernel ML algorithms were trained to screen and type diabetes based on microbial profiles of 76 subjects with type 1 diabetes, 366 subjects with type 2 diabetes and 250 subjects without diabetes. Using the 1000 most variable features, tree-based models were the highest-performing algorithms. The RF screening models achieved the best performance, with an average area under the receiver operating characteristic curve (AUC) of 0.76, although all models lacked sensitivity. Reducing the dataset to 500 features produced an AUC of 0.77, with sensitivity increasing by 74% from 0.46 to 0.80. Model performance improved for the classification of negative-status and type 2 diabetes. Diabetes type models performed best with 500 features, but the metric performed poorly across all model iterations. ML has the potential to facilitate early diagnosis of diabetes based on microbial profiles of the gut microbiome.
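The feature-reduction and sensitivity figures in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not state how "most variable" was measured, so ranking by population variance is an assumption here, and the function names (`top_variable_features`, `sensitivity`, `relative_change`) and the toy abundance table are hypothetical.

```python
from statistics import pvariance


def top_variable_features(table, k):
    """Return the column indices of the k highest-variance features.

    `table` is a list of samples, each a list of feature abundances.
    Variance as the variability measure is an assumption for illustration.
    """
    n_features = len(table[0])
    variances = [pvariance([row[j] for row in table]) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:k])


def sensitivity(y_true, y_pred, positive=1):
    """Sensitivity (recall of the positive class): TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else float("nan")


def relative_change(old, new):
    """Fractional change from old to new, e.g. 0.5 means a 50% increase."""
    return (new - old) / old


# Hypothetical toy table: 3 samples x 3 features (not data from the study).
toy_table = [
    [0.10, 0.50, 0.00],
    [0.10, 0.10, 0.30],
    [0.10, 0.90, 0.60],
]
selected = top_variable_features(toy_table, k=2)

# Sensitivity values reported in the abstract (1000 vs 500 features).
reported_gain = relative_change(0.46, 0.80)
```

With the reported sensitivities, `relative_change(0.46, 0.80)` evaluates to roughly 0.74, which matches the 74% relative increase quoted in the abstract.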
DOI
10.1099/mgen.0.001365
Appears in Collections:
1. College of Medicine (의과대학) > Dept. of Tropical Medicine (열대의학교실) > 1. Journal Papers
Yonsei Authors
Kim, Ju Yeong(김주영) https://orcid.org/0000-0003-2456-6298
Yi, Myung Hee(이명희) https://orcid.org/0000-0001-9537-5726
Cho, Yoon Hee(조윤희)
Choi, Jun Ho(최준호) https://orcid.org/0000-0002-7416-3377
URI
https://ir.ymlib.yonsei.ac.kr/handle/22282913/205924