SMOTE-augmented machine learning model predicts recurrent and metastatic breast cancer from microbiome analysis

Authors
 Hong, Ji Eun  ;  Kim, Yeon Eun  ;  Kang, Yun Soo  ;  Choi, Dong Hyeok  ;  Ahn, So Hyun  ;  An, Jeongshin 
Citation
 SCIENTIFIC REPORTS, Vol.15(1), 2025-09 
Article Number
 33096 
Journal Title
SCIENTIFIC REPORTS
Issue Date
2025-09
MeSH
Adult ; Aged ; Breast Neoplasms* / blood ; Breast Neoplasms* / diagnosis ; Breast Neoplasms* / microbiology ; Breast Neoplasms* / pathology ; Female ; Humans ; Machine Learning* ; Microbiota* / genetics ; Middle Aged ; Neoplasm Metastasis ; Neoplasm Recurrence, Local* / microbiology ; Prognosis ; RNA, Ribosomal, 16S / genetics ; ROC Curve ; Retrospective Studies
Keywords
Breast cancer ; Recurrence ; Metastasis ; Microbiome ; Machine learning
Abstract
Recurrence and metastasis of breast cancer (RMBC) have a decisive impact on patient survival, necessitating reliable biomarkers for its early prediction. This study used machine learning to evaluate blood microbiome profiles as predictive biomarkers of RMBC. A retrospective predictive analysis was conducted on 288 participants, including 96 patients with breast cancer and 192 healthy controls. After 7 years of follow-up, patients were classified into disease-free survival (DFS, n = 88) and RMBC (n = 8) groups. Blood microbiome composition was analysed using 16S rRNA sequencing, followed by quality control. The Synthetic Minority Oversampling Technique (SMOTE) was employed to address class imbalance. Eleven machine learning models were trained using leave-one-out cross-validation (LOOCV) and k-fold cross-validation, and evaluated based on the area under the receiver operating characteristic curve (AUROC), recall, precision, F1-score, and Matthews correlation coefficient (MCC). Alpha diversity was significantly lower in DFS and RMBC groups than in the healthy control group (p < 0.05), while beta diversity analysis revealed distinct clustering. The random forest achieved an AUROC of 0.94, a recall of 0.81, an F1-score of 0.83, and an MCC of 0.88. Enterobacter, Bacteroides, Klebsiella, and Bifidobacterium were among the key microbial genera predicting RMBC in the top five models. Blood microbiome profiling shows potential as a non-invasive RMBC biomarker. Machine learning effectively distinguished RMBC, warranting further validation.
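As an illustration of the oversampling step named in the abstract, the following is a minimal sketch of the SMOTE idea in plain Python: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. It is not the authors' implementation. The function name `smote`, its parameters, and the toy feature values are assumptions made for this example; only the group sizes (88 DFS, 8 RMBC) come from the record above.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (Euclidean distance).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than peers
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the base sample.
        others = [m for j, m in enumerate(minority) if j != i]
        others.sort(key=lambda m: math.dist(base, m))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data: 8 minority samples with two made-up relative-abundance features.
toy_minority = [
    [0.10, 0.50], [0.20, 0.40], [0.15, 0.45], [0.30, 0.60],
    [0.25, 0.55], [0.12, 0.48], [0.22, 0.52], [0.18, 0.42],
]

# 88 DFS vs 8 RMBC: 80 synthetic samples would balance the two classes.
toy_synthetic = smote(toy_minority, n_synthetic=88 - 8, k=3, seed=1)
```

When oversampling is combined with cross-validation, a common precaution is to generate synthetic samples from the training portion of each fold only, so that no held-out sample influences the points the model is trained on. The record does not state how this was handled in the study.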
Files in This Item:
90719.pdf Download
DOI
10.1038/s41598-025-16790-z
Appears in Collections:
1. College of Medicine > Others > 1. Journal Papers
URI
https://ir.ymlib.yonsei.ac.kr/handle/22282913/209796
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
