Machine Learning Model for Predicting Postoperative Survival of Patients with Colorectal Cancer

 Mohamed Hosny Osman  ;  Reham Hosny Mohamed  ;  Hossam Mohamed Sarhan  ;  Eun Jung Park  ;  Seung Hyuk Baik  ;  Kang Young Lee  ;  Jeonghyun Kang 
 CANCER RESEARCH AND TREATMENT, Vol.54(2) : 517-524, 2022-04 
Colorectal Neoplasms* / pathology ; Humans ; Machine Learning* ; Predictive Value of Tests ; ROC Curve ; Survival Rate
Area under the curve ; Colorectal neoplasms ; LightGBM ; Machine learning ; Mortality ; SEER
Purpose: Machine learning (ML) is a strong candidate for making accurate predictions because it applies powerful computational algorithms to large amounts of data. We developed an ML-based model to predict the survival of patients with colorectal cancer (CRC) using data from two independent datasets.

Materials and methods: A total of 364,316 and 1,572 CRC patients were included from the Surveillance, Epidemiology, and End Results (SEER) dataset and a Korean dataset, respectively. Because SEER combines data from 18 cancer registries, internal validation was done using 18-fold cross-validation; external validation was then performed by testing the trained model on the Korean dataset. Performance was evaluated using the area under the receiver operating characteristic curve (AUROC), sensitivity, and positive predictive value.
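
The validation scheme and metrics named above can be illustrated with a minimal sketch. It assumes that each of the 18 folds holds out one registry, which the abstract suggests but does not state, and it uses only the Python standard library. The function names, the registry labels "A" to "C", and the toy labels and scores are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def leave_one_group_out(groups):
    """Yield (held_out_group, train_idx, test_idx), one fold per distinct group."""
    by_group = defaultdict(list)
    for i, g in enumerate(groups):
        by_group[g].append(i)
    for g in sorted(by_group):
        test_idx = by_group[g]
        train_idx = [i for h in sorted(by_group) if h != g for i in by_group[h]]
        yield g, train_idx, test_idx


def auroc(y_true, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_and_ppv(y_true, y_pred, positive):
    """Sensitivity and positive predictive value, treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sens = tp / (tp + fn) if tp + fn else float("nan")
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    return sens, ppv


# Toy data (hypothetical): three registries instead of 18, six patients.
registries = ["A", "A", "B", "B", "C", "C"]
y_true = [1, 0, 1, 1, 0, 0]              # 1 = alive at 5 years, 0 = dead
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]  # predicted probability of survival
y_pred = [1 if s >= 0.5 else 0 for s in scores]

folds = list(leave_one_group_out(registries))
toy_auroc = auroc(y_true, scores)
sens_alive, ppv_alive = sensitivity_and_ppv(y_true, y_pred, positive=1)
sens_dead, ppv_dead = sensitivity_and_ppv(y_true, y_pred, positive=0)
```

In a real pipeline the model would be refit on `train_idx` and scored on `test_idx` in every fold; the sketch computes the metrics once on the toy data only to show how each quantity is defined.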

Results: Clinicopathological characteristics differed significantly between the two datasets, and the SEER dataset showed a significantly lower 5-year survival rate than the Korean dataset (60.1% vs. 75.3%, p < 0.001). The ML-based model, which used the light gradient boosting machine (LightGBM) algorithm, predicted 5-year survival better than American Joint Committee on Cancer (AJCC) stage (AUROC, 0.804 vs. 0.736; p < 0.001). The features that most influenced model performance were age, number of examined lymph nodes, and tumor size. In the validation set, sensitivity for predicting 5-year survival was 68.14% for the dead class and 77.51% for the alive class, and the positive predictive value was 49.88% and 88.1%, respectively. Survival probability can be checked using the web-based survival predictor (http://colorectalcancer.pythonanywhere.com).
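
The per-class figures can be cross-checked with Bayes' rule. The sketch below reads the four percentages as sensitivity (dead, alive) followed by positive predictive value (dead, alive), and assumes the validation set is the Korean dataset with its reported 75.3% 5-year survival; both are assumptions about the abstract, not statements from it. The function name `ppv_from_rates` is hypothetical.

```python
def ppv_from_rates(sens_target, sens_other, prevalence_target):
    """Positive predictive value of the target class in a two-class problem.

    `sens_other` is the sensitivity of the other class, which equals the
    specificity with respect to the target class.
    """
    true_pos = sens_target * prevalence_target
    false_pos = (1.0 - sens_other) * (1.0 - prevalence_target)
    return true_pos / (true_pos + false_pos)


# Figures quoted in the abstract; the pairing with classes is an assumption.
P_ALIVE = 0.753      # 5-year survival in the Korean dataset
SENS_DEAD = 0.6814   # sensitivity for the dead class
SENS_ALIVE = 0.7751  # sensitivity for the alive class

ppv_dead = ppv_from_rates(SENS_DEAD, SENS_ALIVE, 1.0 - P_ALIVE)
ppv_alive = ppv_from_rates(SENS_ALIVE, SENS_DEAD, P_ALIVE)

print(f"PPV dead:  {ppv_dead:.4f}")
print(f"PPV alive: {ppv_alive:.4f}")
```

Under these assumptions the computed values come out near 0.50 and 0.88, which agrees with the reported 49.88% and 88.1% to within the rounding of the survival rate. The low positive predictive value for the dead class follows from its lower prevalence in the validation set, not from poor sensitivity.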

Conclusion: The ML-based model achieved much better performance than staging in the individualized estimation of survival of patients with CRC.
Yonsei Authors
Kang, Jeonghyun(강정현) https://orcid.org/0000-0001-7311-6053
Park, Eun Jung(박은정) https://orcid.org/0000-0002-4559-2690
Baik, Seung Hyuk(백승혁) https://orcid.org/0000-0003-4183-2332
Lee, Kang Young(이강영)