Integrative modeling of tumor genomes and epigenomes for enhanced cancer diagnosis by cell-free DNA
Authors
Mingyun Bae ; Gyuhee Kim ; Tae-Rim Lee ; Jin Mo Ahn ; Hyunwook Park ; Sook Ryun Park ; Ki Byung Song ; Eunsung Jun ; Dongryul Oh ; Jeong-Won Lee ; Young Sik Park ; Ki-Won Song ; Jeong-Sik Byeon ; Bo Hyun Kim ; Joo Hyuk Sohn ; Min Hwan Kim ; Gun Min Kim ; Eui Kyu Chie ; Hyun-Cheol Kang ; Sun-Young Kong ; Sang Myung Woo ; Jeong Eon Lee ; Jai Min Ryu ; Junnam Lee ; Dasom Kim ; Chang-Seok Ki ; Eun-Hae Cho ; Jung Kyoon Choi
Multi-cancer early detection remains a key challenge in cell-free DNA (cfDNA)-based liquid biopsy. Here, we perform cfDNA whole-genome sequencing to generate two test datasets covering 2125 patient samples of 9 cancer types and 1241 normal control samples, as well as a reference dataset for background variant filtering based on 20,529 low-depth healthy samples. An external cfDNA dataset consisting of 208 cancer and 214 normal control samples is used for additional evaluation. Our algorithm, which incorporates cancer type-specific profiles of mutation distribution and chromatin organization in tumor tissues as model references, achieves accurate cancer detection and tissue-of-origin localization. Our integrative model detects early-stage cancers, including those of pancreatic origin, with high sensitivity comparable to that of late-stage detection. Model interpretation reveals the contribution of cancer type-specific genomic and epigenomic features. Our methodologies may lay the groundwork for accurate cfDNA-based cancer diagnosis, especially at early stages.