Development of a program for in silico optimized selection of oligonucleotide-based molecular barcodes

Yang, In Seok; Bae, Sang Won; Park, BeumJin; Kim, Sangwoo

doi:10.1371/journal.pone.0246354


DC Field	Value	Language
dc.contributor.author	Yang, In Seok	-
dc.contributor.author	Bae, Sang Won	-
dc.contributor.author	Park, BeumJin	-
dc.contributor.author	Kim, Sangwoo	-
dc.date.accessioned	2021-12-28T17:29:29Z	-
dc.date.available	2021-12-28T17:29:29Z	-
dc.date.created	2021-08-25	-
dc.date.issued	2021-02	-
dc.identifier.issn	1932-6203	-
dc.identifier.uri	https://ir.ymlib.yonsei.ac.kr/handle/22282913/187125	-
dc.description.abstract	Short DNA oligonucleotides (~4 mer) have been used to index samples from different sources, such as in multiplex sequencing. Presently, longer oligonucleotides (8-12 mer) are used as molecular barcodes to distinguish among raw DNA molecules in many high-tech sequence analyses, including low-frequency mutation detection, quantitative transcriptome analysis, and single-cell sequencing. Although molecular barcodes with random sequences have some advantages, this approach makes it impossible to know the exact sequences used in an experiment and can lead to inaccurate interpretation when unexpected mutations in the barcodes cause them to be misclustered. The present study introduces a tool for selecting an optimal barcode subset for molecular barcoding. The program considers five barcode factors: GC content, homopolymers, simple sequence repeats with dinucleotide repeat units, Hamming distance, and complementarity between barcodes. To evaluate a selected barcode set, penalty scores for the factors are defined based on their distributions observed in random barcodes. The algorithm comprises two steps: i) random generation of an initial set and ii) optimal barcode selection via iterative replacement. Users run the program by specifying the barcode length and the number of barcodes to generate. For advanced use, the program also accepts user-supplied values for other parameters, including penalty scores, so that it can be applied under various conditions. In many test runs generating 100,000 barcodes of 12 nucleotides each, the program was fast enough to produce optimal barcode sequences on a desktop PC alone. We also showed that VFOS offers comparable performance, flexibility in program running, consideration of simple sequence repeats, and fast computation time in comparison with two other tools (DNABarcodes and FreeBarcodes). Owing to the versatility and fast performance of the program, we expect that many researchers will apply it to select optimal barcode sets for their experiments, including next-generation sequencing.	-
dc.description.statementOfResponsibility	open	-
dc.language	English	-
dc.publisher	Public Library of Science	-
dc.relation.isPartOf	PLOS ONE	-
dc.rights	CC BY-NC-ND 2.0 KR	-
dc.title	Development of a program for in silico optimized selection of oligonucleotide-based molecular barcodes	-
dc.type	Article	-
dc.contributor.college	College of Medicine (의과대학)	-
dc.contributor.department	Dept. of Biomedical Systems Informatics (의생명시스템정보학교실)	-
dc.contributor.googleauthor	Yang, In Seok	-
dc.contributor.googleauthor	Bae, Sang Won	-
dc.contributor.googleauthor	Park, BeumJin	-
dc.contributor.googleauthor	Kim, Sangwoo	-
dc.identifier.doi	10.1371/journal.pone.0246354	-
dc.relation.journalcode	J02540	-
dc.identifier.eissn	1932-6203	-
dc.contributor.alternativeName	Kim, Sang Woo	-
dc.contributor.affiliatedAuthor	Yang, In Seok	-
dc.contributor.affiliatedAuthor	Park, BeumJin	-
dc.contributor.affiliatedAuthor	Kim, Sangwoo	-
dc.identifier.scopusid	2-s2.0-85101375640	-
dc.identifier.wosid	000620625100038	-
dc.citation.volume	16	-
dc.citation.number	2	-
dc.identifier.bibliographicCitation	PLOS ONE, Vol.16(2), 2021-02	-
dc.identifier.rimsid	71273	-
dc.type.rims	ART	-
dc.description.journalClass	1	-
dc.type.docType	Article	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.relation.journalWebOfScienceCategory	Multidisciplinary Sciences	-
dc.relation.journalResearchArea	Science & Technology - Other Topics	-
dc.identifier.articleno	e0246354	-
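
The abstract in the record above names five barcode factors and a two-step procedure (a random initial set, then iterative replacement). The following is a minimal Python sketch of that general idea, not the authors' VFOS implementation. All function names, the fixed thresholds (GC content of 40-60%, runs of 3 or more, minimum distance of 3), and the unit penalty weights are hypothetical placeholders; the paper derives its penalty scores from distributions observed in random barcodes, which this sketch does not attempt to reproduce.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def gc_content(seq):
    """Fraction of G/C bases in the barcode."""
    return (seq.count("G") + seq.count("C")) / len(seq)

def longest_homopolymer(seq):
    """Length of the longest run of one repeated base."""
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best

def max_dinucleotide_repeat(seq):
    """Most consecutive copies of a two-base unit, e.g. ACACAC has 3 copies."""
    best = 1
    for start in range(len(seq) - 1):
        unit = seq[start:start + 2]
        if unit[0] == unit[1]:
            continue  # runs like AAAA are handled as homopolymers
        copies, pos = 1, start + 2
        while seq[pos:pos + 2] == unit:
            copies += 1
            pos += 2
        best = max(best, copies)
    return best

def hamming(a, b):
    """Number of mismatching positions between two equal-length barcodes."""
    return sum(x != y for x, y in zip(a, b))

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def self_penalty(seq):
    """Per-barcode penalty; thresholds and weights are assumptions."""
    penalty = 0
    if not 0.4 <= gc_content(seq) <= 0.6:
        penalty += 1
    if longest_homopolymer(seq) >= 3:
        penalty += 1
    if max_dinucleotide_repeat(seq) >= 3:
        penalty += 1
    return penalty

def pair_penalty(a, b, min_dist=3):
    """Pairwise penalty for barcodes that are too similar or too complementary."""
    penalty = 0
    if hamming(a, b) < min_dist:
        penalty += 1
    if hamming(a, reverse_complement(b)) < min_dist:
        penalty += 1
    return penalty

def barcode_penalty(i, barcodes):
    """Penalty attributable to barcode i: its own score plus every pair it is in."""
    seq = barcodes[i]
    return self_penalty(seq) + sum(
        pair_penalty(seq, other) for j, other in enumerate(barcodes) if j != i
    )

def total_penalty(barcodes):
    """Penalty of the whole set, counting each unordered pair once."""
    own = sum(self_penalty(s) for s in barcodes)
    pairs = sum(
        pair_penalty(a, b) for i, a in enumerate(barcodes) for b in barcodes[i + 1:]
    )
    return own + pairs

def select_barcodes(length, count, iterations=2000, seed=0):
    rng = random.Random(seed)

    def random_barcode():
        return "".join(rng.choice("ACGT") for _ in range(length))

    # Step i: random generation of an initial set.
    barcodes = [random_barcode() for _ in range(count)]
    # Step ii: repeatedly try to replace the worst-scoring barcode.
    for _ in range(iterations):
        scores = [barcode_penalty(i, barcodes) for i in range(count)]
        worst = max(range(count), key=scores.__getitem__)
        if scores[worst] == 0:
            break  # nothing left to improve
        previous = barcodes[worst]
        barcodes[worst] = random_barcode()
        if barcode_penalty(worst, barcodes) >= scores[worst]:
            barcodes[worst] = previous  # keep only strict improvements
    return barcodes
```

A call such as `select_barcodes(12, 50)` would mirror the two inputs the abstract mentions (barcode length and number of barcodes). This sketch rescans every barcode on each iteration, so it is only practical for small sets; generating something on the scale of the 100,000 barcodes reported in the abstract would need a far more efficient scoring scheme.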

Appears in Collections: 1. College of Medicine (의과대학) > Dept. of Biomedical Systems Informatics (의생명시스템정보학교실) > 1. Journal Papers
