Information Extraction from Clinical Texts with Generative Pre-trained Transformer Models

DC Field: Value
dc.contributor.author: 김남오
dc.contributor.author: 김민수
dc.date.accessioned: 2025-05-02T00:10:43Z
dc.date.available: 2025-05-02T00:10:43Z
dc.date.issued: 2025-02
dc.identifier.uri: https://ir.ymlib.yonsei.ac.kr/handle/22282913/205303
dc.description.abstract: Purpose: Processing and analyzing clinical texts are challenging due to their unstructured nature. This study compared the performance of GPT (Generative Pre-trained Transformer)-3.5 and GPT-4 in extracting information from clinical texts. Materials and Methods: Three types of clinical texts, containing patient characteristics, medical history, and clinical test results, were extracted from case reports in open-access journals and used as input. Simple prompts containing queries for information extraction were then applied to both models with the greedy approach as the decoding strategy. When the GPT models underperformed on certain tasks, we applied alternative decoding strategies or incorporated prompts with task-specific definitions. The outputs generated by the GPT models were evaluated as True or False to determine the accuracy of information extraction. Results: Clinical texts containing patient characteristics (60 texts), medical history (50 texts), and clinical test results (25 texts) were extracted from 60 case reports. The GPT models accurately extracted straightforward information from clinical texts when given simple prompts. Regarding sex, GPT-4 demonstrated a significantly higher accuracy rate (95%) than GPT-3.5 (70%), whereas GPT-3.5 (78%) outperformed GPT-4 (57%) in extracting body mass index (BMI). Applying alternative decoding strategies to the sex and BMI tasks did not meaningfully improve the performance of either model. In GPT-4, revised prompts that included definitions of each sex category or the BMI formula corrected all incorrect responses regarding sex and BMI generated during the main workflow. Conclusion: The GPT models performed adequately with simple prompts for extracting straightforward information. For complex tasks, incorporating task-specific definitions into the prompts is a more suitable strategy than relying solely on simple prompts. Therefore, researchers and clinicians should use their expertise to create effective prompts and monitor LLM outcomes when extracting complex information from clinical texts.
dc.description.statementOfResponsibility: open
dc.language: English
dc.publisher: Ivyspring International Publisher
dc.relation.isPartOf: INTERNATIONAL JOURNAL OF MEDICAL SCIENCES
dc.rights: CC BY-NC-ND 2.0 KR
dc.subject.MESH: Data Mining* / methods
dc.subject.MESH: Female
dc.subject.MESH: Humans
dc.subject.MESH: Male
dc.subject.MESH: Natural Language Processing
dc.title: Information Extraction from Clinical Texts with Generative Pre-trained Transformer Models
dc.type: Article
dc.contributor.college: College of Medicine (의과대학)
dc.contributor.department: Dept. of Anesthesiology and Pain Medicine (마취통증의학교실)
dc.contributor.googleauthor: Min-Soo Kim
dc.contributor.googleauthor: Philip Chung
dc.contributor.googleauthor: Nima Aghaeepour
dc.contributor.googleauthor: Namo Kim
dc.identifier.doi: 10.7150/ijms.103332
dc.contributor.localId: A00356
dc.contributor.localId: A00463
dc.relation.journalcode: J02917
dc.identifier.eissn: 1449-1907
dc.identifier.pmid: 40027192
dc.subject.keyword: Access to Information
dc.subject.keyword: Medical Informatics
dc.subject.keyword: Medical Records
dc.subject.keyword: Natural Language Processing
dc.contributor.alternativeName: Kim, Namo
dc.contributor.affiliatedAuthor: 김남오
dc.contributor.affiliatedAuthor: 김민수
dc.citation.volume: 22
dc.citation.number: 5
dc.citation.startPage: 1015
dc.citation.endPage: 1028
dc.identifier.bibliographicCitation: INTERNATIONAL JOURNAL OF MEDICAL SCIENCES, Vol.22(5) : 1015-1028, 2025-02
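
The abstract above reports that, for GPT-4, adding a task-specific definition such as the BMI formula (weight in kilograms divided by the square of height in meters) to the prompt corrected the errors produced with simple prompts. The minimal Python sketch below illustrates that prompting pattern; it is not code from the paper. It assumes the OpenAI Chat Completions API, uses temperature=0 as a stand-in for the greedy decoding strategy, and the model name, prompt wording, and the extract_bmi helper are illustrative assumptions.

# Minimal illustrative sketch (not from the paper): BMI extraction with a
# task-specific definition embedded in the prompt. Model name, prompt wording,
# and helper name are assumptions; temperature=0 approximates greedy decoding.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Revised-style prompt: the BMI definition (weight[kg] / height[m]^2) is stated
# explicitly, mirroring the approach the abstract describes for GPT-4.
PROMPT_TEMPLATE = (
    "Body mass index (BMI) is defined as weight in kilograms divided by the "
    "square of height in meters (kg/m^2).\n"
    "From the clinical text below, extract the patient's BMI. If weight and "
    "height are given instead, compute the BMI and report it rounded to one "
    "decimal place. Answer with the number only, or 'unknown' if it cannot be "
    "determined.\n\n"
    "Clinical text:\n{clinical_text}"
)

def extract_bmi(clinical_text: str, model: str = "gpt-4") -> str:
    """Send a single extraction query and return the model's raw answer."""
    response = client.chat.completions.create(
        model=model,
        temperature=0,  # deterministic, greedy-like decoding
        messages=[
            {"role": "user",
             "content": PROMPT_TEMPLATE.format(clinical_text=clinical_text)},
        ],
    )
    return response.choices[0].message.content.strip()

if __name__ == "__main__":
    example = ("A 45-year-old man (height 175 cm, weight 82 kg) presented "
               "with chest pain.")
    print(extract_bmi(example))  # expected answer: about 26.8

The same template can be reused for other fields (e.g., sex) by swapping in the relevant category definitions, which is the prompt-revision strategy the abstract recommends for complex tasks.
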
Appears in Collections:
1. College of Medicine (의과대학) > Dept. of Anesthesiology and Pain Medicine (마취통증의학교실) > 1. Journal Papers

