Susceptibility of Large Language Models to User-Driven Factors in Medical Queries
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Lim, Kyung Ho | - |
| dc.contributor.author | Kang, Ujin | - |
| dc.contributor.author | Li, Xiang | - |
| dc.contributor.author | Kim, Jin Sung | - |
| dc.contributor.author | Jung, Young-Chul | - |
| dc.contributor.author | Park, Sangjoon | - |
| dc.contributor.author | Kim, Byung-Hoon | - |
| dc.date.accessioned | 2026-01-20T05:28:05Z | - |
| dc.date.available | 2026-01-20T05:28:05Z | - |
| dc.date.created | 2026-01-14 | - |
| dc.date.issued | 2025-12 | - |
| dc.identifier.issn | 2509-4971 | - |
| dc.identifier.uri | https://ir.ymlib.yonsei.ac.kr/handle/22282913/210031 | - |
| dc.description.abstract | Large language models (LLMs) are increasingly used in healthcare; however, their reliability is shaped not only by model design but also by how queries are phrased and how complete the provided information is. This study assesses how user-driven factors, including misinformation framing, source authority, model personas, and omission of critical clinical details, influence the diagnostic accuracy and reliability of LLM-generated medical responses. Using two public datasets (MedQA and Medbullets), we conducted two tests: (1) a perturbation test, evaluating LLM persona (assistant vs. expert AI), misinformation source authority (inexperienced vs. expert), and tone (assertive vs. hedged); and (2) an ablation test, omitting key clinical data. Proprietary LLMs (GPT-4o (OpenAI), Claude-3.5 Sonnet (Anthropic), Claude-3.5 Haiku (Anthropic), Gemini-1.5 Pro (Google), Gemini-1.5 Flash (Google)) and open-source LLMs (LLaMA-3 8B, LLaMA-3 Med42 8B, DeepSeek-R1 8B) were used for evaluation. Results show that in the perturbation test, all LLMs were susceptible to user-driven misinformation, with an assertive tone exerting the strongest overall impact, while proprietary models were more vulnerable to strong or authoritative misinformation. In the ablation test, omitting physical examination findings and laboratory results caused the largest accuracy decline. Proprietary models achieved higher baseline accuracy but showed sharper performance drops under biased or incomplete input. These findings highlight that structured prompts and a complete clinical context are essential for accurate responses. Users should avoid authoritative misinformation framing and provide a complete clinical context, especially for complex and challenging queries. By clarifying the impact of user-driven biases, this study contributes insights toward the safe integration of LLMs into healthcare practice. | - |
| dc.language | English | - |
| dc.publisher | SPRINGERNATURE | - |
| dc.relation.isPartOf | JOURNAL OF HEALTHCARE INFORMATICS RESEARCH | - |
| dc.title | Susceptibility of Large Language Models to User-Driven Factors in Medical Queries | - |
| dc.type | Article | - |
| dc.contributor.googleauthor | Lim, Kyung Ho | - |
| dc.contributor.googleauthor | Kang, Ujin | - |
| dc.contributor.googleauthor | Li, Xiang | - |
| dc.contributor.googleauthor | Kim, Jin Sung | - |
| dc.contributor.googleauthor | Jung, Young-Chul | - |
| dc.contributor.googleauthor | Park, Sangjoon | - |
| dc.contributor.googleauthor | Kim, Byung-Hoon | - |
| dc.identifier.doi | 10.1007/s41666-025-00218-4 | - |
| dc.identifier.url | https://link.springer.com/article/10.1007/s41666-025-00218-4 | - |
| dc.subject.keyword | Large language model | - |
| dc.subject.keyword | Natural language processing | - |
| dc.subject.keyword | Artificial intelligence | - |
| dc.subject.keyword | Clinical decision support systems | - |
| dc.subject.keyword | Diagnostic errors | - |
| dc.subject.keyword | Bias | - |
| dc.contributor.affiliatedAuthor | Lim, Kyung Ho | - |
| dc.contributor.affiliatedAuthor | Kim, Jin Sung | - |
| dc.contributor.affiliatedAuthor | Jung, Young-Chul | - |
| dc.contributor.affiliatedAuthor | Park, Sangjoon | - |
| dc.contributor.affiliatedAuthor | Kim, Byung-Hoon | - |
| dc.identifier.scopusid | 2-s2.0-105023537542 | - |
| dc.identifier.wosid | 001627903000001 | - |
| dc.identifier.bibliographicCitation | JOURNAL OF HEALTHCARE INFORMATICS RESEARCH, 2025-12 | - |
| dc.identifier.rimsid | 90939 | - |
| dc.type.rims | ART | - |
| dc.description.journalClass | 1 | - |
| dc.subject.keywordAuthor | Large language model | - |
| dc.subject.keywordAuthor | Natural language processing | - |
| dc.subject.keywordAuthor | Artificial intelligence | - |
| dc.subject.keywordAuthor | Clinical decision support systems | - |
| dc.subject.keywordAuthor | Diagnostic errors | - |
| dc.subject.keywordAuthor | Bias | - |
| dc.type.docType | Article; Early Access | - |
| dc.description.isOpenAccess | N | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | - |
| dc.relation.journalWebOfScienceCategory | Health Care Sciences & Services | - |
| dc.relation.journalWebOfScienceCategory | Medical Informatics | - |
| dc.relation.journalResearchArea | Computer Science | - |
| dc.relation.journalResearchArea | Health Care Sciences & Services | - |
| dc.relation.journalResearchArea | Medical Informatics | - |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.