Success and failure of human-AI collaboration in clinical reasoning: An experimental study on challenging real-world cases
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Ong, Kai Tzu-iunn | - |
| dc.contributor.author | Seo, Junwon | - |
| dc.contributor.author | Kim, Hyojun | - |
| dc.contributor.author | Kim, Jiwoo | - |
| dc.contributor.author | Kim, Jihoon | - |
| dc.contributor.author | Kim, Sunghwan | - |
| dc.contributor.author | Yeo, Jinyoung | - |
| dc.contributor.author | Choi, Eun Young | - |
| dc.contributor.author | 최은영 | - |
| dc.date.accessioned | 2026-03-25T03:10:36Z | - |
| dc.date.available | 2026-03-25T03:10:36Z | - |
| dc.date.created | 2026-03-20 | - |
| dc.date.issued | 2026-05 | - |
| dc.identifier.issn | 1386-5056 | - |
| dc.identifier.uri | https://ir.ymlib.yonsei.ac.kr/handle/22282913/211452 | - |
| dc.description.abstract | Background: While conversational human-AI collaboration (HAC) using large language models (LLMs) has shown potential to enhance clinical reasoning, its effectiveness in highly specialized and challenging clinical scenarios remains unclear. This study aimed to evaluate the effectiveness of HAC and to analyze the causes of its success and failure. Methods: A crossover experimental study was conducted using 30 challenging cases from JAMA Ophthalmology. Thirty participants (10 board-certified ophthalmologists, 10 ophthalmology residents, and 10 senior medical students) completed the cases under two conditions: independent work (human-only) and collaboration through free-text conversation with Claude-3.5-Sonnet (HAC). Performance accuracy, self-rated confidence, and cognitive burden were assessed. HAC interaction logs were analyzed to evaluate the appropriateness of the LLM's accepting and arguing behaviors, which were categorized into six patterns. Sliding paired t-tests across incremental thresholds were used to assess how accuracy gains from HAC varied by task difficulty. Results: HAC significantly improved mean accuracy compared to the human-only condition (from 0.45 to 0.60, P < 0.001), although 20% of participants showed a decline in performance and the mean remained below the LLM-only accuracy (0.70). HAC significantly increased confidence and reduced cognitive burden (both P < 0.001) in both successful and failed HAC. The appropriateness of LLM behaviors was substantially higher in successful HAC than in failed HAC (F1 score = 0.92 vs. 0.29, P < 0.001). In successful HAC, 92.6% followed the pattern "LLM presents correct insight/human accepts", while 58.6% of failures involved "LLM presents incorrect insight/human accepts". HAC improved accuracy significantly in tasks where the human-only correct response rate exceeded 47% (P < 0.05), but not where it was below 30% (P >= 0.188). Conclusions: These findings suggest that HAC benefits complex clinical decisions in ophthalmology but remains limited by human, model, and task-level factors requiring further improvement. | - |
| dc.language | English | - |
| dc.publisher | Elsevier Science Ireland Ltd. | - |
| dc.relation.isPartOf | INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS | - |
| dc.subject.MESH | Adult | - |
| dc.subject.MESH | Clinical Reasoning* | - |
| dc.subject.MESH | Cooperative Behavior* | - |
| dc.subject.MESH | Cross-Over Studies | - |
| dc.subject.MESH | Female | - |
| dc.subject.MESH | Humans | - |
| dc.subject.MESH | Male | - |
| dc.title | Success and failure of human-AI collaboration in clinical reasoning: An experimental study on challenging real-world cases | - |
| dc.type | Article | - |
| dc.contributor.googleauthor | Ong, Kai Tzu-iunn | - |
| dc.contributor.googleauthor | Seo, Junwon | - |
| dc.contributor.googleauthor | Kim, Hyojun | - |
| dc.contributor.googleauthor | Kim, Jiwoo | - |
| dc.contributor.googleauthor | Kim, Jihoon | - |
| dc.contributor.googleauthor | Kim, Sunghwan | - |
| dc.contributor.googleauthor | Yeo, Jinyoung | - |
| dc.contributor.googleauthor | Choi, Eun Young | - |
| dc.identifier.doi | 10.1016/j.ijmedinf.2026.106342 | - |
| dc.relation.journalcode | J01129 | - |
| dc.identifier.eissn | 1872-8243 | - |
| dc.identifier.pmid | 41689881 | - |
| dc.subject.keyword | human-AI collaboration | - |
| dc.subject.keyword | Clinical reasoning | - |
| dc.subject.keyword | Ophthalmology | - |
| dc.subject.keyword | Large language model | - |
| dc.subject.keyword | Confidence | - |
| dc.subject.keyword | Cognitive burden | - |
| dc.subject.keyword | Model behaviors | - |
| dc.subject.keyword | Task difficulty | - |
| dc.contributor.affiliatedAuthor | Seo, Junwon | - |
| dc.contributor.affiliatedAuthor | Kim, Jiwoo | - |
| dc.contributor.affiliatedAuthor | Choi, Eun Young | - |
| dc.identifier.scopusid | 2-s2.0-105029904759 | - |
| dc.identifier.wosid | 001702499200001 | - |
| dc.citation.volume | 211 | - |
| dc.identifier.bibliographicCitation | INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS, Vol.211, 2026-05 | - |
| dc.identifier.rimsid | 91992 | - |
| dc.type.rims | ART | - |
| dc.description.journalClass | 1 | - |
| dc.subject.keywordAuthor | human-AI collaboration | - |
| dc.subject.keywordAuthor | Clinical reasoning | - |
| dc.subject.keywordAuthor | Ophthalmology | - |
| dc.subject.keywordAuthor | Large language model | - |
| dc.subject.keywordAuthor | Confidence | - |
| dc.subject.keywordAuthor | Cognitive burden | - |
| dc.subject.keywordAuthor | Model behaviors | - |
| dc.subject.keywordAuthor | Task difficulty | - |
| dc.subject.keywordPlus | OVERCONFIDENCE | - |
| dc.type.docType | Article | - |
| dc.description.isOpenAccess | Y | - |
| dc.description.journalRegisteredClass | scie | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | - |
| dc.relation.journalWebOfScienceCategory | Health Care Sciences & Services | - |
| dc.relation.journalWebOfScienceCategory | Medical Informatics | - |
| dc.relation.journalResearchArea | Computer Science | - |
| dc.relation.journalResearchArea | Health Care Sciences & Services | - |
| dc.relation.journalResearchArea | Medical Informatics | - |
| dc.identifier.articleno | 106342 | - |