Artificial intelligence in gastroenterology clinical practice: Scoping review of large language model applications


Yazarkan Y., Sonmez G., ŞİMŞEK C.

International Journal of Medical Informatics, vol.214, 2026 (SCI-Expanded, Scopus)

  • Publication Type: Article / Review
  • Volume: 214
  • Publication Date: 2026
  • Doi Number: 10.1016/j.ijmedinf.2026.106413
  • Journal Name: International Journal of Medical Informatics
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, BIOSIS, CINAHL, Compendex, EMBASE, INSPEC, MEDLINE
  • Keywords: Artificial Intelligence, Clinical Decision Support, Gastroenterology, Large Language Models, Patient Education
  • Hacettepe University Affiliated: Yes

Abstract

Background: Advances in artificial intelligence have brought renewed attention to tools that can work with the large volume of written information generated in clinical practice. Among these, large language models (LLMs) stand out for their ability to interpret and generate medical text in a flexible, context-aware way. Gastroenterology, like many specialties, produces a wide range of narrative and semi-structured data, which has encouraged researchers to explore how LLMs might help clinicians manage everyday tasks. Recent studies have examined their potential contributions to patient education, communication between care teams and patients, decision support, and routine documentation, reflecting growing interest in how these systems might fit into real-world clinical workflows.

Objective: This scoping review aimed to map current applications of LLMs in gastroenterology clinical practice, including subspecialty focus, study designs, model types, and reported outcomes.

Methods: Following PRISMA guidelines, a systematic search was conducted in PubMed, Scopus, and Web of Science for studies published between January 2022 and August 2025. Eligible studies were original research assessing LLM applications in gastroenterology clinical practice. Data were extracted on subspecialty, application domain, LLM type, data source, and outcomes. We employed thematic analysis to address our primary research question.

Results: Of the 2895 studies identified in the initial search, 73 met the inclusion criteria. Six subspecialties and six application domains emerged from our review. Hepatology (20/73 studies, 27.3%) and endoscopy (17/73 studies, 23.2%) were the most represented subspecialties. The most frequently investigated application domains were patient education and communication (38 studies) and decision support and clinical guidance (24 studies). Most studies relied on simulated or literature-based cases, although an increasing number have used real-world clinical data, particularly in recent years. The majority evaluated general-purpose models such as GPT-3.5 and GPT-4, with some incorporating retrieval augmentation or fine-tuning. Reported outcomes varied by application domain and included measures of accuracy, concordance, completeness, relevance, safety, reliability, usability, user satisfaction, efficiency, time savings, and educational value. Commonly described limitations included variable reliability, incomplete responses, and challenges in generalizing from simulated to clinical settings.

Conclusions: Research on LLMs in gastroenterology has expanded across multiple subspecialties and application domains. Current evidence is primarily based on simulation studies, with limited but growing evaluation using real-world clinical data. Further work is needed to assess performance in prospective and applied clinical contexts.