Can artificial intelligence teach histology? A pilot study comparing multimodal large language models in organ identification and histological feature recognition


Çam S. B., Öyken F. A.

2nd International Microscopy and Spectroscopy Congress, İstanbul, Turkey, 25 - 27 September 2025, pp.127, (Summary Text)

  • Publication Type: Conference Paper / Summary Text
  • City: İstanbul
  • Country: Turkey
  • Page Numbers: pp.127
  • Hacettepe University Affiliated: Yes

Abstract

INTRODUCTION: Artificial intelligence is reshaping medical education, and multimodal large language models (MLLMs) offer strong potential by combining text and image analysis. This combination makes them well suited to histology teaching, where reliable recognition of tissues and microscopic structures is vital. However, their effectiveness in medical histology education remains unclear.

AIM: This pilot study examines the accuracy of leading MLLMs in histological interpretation tasks within preclinical curricula.

METHODS: Sixty micrographs of healthy rat kidney, liver, thyroid, esophagus, and trachea (n=12 each) were selected from the MMO-Net database (1). Regions of interest were captured using QuPath, and the images were submitted to Gemini Pro 2.5, GPT-5, and Claude 4 Sonnet with two standardized bilingual prompts: organ identification (prompt 1) and histological feature description (prompt 2). Responses were recorded and analyzed.

RESULTS: For organ identification, Gemini outperformed GPT and Claude in English (p = 0.0133 and p < 0.0001, respectively) and surpassed Claude in Turkish (p = 0.0421). For histological description, Gemini demonstrated superior accuracy in both English (p = 0.02 and p < 0.0001) and Turkish (p = 0.0133 and p < 0.0001). All models performed better on parenchymal than on luminal organs; the latter showed more variable success rates. Claude's performance declined significantly when moving from simple identification to the more complex description task in Turkish and English (p < 0.0001 and p = 0.0004, respectively). GPT showed a similar decline in English (p = 0.0241).
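
The abstract does not state which statistical test produced the p-values above. As one possible approach for paired comparisons of this kind (two models scored correct or incorrect on the same micrographs), an exact McNemar test could be applied. The sketch below is only an illustration under that assumption: the function name `mcnemar_exact` and the example outcome lists are hypothetical and are not the study's method or data.

```python
from math import comb


def mcnemar_exact(model_a_correct, model_b_correct):
    """Two-sided exact McNemar p-value for paired correct/incorrect outcomes.

    Each argument is a sequence of booleans, one entry per image, in the
    same image order for both models. (Hypothetical helper, not from the study.)
    """
    if len(model_a_correct) != len(model_b_correct):
        raise ValueError("both models must be scored on the same images")

    # Only discordant pairs carry information about a difference.
    a_only = sum(1 for a, b in zip(model_a_correct, model_b_correct) if a and not b)
    b_only = sum(1 for a, b in zip(model_a_correct, model_b_correct) if b and not a)
    n = a_only + b_only
    if n == 0:
        return 1.0  # the models never disagree

    # Under the null hypothesis, discordant pairs split 50/50 (binomial, p = 0.5).
    tail = sum(comb(n, k) for k in range(min(a_only, b_only) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Made-up outcomes for 60 images, for illustration only.
    model_a = [True] * 50 + [False] * 10
    model_b = [True] * 40 + [False] * 20
    print(f"exact McNemar p = {mcnemar_exact(model_a, model_b):.4f}")
```

A paired test is used in this sketch because every model sees the same 60 images, so per-image outcomes are not independent across models; an unpaired comparison of proportions would ignore that structure.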

CONCLUSION: Medical students mainly need reliable tissue identification and feature recognition. Our findings reveal inconsistent MLLM performance, varying by organ type and task complexity, emphasizing the need for specialized, validated AI models in histology education.

REFERENCES: 1. Serna, C. G. et al. 2022. MMO-Net (Multi-Magnification Organ Network): A use case for Organ Identification using Multiple Magnifications in Preclinical Pathology Studies. Journal of Pathology Informatics, 13, 100126.