Anisha Nachnani, University School of Nashville, Nashville, TN
Introduction: Artificial intelligence (AI) has gained significant attention in gastroenterology because of its potential to enhance diagnostic accuracy and improve patient outcomes. The clinical reasoning of several AI systems, including ChatGPT, Gemini, and Bing, has been tested using medical licensure exams. Little is known, however, about their competency on the gastroenterology board exam. The purpose of this study was to assess the performance of the most common AI platforms on gastroenterology board exam-style questions.
Methods: A series of 50 gastroenterology board preparation questions from the American College of Gastroenterology Self-Assessment Program was presented to each AI system. The responses were evaluated, and the proportions of correct answers were compared across platforms using a chi-square test.
Results: A total of fifty questions were presented to each AI platform, ten from each of the following categories: Colon, Endoscopy, Esophagus, Liver, and Stomach. The accuracy of the AI systems was moderate, ranging from 60% to 74%. There was no statistically significant difference in the proportion of correctly answered questions among the three AI platforms (ChatGPT 74%, Gemini 60%, Bing 72%, p=0.26; Figure 1). ChatGPT had the highest accuracy and numerically outperformed the other two platforms.
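The comparison above can be approximately reproduced from the reported percentages. The following is a minimal sketch, not the author's actual analysis: it assumes each platform answered all 50 questions, so the correct-answer counts (37, 30, and 36) are reconstructed from 74%, 60%, and 72% rather than taken from the raw data. The function names are hypothetical. It computes a Pearson chi-square statistic for the 3 x 2 table of correct versus incorrect answers and uses the closed-form upper-tail probability for 2 degrees of freedom, exp(-x/2).

```python
import math

# Correct-answer counts reconstructed from the reported percentages,
# ASSUMING each platform answered all 50 questions (not stated as raw counts).
N_QUESTIONS = 50
correct = {"ChatGPT": 37, "Gemini": 30, "Bing": 36}  # 74%, 60%, 72% of 50


def chi_square_independence(correct_counts, n_per_group):
    """Pearson chi-square for a k x 2 table (correct vs. incorrect)."""
    k = len(correct_counts)
    total = k * n_per_group
    total_correct = sum(correct_counts.values())
    # Under the null hypothesis every platform shares the pooled accuracy.
    expected_correct = n_per_group * total_correct / total
    expected_incorrect = n_per_group - expected_correct
    stat = 0.0
    for c in correct_counts.values():
        stat += (c - expected_correct) ** 2 / expected_correct
        stat += ((n_per_group - c) - expected_incorrect) ** 2 / expected_incorrect
    return stat, k - 1  # degrees of freedom = (k - 1) * (2 - 1)


def p_value_df2(stat):
    """Upper-tail probability of a chi-square variable with exactly 2 df."""
    return math.exp(-stat / 2)


stat, df = chi_square_independence(correct, N_QUESTIONS)
p = p_value_df2(stat)
print(f"chi2 = {stat:.2f}, df = {df}, p = {p:.2f}")
```

Under these assumptions the script yields a p-value of about 0.26, consistent with the value reported above.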
Discussion: ChatGPT, Gemini, and Bing all demonstrated moderate accuracy when presented with board-style clinical scenarios related to gastroenterology. ChatGPT numerically outperformed Gemini and Bing, although the difference was not statistically significant. Further investigation is warranted to determine how best to integrate these AI systems into clinical practice and medical education.
Figure: Figure 1: Overall performance of the three artificial intelligence platforms (ChatGPT, Gemini, and Bing) on gastroenterology board-style practice questions.
Disclosures:
Anisha Nachnani indicated no relevant financial relationships.
Anisha Nachnani. P4913 - Use of Artificial Intelligence in Gastroenterology: A comparison of ChatGPT, Gemini and Bing systems, ACG 2024 Annual Scientific Meeting Abstracts. Philadelphia, PA: American College of Gastroenterology.