AI Accuracy Varies by Question Type in Oral and Maxillofacial Radiology Exams; Gemini Leads Factual Recall

Radiology AI literature (PubMed), 3d ago

Across 258 dental specialty exam questions, AI model accuracy varied by question type: Gemini led on factual recall (p=0.048), while on analytical reasoning (p=0.032) no model showed pairwise superiority. AI may aid education but is not yet reliable for clinical decisions in oral radiology.

Cross-sectional study of 258 multiple-choice questions (202 knowledge-based, 56 analytical) from the Turkish Dental Specialty Examination (2012–2021).
Gemini 3 Flash had the highest accuracy on knowledge-based questions and Claude Sonnet 4.5 the lowest; no model showed pairwise superiority on analytical questions.
Limitation: The evaluation used a static question set from a single country's exam; generalizability to real-world clinical reasoning is untested.

RadPigeon summaries are original and for information only. They are not clinical advice.