DeepSeek-R1 edges out ChatGPT-o1 on simulated radiology board-style questions, but image-based ratings lag

Radiology AI literature (PubMed), 3d ago

On 27 radiology questions, DeepSeek-R1 scored higher than ChatGPT-o1 (mean 4.51 vs 3.73 on a 5-point scale, P<.001). When ChatGPT-o1 answered image-based questions, residents rated it lower than its own text answers, particularly for factual accuracy (mean 2.75). Both models sho…

RadPigeon summaries are original and for information only. They are not clinical advice.