Human readers deserve ROC curves too: a fix for biased AI comparisons
Lauren Oakden-Rayner, Dec 8
Medical AI studies systematically underestimate human diagnostic performance because they compare a single human sensitivity/specificity point against the AI's full ROC curve. A simple proposed fix: obtain multiple confidence ratings from human readers to reconstruct their ROC curv…
- Many AI studies compare the area under the AI's ROC curve to a single human operating point, inflating AI's apparent superiority.
- The proposed method asks human readers to provide confidence ratings (e.g., definitely normal to definitely abnormal), allowing construction of a human ROC curve.
- This is an opinion piece; the approach has not been prospectively validated or widely adopted, and may add complexity to study design.
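
As a rough illustration of the confidence-rating idea in the points above, the sketch below builds a reader's ROC curve from ordinal ratings and integrates it with the trapezoid rule. It is a minimal sketch, not the method from the original piece: the 5-point scale (1 = definitely normal, 5 = definitely abnormal), the function names `roc_from_ratings` and `trapezoid_auc`, and the toy data are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def roc_from_ratings(
    ratings: Sequence[int], labels: Sequence[int], n_levels: int = 5
) -> List[Tuple[float, float]]:
    """Return ROC points (FPR, TPR) from ordinal confidence ratings.

    Assumes ratings run from 1 (definitely normal) to n_levels
    (definitely abnormal) and labels are 1 for disease, 0 for no disease.
    """
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative case")

    points = [(0.0, 0.0)]
    # Sweep the threshold from strictest to most lenient:
    # a case is called abnormal when its rating is >= t.
    for t in range(n_levels, 0, -1):
        tp = sum(1 for r, y in zip(ratings, labels) if r >= t and y == 1)
        fp = sum(1 for r, y in zip(ratings, labels) if r >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points  # the last point is always (1.0, 1.0)


def trapezoid_auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under a piecewise-linear ROC curve given as (FPR, TPR) points."""
    auc = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        auc += (x1 - x0) * (y0 + y1) / 2.0
    return auc


# Hypothetical toy data: ten cases rated by one reader.
ratings = [1, 2, 2, 3, 4, 3, 4, 5, 5, 1]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

human_curve = roc_from_ratings(ratings, labels)
human_auc = trapezoid_auc(human_curve)

# The same reader reduced to one binary call (abnormal if rating >= 3)
# gives a single operating point; joining it to (0, 0) and (1, 1)
# yields the area implied by that point alone.
single_point = human_curve[3]
single_point_auc = trapezoid_auc([(0.0, 0.0), single_point, (1.0, 1.0)])

print(human_curve)
print(human_auc, single_point_auc)
```

In this toy data the area from the full rating-based curve is larger than the area implied by the single operating point, which is the kind of gap the piece argues is ignored when only one human sensitivity/specificity pair is reported. Real reader studies would typically also need confidence intervals and multi-reader analysis, which this sketch leaves out.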
RadPigeon summaries are original and for information only. They are not clinical advice.
