The following is a summary of “Diagnostic performance of advanced large language models in cystoscopy: evidence from a retrospective study and clinical cases,” published in the March 2025 issue of BMC Urology by Guo et al.
Researchers conducted a retrospective study to assess large language models (LLMs) in diagnosing urological conditions from cystoscopy images.
They analyzed 603 cystoscopy images from 101 procedures, using 2 advanced LLMs for interpretation. Results were compared with standard clinical diagnostics, measuring overall and condition-specific accuracies.
The results showed a combined diagnostic accuracy of 89.2% for the 2 models, with individual accuracies of 82.8% for ChatGPT-4 V and 79.8% for Claude 3.5 Sonnet. Condition-specific accuracies for ChatGPT-4 V and Claude 3.5 Sonnet, respectively, were 92.2% and 80.9% for bladder tumors; 35.3% and 32.4% for benign prostatic hyperplasia (BPH); 94.5% and 98.9% for cystitis; 92.3% and 53.8% for bladder diverticula; and 55.8% and 59.6% for bladder trabeculae. For normal structures, the respective accuracies were 48.8% and 61.0% for the ureteral orifice; 97.9% and 93.8% for the bladder neck; and 64.3% and 57.1% for the prostatic urethra.
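For readers who want to see how overall and condition-specific accuracies of this kind are typically tallied, here is a minimal Python sketch. It is not the authors' method. The records, the label names, and the `accuracy_report` function are hypothetical, and the "either" column assumes that a combined accuracy counts an image as correct when at least one model matches the reference diagnosis, which this summary does not confirm.

```python
from collections import defaultdict

# Hypothetical toy records: (reference diagnosis, model A output, model B output).
# These labels are made up for illustration and are not data from the study.
RECORDS = [
    ("cystitis", "cystitis", "cystitis"),
    ("cystitis", "cystitis", "bladder tumor"),
    ("bladder tumor", "bladder tumor", "bladder tumor"),
    ("bladder tumor", "cystitis", "bladder tumor"),
    ("BPH", "normal", "normal"),
    ("BPH", "BPH", "normal"),
]

KEYS = ("a", "b", "either")


def accuracy_report(records):
    """Compute overall and per-condition accuracy for two models.

    "either" counts an image as correct when at least one model matches the
    reference label. That is an assumed reading of "combined" accuracy.
    """
    totals = defaultdict(int)
    hits = defaultdict(lambda: {key: 0 for key in KEYS})

    for reference, pred_a, pred_b in records:
        a_ok = pred_a == reference
        b_ok = pred_b == reference
        totals[reference] += 1
        hits[reference]["a"] += a_ok
        hits[reference]["b"] += b_ok
        hits[reference]["either"] += a_ok or b_ok

    n = len(records)
    # Overall accuracy: correct images divided by all images.
    report = {
        "overall": {key: sum(h[key] for h in hits.values()) / n for key in KEYS}
    }
    # Condition-specific accuracy: correct images divided by images of that condition.
    for condition, count in totals.items():
        report[condition] = {key: hits[condition][key] / count for key in KEYS}
    return report


if __name__ == "__main__":
    for name, scores in accuracy_report(RECORDS).items():
        print(name, {key: round(value, 3) for key, value in scores.items()})
```

Under this assumed definition, the "either" figure can never be lower than either model's own accuracy, which is consistent with the combined 89.2% exceeding both individual results.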
Investigators found that the LLMs varied in diagnostic accuracy for cystoscopy interpretation, excelling in cystitis but showing lower accuracy for benign prostatic hyperplasia. They highlighted the potential of LLMs as supportive tools for early-career urologists and the need for further research to enhance AI-driven diagnostics.
Source: bmcurol.biomedcentral.com/articles/10.1186/s12894-025-01740-8