The following is a summary of “Artificial Intelligence–enabled Decision Support in Surgery,” published in the June 2023 issue of Surgery by Loftus et al.
The authors aimed to provide a concise overview of recent advances in artificial intelligence-based decision support systems in surgery and to assess shortcomings in scientific rigor and reporting. To improve surgical care, decision-support models must go beyond existing reporting guideline requirements by performing external and real-time validation, enrolling sufficient sample sizes, reporting measures of precision, evaluating performance across vulnerable populations, and achieving clinical implementation.
The extent to which published models meet these criteria is unknown. The Embase, PubMed, and MEDLINE databases were searched systematically from inception to September 21, 2022, for articles describing artificial intelligence–enabled decision support in surgery.
Eligible studies used preoperative or intraoperative data elements to predict complications within 90 days after surgery. Scientific rigor and reporting criteria were assessed and reported following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) guidelines.
Sample sizes ranged from 163 to 2,882,526 participants. Of the 36 articles reviewed, 8 (22.2%) had sample sizes below 2,000. Of these 8 articles, 7 (87.5%) had below-average area under the receiver operating characteristic curve (AUROC) or accuracy values, meaning values below 0.83. In total, 29 articles (80.6%) performed internal validation only, 5 (13.8%) performed external validation, and 2 (5.6%) performed real-time validation. Twenty-three articles (63.9%) reported a measure of precision.
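The proportions above follow directly from the article counts. A minimal Python sketch, using only the counts quoted in this summary (the labels and the helper name `percent` are illustrative), recomputes them:

```python
# Counts quoted in the summary; the dictionary keys are illustrative labels.
TOTAL_ARTICLES = 36
counts = {
    "sample size below 2000": 8,
    "internal validation only": 29,
    "external validation": 5,
    "real-time validation": 2,
    "reported a measure of precision": 23,
}


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


shares = {label: percent(n, TOTAL_ARTICLES) for label, n in counts.items()}

for label, share in shares.items():
    print(f"{label}: {counts[label]}/{TOTAL_ARTICLES} = {share}%")

# Share of the 8 small-sample articles with below-average performance.
small_sample_below_average = percent(7, 8)
print(f"below-average among small-sample articles: {small_sample_below_average}%")
```

By this arithmetic, 5 of 36 rounds to 13.9%, marginally above the 13.8% quoted; the difference is a matter of rounding and does not affect the conclusions.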
No articles reported performance across sociodemographic categories. Thirteen articles (36.1%) presented a framework that could be used for clinical implementation; however, none evaluated its efficacy in clinical practice. Artificial intelligence-enabled decision support in surgery is limited by reliance on internal validation, small sample sizes that risk overfitting and compromise predictive accuracy, and failure to report confidence intervals, precision, equity analyses, and clinical implementation. Researchers should aspire to higher standards of scientific rigor and reporting.
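Reporting precision typically means attaching a confidence interval to a performance estimate such as the area under the receiver operating characteristic curve (AUROC). The sketch below is illustrative only: it uses synthetic data and a percentile bootstrap, a common general-purpose technique, and is not drawn from the review or from any of the models it covers. The function names `auroc` and `bootstrap_ci`, the cohort size, and the score distributions are all assumptions made for the example.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a random positive case scores
    higher than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUROC."""
    rng = random.Random(seed)
    n = len(labels)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacks one class, so AUROC is undefined
        estimates.append(auroc(ys, [scores[i] for i in idx]))
    estimates.sort()
    lo_idx = max(0, round(alpha / 2 * n_boot))
    hi_idx = min(n_boot - 1, round((1 - alpha / 2) * n_boot) - 1)
    return estimates[lo_idx], estimates[hi_idx]


# Synthetic cohort (hypothetical): 120 patients, roughly 20% with a complication.
rng = random.Random(42)
labels = [0] * 96 + [1] * 24
scores = [rng.gauss(0.6 if y else 0.4, 0.2) for y in labels]

point = auroc(labels, scores)
lo, hi = bootstrap_ci(labels, scores)
print(f"AUROC {point:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

With a small cohort the interval tends to be wide, which is one reason a point estimate alone can overstate how well a model is likely to perform on new patients.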