Venous thromboembolism (VTE) is a leading cause of preventable in-hospital mortality. Monitoring VTE cases is limited by the challenges of manual chart review and diagnosis code interpretation. Natural language processing (NLP) can automate this process. Rule-based NLP methods are effective but time-consuming to develop, and machine learning (ML)-NLP methods present a promising alternative. We conducted a systematic review and meta-analysis of studies published before May 2023 that used ML-NLP to identify VTE diagnoses in electronic health records. Four reviewers screened all manuscripts, excluding studies that used only a rule-based method. The meta-analysis pooled the performance of each study's best-performing model for identifying pulmonary embolism (PE) and/or deep vein thrombosis (DVT). Pooled sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) with 95% confidence intervals (CIs) were calculated using a random-effects model with the DerSimonian and Laird method. Study quality was assessed using an adapted TRIPOD tool. Thirteen studies were included in the systematic review, of which 8 had data available for meta-analysis. Pooled sensitivity was 0.931 (95% CI 0.881-0.962), specificity 0.984 (95% CI 0.967-0.992), PPV 0.910 (95% CI 0.865-0.941), and NPV 0.985 (95% CI 0.977-0.990). All studies met at least 13 of the 21 NLP-modified TRIPOD items, indicating fair quality. The highest-performing models used vectorization rather than bag-of-words representations, together with deep learning techniques such as convolutional neural networks. There was significant heterogeneity among the studies, and only four validated their model on an external dataset. Further standardization of ML studies can help progress this novel technology towards real-world implementation.
Copyright © 2024 American Society of Hematology.
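For readers unfamiliar with the pooling step, the following is a minimal sketch of DerSimonian and Laird random-effects pooling applied to a proportion such as sensitivity. It is not the authors' code. The logit transformation, the 0.5 continuity correction, the function name `pool_proportions_dl`, and the per-study counts in the usage lines are all assumptions made for illustration; the abstract does not state which transformation or software was used.

```python
import math


def pool_proportions_dl(events, totals, z=1.96):
    """Pool per-study proportions with the DerSimonian-Laird estimator.

    Assumption: proportions are pooled on the logit scale and then
    back-transformed. Returns (pooled, ci_low, ci_high, tau2).
    """
    if len(events) != len(totals) or len(events) < 2:
        raise ValueError("need matching counts from at least two studies")

    effects, variances = [], []
    for e, n in zip(events, totals):
        # Assumption: 0.5 continuity correction when a proportion is 0 or 1.
        if e == 0 or e == n:
            e, n = e + 0.5, n + 1.0
        effects.append(math.log(e / (n - e)))          # logit of the proportion
        variances.append(1.0 / e + 1.0 / (n - e))      # approximate variance of the logit

    # Fixed-effect (inverse-variance) weights and Cochran's Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))

    # DerSimonian-Laird moment estimate of between-study variance tau^2.
    df = len(effects) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    sum_w_re = sum(w_re)
    mu = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum_w_re
    se = math.sqrt(1.0 / sum_w_re)

    def expit(x):
        return 1.0 / (1.0 + math.exp(-x))

    return expit(mu), expit(mu - z * se), expit(mu + z * se), tau2


# Hypothetical true-positive counts and numbers of VTE-positive cases
# for three studies (not data from the review).
pooled, low, high, tau2 = pool_proportions_dl([45, 88, 190], [50, 100, 200])
print(f"pooled sensitivity {pooled:.3f} (95% CI {low:.3f}-{high:.3f}), tau^2 {tau2:.3f}")
```

Pooling on the logit scale keeps the back-transformed interval inside 0 to 1, which matters when estimates sit close to 1, as the specificity and NPV values here do.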