ABSTRACT

Background
Systematic reviews and meta-analyses are essential for informed research and policymaking, yet they are typically resource-intensive and time-consuming. Recent advances in artificial intelligence and machine learning offer opportunities to streamline these processes.

Objective
To make systematic reviews more efficient, we explored automating several of their stages with GPT-3.5 Turbo. We assessed the model’s performance by comparing its screening decisions against three expert-conducted reviews covering a combined dataset of 24,534 studies.

Methods
The model’s decisions were compared with those of the three expert reviews using a pseudo-K-folds permutation and a one-tailed ANOVA with an alpha level of 0.05. Accuracy, sensitivity, specificity, predictive values, F1-score, and the Matthews correlation coefficient were analyzed for two sets of prompts.

Results
Our approach reduced the systematic review process, which typically takes a year, to a few hours without sacrificing quality. In the initial screening phase, accuracy, specificity, and negative predictive value ranged between 80% and 95%. Sensitivity improved markedly in the second screening phase, demonstrating the model’s robustness when provided with more extensive data.

Conclusion
While ongoing refinements are needed, this tool represents a significant advance in research methodology, potentially making systematic reviews accessible to a wider range of researchers.

Impact Statement
Our manuscript presents a novel review screening protocol built using open-source frameworks, which significantly enhances the systematic review process in terms of efficiency and cost-effectiveness.
Leveraging the capabilities of GPT and embedding models, our protocol demonstrates the potential to transform a traditionally time-consuming and expensive task into an accelerated and economical operation, all while maintaining high standards of accuracy and reliability.

Key Points
GPT screening can streamline systematic reviews from a year-long, expensive process to just hours at minimal cost.
Validated across different topics, the protocol exhibits high reliability and consistency in study inclusion.
The AI-driven process reduces human bias, with prompt optimization considerably improving sensitivity.
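
For readers less familiar with the performance metrics named in the Methods, the following is a minimal sketch of how they can be derived from the four cells of a confusion matrix, treating the expert decision as the reference standard and "include" as the positive class. The names `ConfusionCounts` and `screening_metrics` are hypothetical and introduced only for illustration; this is not the code used in the study, and the counts in the usage note are made up.

```python
import math
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Screening decisions compared with the expert reference (hypothetical container)."""
    tp: int  # model includes, experts include
    fp: int  # model includes, experts exclude
    tn: int  # model excludes, experts exclude
    fn: int  # model excludes, experts include


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def screening_metrics(c: ConfusionCounts) -> dict:
    """Compute the standard binary-classification metrics from confusion counts."""
    total = c.tp + c.fp + c.tn + c.fn
    # Denominator of the Matthews correlation coefficient.
    mcc_denominator = math.sqrt(
        (c.tp + c.fp) * (c.tp + c.fn) * (c.tn + c.fp) * (c.tn + c.fn)
    )
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "sensitivity": _ratio(c.tp, c.tp + c.fn),  # recall on included studies
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),          # positive predictive value
        "npv": _ratio(c.tn, c.tn + c.fn),          # negative predictive value
        "f1": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
        "mcc": _ratio(c.tp * c.tn - c.fp * c.fn, mcc_denominator),
    }
```

Calling `screening_metrics(ConfusionCounts(tp=40, fp=10, tn=940, fn=10))` on these illustrative counts shows why accuracy, specificity, and negative predictive value can be high while sensitivity lags: when most screened studies are excluded, the true negatives dominate those three metrics. The Matthews correlation coefficient uses all four cells and is therefore less affected by this imbalance.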