Discovering new Gram-negative antibiotics has been a challenge for decades. This has been largely attributed to a limited understanding of the molecular descriptors governing Gram-negative permeation and efflux evasion. Herein, we address the contribution of efflux using a novel approach that applies multivariate analysis, machine learning, and structure-based clustering to some 4,500 molecules (actives) from a small-molecule screen in efflux-compromised
Escherichia coli. We employed principal-component analysis and trained two decision tree-based machine learning models to investigate descriptors contributing to the antibacterial activity and efflux susceptibility of these actives. This approach revealed that the Gram-negative activity of hydrophobic and planar small molecules with low molecular stability is limited to efflux-compromised E. coli. Furthermore, molecules with reduced branching and compactness showed increased susceptibility to efflux. Given these distinct properties that govern efflux, we developed the first efflux susceptibility machine learning model, called Susceptibility to Efflux Random Forest (SERF), as a tool to analyze the molecular descriptors of small molecules and predict in silico those that could be susceptible to efflux pumps. Here, SERF demonstrated high accuracy in identifying such molecules. In addition, we clustered all 4,500 actives based on their core structures and identified distinct clusters highlighting side-chain moieties that cause marked changes in efflux susceptibility. In all, our work reveals a role for physicochemical and structural parameters in governing efflux, presents a machine learning tool for rapid in silico analysis of efflux susceptibility, and provides a proof of principle for the potential of exploiting side-chain modification to design novel antimicrobials that evade efflux pumps.
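
As an illustration of the kind of model described above, the following minimal sketch trains a random forest classifier on molecular descriptors to predict efflux susceptibility. It is not the SERF implementation: the descriptor names, the synthetic data, and the labeling rule are assumptions made purely for demonstration, and the scikit-learn estimator stands in for whatever software was actually used.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical descriptor names, chosen only to echo the properties named in
# the abstract; they are not the descriptor set used to build SERF.
DESCRIPTORS = ["hydrophobicity", "planarity", "branching", "compactness"]


def make_synthetic_actives(n=600, seed=0):
    """Generate toy descriptor vectors and efflux-susceptibility labels."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, len(DESCRIPTORS)))
    # Assumed toy rule: hydrophobic, planar molecules with little branching
    # and low compactness are labeled efflux-susceptible (1), plus noise.
    score = X[:, 0] + X[:, 1] - X[:, 2] - X[:, 3]
    y = (score + rng.normal(scale=0.5, size=n) > 0).astype(int)
    return X, y


def train_efflux_classifier(X, y, seed=0):
    """Fit a random forest and return it with its held-out accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )
    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(X_train, y_train)
    return model, model.score(X_test, y_test)


X, y = make_synthetic_actives()
model, accuracy = train_efflux_classifier(X, y)

# Rank descriptors by their impurity-based importance in the fitted forest.
ranked = sorted(
    zip(DESCRIPTORS, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
print(f"held-out accuracy: {accuracy:.2f}")
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```

On real data, the descriptor matrix would be computed from the screened structures and the labels derived from the difference in activity between efflux-compromised and wild-type E. coli; the feature importances then give a first indication of which descriptors drive the prediction.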