abstract
- Parasitic worms are a significant cause of human and livestock disease. Combating the infections they cause requires exploring numerous potential drug candidates. One approach to screening for new candidates is to test natural product extracts on the nematode C. elegans as a model organism. A critical step in this process is examining microscopy images of C. elegans after exposure to the extracts. Automatic image classification accelerates this analysis compared with purely visual identification by an expert. We report a new C. elegans image dataset that includes 12,717 microscopy images corresponding to natural product extracts, with about one-third of the images labeled by an expert and the remainder unlabeled. We also propose a two-stage semi-supervised Mix-up Barlow Twins Nematode Classifier (MBT-NC) to solve three image classification tasks involving nematode phenotypes after exposure to the studied natural extracts. MBT-NC combines self-supervised learning (SSL) for the feature representation stage (MBT) with a supervised classification stage (NC). In MBT, we use augmented and linearly interpolated samples for information maximization. Our method outperforms fully supervised and other self-supervised methods on all three classification tasks: for binary, six-class, and 27-class classification, it improves test accuracy by 3.2%, 1.0%, and 2.2%, respectively, over the other methods. This is a new line of research in computer vision applications in healthcare. We have made the dataset public for further development; researchers can obtain access through a simple request form at https://docs.google.com/forms/d/e/1FAIpQLSc0kb3mbMvfrLEAhBAoMbbbNkvNyf1Qf7nyOCSHfTqs0eEb3w/viewform.
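To make the two ingredients named for the MBT stage concrete, the following is a minimal NumPy sketch of a standard Barlow Twins redundancy-reduction loss and of linear (mix-up) interpolation between samples. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify how MBT-NC combines the two, and the function names `standardize`, `barlow_twins_loss`, and `mixup`, as well as the `off_diag_weight` value, are hypothetical choices made for this example.

```python
import numpy as np


def standardize(z, eps=1e-9):
    """Zero-mean, unit-variance normalization of each feature over the batch."""
    return (z - z.mean(axis=0)) / (z.std(axis=0) + eps)


def barlow_twins_loss(z_a, z_b, off_diag_weight=5e-3):
    """Standard Barlow Twins objective on two embedding batches of shape (n, d).

    off_diag_weight is an illustrative value (assumption), not taken from the source.
    """
    n = z_a.shape[0]
    # Cross-correlation matrix between the features of the two views, shape (d, d).
    c = standardize(z_a).T @ standardize(z_b) / n
    diag = np.diag(c)
    # Invariance term: push the diagonal entries toward 1.
    on_diag = np.sum((1.0 - diag) ** 2)
    # Redundancy-reduction term: push the off-diagonal entries toward 0.
    off_diag = np.sum(c ** 2) - np.sum(diag ** 2)
    return on_diag + off_diag_weight * off_diag


def mixup(x_a, x_b, lam, perm):
    """Linearly interpolate x_a with a batch-permuted x_b using coefficient lam."""
    return lam * x_a + (1.0 - lam) * x_b[perm]
```

In a training loop, `x_a` and `x_b` would be two augmented views of the same image batch, `perm` a random permutation of the batch indices, and `lam` a coefficient in [0, 1]; the embeddings passed to `barlow_twins_loss` would come from an encoder that is not shown here.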