AI/Machine Learning with confidence in Drug Discovery

This project aims to develop computational methods, tools and AI models to aid the drug discovery process, with a special emphasis on confidence in predictions.

Methods include ligand-based and structure-based approaches such as QSAR (machine learning) and docking, with applications including prediction of drug safety, toxicology, interactions, target profiles and secondary pharmacology. To analyze large-scale data we use modern e-infrastructure such as high-performance computing clusters, cloud computing resources, containerized microservice environments such as Kubernetes, and data analytics platforms such as Apache Spark.

Figure: Data is extracted from various data sources, and we use high-performance computing, cloud computing, workflows and big data frameworks to train predictive models, which are deployed and served in microservice environments via interoperable APIs and easy-to-use GUIs.

We also use and develop scientific workflow systems such as SciLuigi, SciPipe, and Pachyderm to automate and streamline analysis. The work is carried out in collaboration with AstraZeneca R&D and SweTox. We are strong promoters of open science and try to publish all data and models online.

Site-of-metabolism prediction

This project aims to develop methods for predicting sites of metabolism and metabolites based on chemical structure. Using data mining techniques, we have developed the tool MetaPrint2D for site-of-metabolism prediction. The project aims to improve these models and to predict putative metabolites. The work is carried out in close collaboration with AstraZeneca R&D, and models and tools are available from the Bioclipse workbench.

Figure: Prediction of site-of-metabolism with the MetaPrint2D method in Bioclipse.