Predicting protein network topology clusters from chemical structure using deep learning
Published: 2022-07-15
Formatted citation
Sreenivasan AP, Harrison PJ, Schaal W, Matuszewski DJ, Kultima K, and Spjuth O.
Predicting protein network topology clusters from chemical structure using deep learning.
Journal of Cheminformatics.
14, 47 (2022).
DOI: 10.1186/s13321-022-00622-7
Abstract
Comparing chemical structures to infer protein targets and functions is a common approach, but basing comparisons on chemical similarity alone can be misleading. Here we present a methodology for predicting target protein clusters using deep neural networks. The model is trained on clusters of compounds based on similarities calculated from combined compound-protein and protein-protein interaction data using a network topology approach. We compare several deep learning architectures including both convolutional and recurrent neural networks. The best performing method, the recurrent neural network architecture MolPMoFiT, achieved an F1 score approaching 0.9 on a held-out test set of 8907 compounds. In addition, in-depth analysis on a set of eleven well-studied chemical compounds with known functions showed that predictions were justifiable for all but one of the chemicals. Four of the compounds, similar in their molecular structure but with dissimilarities in their function, revealed advantages of our method compared to using chemical similarity.
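For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how an F1 score can be computed when each compound is assigned to one of several clusters. The abstract does not state which averaging scheme was used, so the macro average shown here is an assumption. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label seen in y_true or y_pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1      # correct cluster assignment
        else:
            fp[p] += 1      # predicted cluster gains a false positive
            fn[t] += 1      # true cluster gains a false negative
    scores = {}
    for label in set(y_true) | set(y_pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Toy data only: six compounds, three hypothetical cluster labels.
true_clusters = [0, 0, 1, 1, 2, 2]
pred_clusters = [0, 0, 1, 2, 2, 2]
print(per_class_f1(true_clusters, pred_clusters))
print(macro_f1(true_clusters, pred_clusters))
```

With macro averaging every cluster contributes equally regardless of its size. A micro or support-weighted average would instead favour the larger clusters, so the choice matters when cluster sizes are imbalanced.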