A Unified Proteochemometric Model for Prediction of Inhibition of Cytochrome P450 Isoforms
Published: 2013-06-17
Formatted citation
Lapins M, Worachartcheewan A, Spjuth O, Georgiev V, Prachayasittikul V, Nantasenamat C, Wikberg JES. A Unified Proteochemometric Model for Prediction of Inhibition of Cytochrome P450 Isoforms. PLoS One. 2013;8(6):e66566.
DOI: 10.1371/journal.pone.0066566
Abstract
A unified proteochemometric (PCM) model for predicting the ability of drug-like chemicals to inhibit five major drug-metabolizing CYP isoforms (i.e., CYP1A2, CYP2C9, CYP2C19, CYP2D6 and CYP3A4) was created and made publicly available under the Bioclipse Decision Support open-source system at www.cyp450model.org. For the proteochemometric modeling, we represented the chemical compounds by molecular signature descriptors and the CYP isoforms by an alignment-independent description of the composition and transition of amino acid properties in their primary protein sequences. The entire training dataset contained 63 391 interactions, and the best PCM model was obtained using signature descriptors of heights 1, 2 and 3 and inducing the model with a support vector machine. The model showed excellent predictive ability, with an internal AUC = 0.923 and an external AUC = 0.940, the latter evaluated on a large external dataset. An advantage of PCM models is their extensibility, which makes it possible to extend our model to new CYP isoforms and polymorphic CYP forms. A key benefit of PCM is that all proteins are covered by one single model, which generally makes it more stable and predictive than single-target models. The inclusion of the model in Bioclipse Decision Support makes it possible to obtain virtually instantaneous predictions (∼100 ms per prediction) while interactively drawing or modifying chemical structures in the Bioclipse chemical structure editor.
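The protein description mentioned in the abstract (composition and transition of amino acid properties, computed without a sequence alignment) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the three-class hydrophobicity grouping, the function names, and the idea of forming one PCM input row by simply concatenating a compound feature vector with the protein descriptor. The actual model used molecular signature descriptors for the compounds and a support vector machine, neither of which is reproduced here.

```python
from itertools import combinations

# Assumed three-class hydrophobicity grouping of the 20 standard amino acids.
# This grouping is illustrative; the paper's exact property classes may differ.
GROUPS = {
    "polar": set("RKEDQN"),
    "neutral": set("GASTPHY"),
    "hydrophobic": set("CLVIMFW"),
}


def residue_class(residue):
    """Return the name of the property class that a residue belongs to."""
    for name, members in GROUPS.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def composition(seq):
    """Fraction of residues falling into each property class."""
    classes = [residue_class(r) for r in seq]
    return {name: classes.count(name) / len(classes) for name in GROUPS}


def transition(seq):
    """Fraction of adjacent residue pairs that switch between two classes."""
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two residues")
    classes = [residue_class(r) for r in seq]
    pairs = list(zip(classes, classes[1:]))
    result = {}
    for a, b in combinations(GROUPS, 2):
        # Count transitions in both directions (a to b and b to a).
        n = sum(1 for p in pairs if p in ((a, b), (b, a)))
        result[f"{a}<->{b}"] = n / len(pairs)
    return result


def protein_descriptor(seq):
    """Fixed-length, alignment-independent descriptor of a sequence."""
    comp = composition(seq)
    trans = transition(seq)
    return list(comp.values()) + list(trans.values())


def pcm_row(compound_features, seq):
    """Hypothetical PCM input row: compound features followed by protein features."""
    return list(compound_features) + protein_descriptor(seq)


if __name__ == "__main__":
    # Toy inputs: a made-up binary compound fingerprint and a made-up peptide.
    print(pcm_row([1, 0, 1], "AKLVGDE"))
```

Because the descriptor has the same length for any sequence, rows for different isoforms can be stacked into one training matrix, which is what allows a single model to cover several proteins and to be extended with new ones.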