Modelling ligand selectivity of Serine Proteases using integrative Proteochemometric approaches improves model performance and allows the multi-target dependent interpretation of features
Authors
Ain, Qurrat U
Méndez-Lucio, Oscar
Cortés-Ciriano, Isidro
Malliavin, Thérèse
van Westen, Gerard JP
Publication Date
2014-09-16
Journal Title
Integrative Biology
ISSN
1757-9694
Publisher
Royal Society of Chemistry
Volume
6
Pages
1023-1033
Language
English
Type
Article
Citation
Ain, Q. U., Méndez-Lucio, O., Cortés-Ciriano, I., Malliavin, T., van Westen, G. J. P., & Bender, A. (2014). Modelling ligand selectivity of Serine Proteases using integrative Proteochemometric approaches improves model performance and allows the multi-target dependent interpretation of features. Integrative Biology, 6, 1023-1033. https://doi.org/10.1039/C4IB00175C
Abstract
Serine proteases, implicated in important physiological functions, have a high intra-family similarity, which leads to unwanted off-target effects of inhibitors with insufficient selectivity. However, the availability of sequence and structure data has now made it possible to develop approaches to design pharmacological agents that can discriminate successfully between their related binding sites. In this study, we have quantified the relationship between 12,625 distinct protease inhibitors and their bioactivity against 67 targets of the Serine Protease family (20,213 data points) in an integrative manner, using proteochemometric modelling (PCM). The benchmarking of 21 different target descriptors motivated the usage of specific binding pocket amino acid descriptors, which helped in the identification of active site residues and selective compound chemotypes affecting compound affinity and selectivity. PCM models performed better than alternative approaches (model trained using exclusively compound descriptors on all available data, QSAR) employed for comparison with R²/RMSE values of 0.64 ± 0.23/0.66 ± 0.20 vs. 0.35 ± 0.27/1.05 ± 0.27 log units, respectively. Moreover, the interpretation of the PCM model singled out various chemical substructures responsible for bioactivity and selectivity towards particular proteases (Thrombin, Trypsin and Coagulation Factor 10) in agreement with the literature. For instance, absence of primary sulphonamide was identified as responsible for decreased selective activity (by on average 0.27 ± 0.65 pChEMBL units) on FA10. Among the binding pocket residues, the amino acids (Arginine, Leucine and Tyrosine) at positions 35, 39, 60, 93, 140 and 207 were observed as key contributing residues for selective affinity on these three targets.
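The abstract contrasts PCM, which models compound and target descriptors together, with QSAR, which uses compound descriptors only, and compares them by R² and RMSE. The following is a minimal illustrative sketch of those two ideas, not the authors' implementation: the function names, the toy descriptor values and the toy activity values are all hypothetical, and the R² shown is the common coefficient-of-determination form, which may differ from the exact definition used in the article.

```python
import math

def pcm_features(compound_desc, target_desc):
    """Build one PCM input row by concatenating compound and target descriptors.
    (Hypothetical helper; a QSAR row would use compound_desc alone.)"""
    return list(compound_desc) + list(target_desc)

def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted activities."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy values, invented purely for illustration.
compound = [1, 0, 1, 1]   # e.g. substructure fingerprint bits
pocket = [0.2, -1.3]      # e.g. binding pocket amino acid descriptors
row = pcm_features(compound, pocket)

y_true = [6.0, 7.0, 8.0]  # observed activities (log units)
y_pred = [6.5, 7.0, 7.5]  # predictions from some fitted model
print(row, r_squared(y_true, y_pred), rmse(y_true, y_pred))
```

Because each PCM row carries target information, a single model can in principle learn how the same compound behaves across related proteases, which is what makes selectivity interpretable at the feature level.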
Sponsorship
Q.A. thanks the Islamic Development Bank and Cambridge Commonwealth Trust for funding. O.M.L. is grateful to CONACyT (No. 217442/312933) and the Cambridge Overseas Trust for funding. G.v.W. thanks EMBL 90 (EIPOD) and Marie Curie (COFUND) for funding. A.B. thanks Unilever and the ERC (Starting Grant RC-2013-StG 336159 MIXTURE) for funding. I.C.C. thanks the Institut Pasteur and the Pasteur-Paris International PhD programme for funding. T.M. thanks the Institut Pasteur for funding.
Funder references
European Research Council (336159)
Identifiers
External DOI: https://doi.org/10.1039/C4IB00175C
This record's URL: https://www.repository.cam.ac.uk/handle/1810/246000
Rights
Attribution 2.0 UK: England & Wales
Licence URL: http://creativecommons.org/licenses/by/2.0/uk/