Bioactivity assessment of natural compounds using machine learning models trained on target similarity between drugs.
Authors
Patil, Kaustubh Raosaheb
Publication Date
2022-04
Journal Title
PLoS Comput Biol
ISSN
1553-734X
Publisher
Public Library of Science (PLoS)
Volume
18
Issue
4
Language
eng
Type
Article
This Version
VoR
Citation
Periwal, V., Bassler, S., Andrejev, S., Gabrielli, N., Patil, K. R., Typas, A., & Patil, K. R. (2022). Bioactivity assessment of natural compounds using machine learning models trained on target similarity between drugs. PLoS Comput Biol, 18 (4). https://doi.org/10.1371/journal.pcbi.1010029
Abstract
Natural compounds constitute a rich resource of potential small-molecule therapeutics. While experimental access to this resource is limited by its vast diversity and by difficulties in systematic purification, computational assessment of structural similarity with known therapeutic molecules offers a scalable approach. Here, we assessed functional similarity between natural compounds and approved drugs by combining multiple chemical similarity metrics and physicochemical properties using a machine-learning approach. We computed pairwise similarities between 1410 drugs for training classification models and used the drugs' shared protein targets as class labels. The best-performing models were random forests, which gave an average area under the ROC curve of 0.9, a Matthews correlation coefficient of 0.35, and an F1 score of 0.33, suggesting that they captured the structure-activity relation well. The models were then used to predict protein targets of circa 11k natural compounds by comparing them with the drugs. This revealed the therapeutic potential of several natural compounds, including those with support from previously published sources as well as those hitherto unexplored. We experimentally validated the activity of one of the predicted pairs, viz., Cox-1 inhibition by 5-methoxysalicylic acid, a molecule commonly found in tea, herbs and spices. In contrast, another natural compound, 4-isopropylbenzoic acid, which had the highest similarity score under the most heavily weighted similarity metric but was not picked by our models, did not inhibit Cox-1. Our results demonstrate the utility of a machine-learning approach combining multiple chemical features for uncovering the protein-binding potential of natural compounds.
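The training setup described in the abstract (pairwise drug similarities as features, shared protein targets as class labels) can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not say which similarity metrics were used, so a single Tanimoto coefficient on toy fingerprint bit sets stands in for the multiple metrics and physicochemical properties, and all drug names, fingerprints, target annotations, and function names below are hypothetical.

```python
from itertools import combinations

# Toy inputs. Every name and value here is hypothetical, not taken from the paper.
# Each fingerprint is modelled as the set of "on" bit positions.
fingerprints = {
    "drug_a": {1, 2, 3, 4},
    "drug_b": {2, 3, 4, 5},
    "drug_c": {7, 8, 9},
}
# Known protein targets per drug (illustrative annotations only).
targets = {
    "drug_a": {"PTGS1", "PTGS2"},
    "drug_b": {"PTGS1"},
    "drug_c": {"ACE"},
}


def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient between two sets of fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def build_pairs(fingerprints, targets):
    """Return one (drug_x, drug_y, similarity, label) row per unordered drug pair.

    The label is 1 when the two drugs share at least one protein target, else 0.
    """
    rows = []
    for x, y in combinations(sorted(fingerprints), 2):
        similarity = tanimoto(fingerprints[x], fingerprints[y])
        label = int(bool(targets[x] & targets[y]))
        rows.append((x, y, similarity, label))
    return rows


pairs = build_pairs(fingerprints, targets)
for row in pairs:
    print(row)
```

In a fuller version, each row would carry several similarity and property features rather than one, and the labelled rows would be passed to a classifier such as a random forest.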
Keywords
Machine Learning, Protein Binding, Proteins
Sponsorship
EMBL Interdisciplinary Postdoc (EI3POD) program under Marie Skłodowska-Curie actions (664726)
Medical Research Council (MC_UU_00025/11)
Joachim Herz Stiftung (Fellowship for Interdisciplinary Science)
Identifiers
35468126, PMC9071136
External DOI: https://doi.org/10.1371/journal.pcbi.1010029
This record's URL: https://www.repository.cam.ac.uk/handle/1810/337530