A simple spatial extension to the extended connectivity interaction features for binding affinity prediction.
Published version
Peer-reviewed
Authors
Orhobor, Oghenejokpeme I https://orcid.org/0000-0003-1178-611X
Rehim, Abbi Abdel
Lou, Hang
Ni, Hao
King, Ross D https://orcid.org/0000-0001-7208-4387
Abstract
The representation of the protein-ligand complexes used in building machine learning models plays an important role in the accuracy of binding affinity prediction. The Extended Connectivity Interaction Features (ECIF) is one such representation. We report that (i) including the discretized distances between protein-ligand atom pairs in the ECIF scheme improves predictive accuracy, and (ii) in an evaluation using gradient boosted trees, the resampling method used to select the best hyperparameters has a strong effect on predictive performance, especially for benchmarking purposes.
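As a rough illustration of point (i), the sketch below counts protein-ligand atom pairs keyed by atom type and by a discretized distance bin. It is a minimal sketch under stated assumptions, not the authors' implementation: the bin edges, the use of plain element symbols as atom types, and the names `BIN_EDGES`, `distance_bin`, and `binned_pair_counts` are all hypothetical and do not come from this record.

```python
from collections import Counter
from math import dist

# Hypothetical bin edges in angstroms; this record does not state the bins used.
BIN_EDGES = [0.0, 2.0, 4.0, 6.0]


def distance_bin(d, edges=BIN_EDGES):
    """Return the index of the half-open bin [edges[i], edges[i+1]) that
    contains d, or None if d falls outside every bin."""
    for i in range(len(edges) - 1):
        if edges[i] <= d < edges[i + 1]:
            return i
    return None


def binned_pair_counts(protein_atoms, ligand_atoms, edges=BIN_EDGES):
    """Count (protein type, ligand type, distance bin) triples.

    Each atom is a (type_label, (x, y, z)) tuple. Type labels here are plain
    element symbols, a simplification made for this example only.
    """
    counts = Counter()
    for p_type, p_xyz in protein_atoms:
        for l_type, l_xyz in ligand_atoms:
            b = distance_bin(dist(p_xyz, l_xyz), edges)
            if b is not None:  # pairs beyond the last edge are ignored
                counts[(p_type, l_type, b)] += 1
    return counts


# Toy coordinates, invented for illustration.
protein = [("C", (0.0, 0.0, 0.0)), ("N", (3.0, 0.0, 0.0))]
ligand = [("O", (1.0, 0.0, 0.0)), ("C", (0.0, 5.0, 0.0))]
features = binned_pair_counts(protein, ligand)
```

Each key in `features` would correspond to one column of a feature vector, so adding distance bins multiplies the number of pair-type features by the number of bins.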
Description
Peer reviewed: True
Keywords
machine learning, protein binding affinity prediction, scoring functions
Journal Title
R Soc Open Sci
Journal ISSN
2054-5703
Publisher
The Royal Society
Sponsorship
Engineering and Physical Sciences Research Council (EP/R022925/1)