Confidence in Inactive and Active Predictions from Structural Alerts.

Accepted version
Peer-reviewed

Type

Article

Authors

Goodman, Jonathan M  https://orcid.org/0000-0002-8693-9136
Gutsell, Steve 
Kukic, Predrag 

Abstract

Having a measure of confidence in computational predictions of biological activity from in silico tools is vital when making predictions for new chemicals, for example, in chemical risk assessment. Where predictions of biological activity are used as an indicator of a potential hazard, false-negative predictions are the most concerning; however, assigning confidence to inactive predictions is particularly challenging. How can one confidently identify the absence of activating features? In this study, we present methods for assigning confidence to both active and inactive predictions from structural alerts for protein-binding molecular initiating events (MIEs). Structural alerts were derived through an iterative statistical method. Confidence in the activity predictions is assigned by measuring the Tanimoto similarity between the Morgan fingerprints of chemicals in the test set and those of relevant chemicals in the training set, and suitable cutoff values have been defined to give different confidence categories. To avoid a potential compound-series bias in the test set, which would lead to overestimating the performance of the method, we measured the biological activity of 27 compounds against 24 proteins, which gave an additional 648 experimental measurements; many of these measurements were not previously available in the literature or in databases. This data set was complemented with newly measured biological activities published in ChEMBL25 to form a combined independent validation data set. Applying the confidence categories to the computational predictions for the new data identifies the chemicals for which one should be confident of either an inactive or an active prediction, allowing model predictions to be used responsibly.
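
The similarity-based confidence assignment described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Fingerprints are represented here as plain sets of on-bit indices (in practice they would come from a cheminformatics toolkit), the function names are hypothetical, the cutoff values 0.7 and 0.4 are placeholders rather than the values defined in the study, and taking the maximum similarity to the training compounds is an assumed way of comparing against "relevant chemicals in the training set".

```python
from typing import Iterable, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def confidence_category(
    query: Set[int],
    training: Iterable[Set[int]],
    high_cutoff: float = 0.7,  # hypothetical cutoff, not from the study
    low_cutoff: float = 0.4,   # hypothetical cutoff, not from the study
) -> str:
    """Label a prediction by the query's nearest-neighbour similarity (an assumption)."""
    best = max((tanimoto(query, fp) for fp in training), default=0.0)
    if best >= high_cutoff:
        return "high"
    if best >= low_cutoff:
        return "medium"
    return "low"


# Toy fingerprints standing in for Morgan fingerprints of training compounds.
training_set = [{1, 2, 3, 4}, {2, 3, 5, 8}]
print(confidence_category({1, 2, 3, 4, 9}, training_set))
```

A query that shares most of its bits with a training compound falls in the higher category, while a query with no close neighbour in the training set is flagged as low confidence whether the underlying prediction is active or inactive.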

Keywords

Databases, Factual, Molecular Structure, Organic Chemicals, Proteins

Journal Title

Chem Res Toxicol

Journal ISSN

0893-228X
1520-5010

Volume Title

33

Publisher

American Chemical Society (ACS)

Rights

All rights reserved

Sponsorship

Unilever