Modelling compound cytotoxicity using conformal prediction and PubChem HTS data

Accepted version
Peer-reviewed

Type

Article

Authors

Norinder, U 

Abstract

The assessment of compound cytotoxicity is an important part of the drug discovery process. Accurate predictions of cytotoxicity have the potential to expedite decision-making and save considerable time and effort. In this work we apply class-conditional conformal prediction to model the cytotoxicity of compounds based on 16 high-throughput cytotoxicity assays from PubChem. The data span 16 cell lines and comprise more than 440 000 unique compounds. The data sets are heavily imbalanced, with only 0.8% of the tested compounds being cytotoxic. We trained one classification model for each cell line and validated the performance with respect to validity and accuracy. The generated models deliver high-quality predictions for both toxic and non-toxic compounds despite the imbalance between the two classes. On external data collected from the same assay provider as one of the investigated cell lines, the model had a sensitivity of 74% and a specificity of 65% at the 80% confidence level among the compounds assigned to a single class. Compared to previous approaches for large-scale cytotoxicity modelling, this represents a balanced performance in the prediction of the toxic and non-toxic classes. The conformal prediction framework also lets the modeller control the error frequency of the predictions, so that cytotoxicity outcomes can be predicted with a stated level of confidence.
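
The abstract gives no implementation details, so the following is only a minimal sketch of how class-conditional (Mondrian) inductive conformal prediction can work in general, not the authors' code. It assumes each compound already has a nonconformity score per class (for example, one minus a classifier's predicted class probability). All function names and the toy numbers are hypothetical. Each class is calibrated only against calibration compounds of that same class, which is what keeps the error rate controlled separately for the rare toxic class and the abundant non-toxic class.

```python
import numpy as np


def mondrian_p_values(cal_scores, cal_labels, test_scores):
    """Class-conditional p-values for every test object and candidate class.

    cal_scores  : nonconformity of each calibration object for its TRUE class
    cal_labels  : true class index (0 = non-toxic, 1 = toxic) per calibration object
    test_scores : array (n_test, n_classes), nonconformity under each candidate class
    """
    cal_scores = np.asarray(cal_scores, dtype=float)
    cal_labels = np.asarray(cal_labels)
    test_scores = np.asarray(test_scores, dtype=float)
    n_test, n_classes = test_scores.shape
    p = np.empty((n_test, n_classes))
    for c in range(n_classes):
        # Compare only against calibration objects of the same class.
        ref = cal_scores[cal_labels == c]
        # Count calibration scores at least as nonconforming as the test score.
        ge = (ref[None, :] >= test_scores[:, [c]]).sum(axis=1)
        p[:, c] = (ge + 1) / (ref.size + 1)
    return p


def prediction_sets(p_values, significance=0.2):
    """Class c is in the set when its p-value exceeds the significance level.

    significance = 0.2 corresponds to the 80% confidence level.
    """
    return np.asarray(p_values) > significance


def single_class_metrics(sets, y_true):
    """Sensitivity and specificity among objects assigned exactly one class."""
    sets = np.asarray(sets, dtype=bool)
    y_true = np.asarray(y_true)
    single = sets.sum(axis=1) == 1
    y_pred = sets[single].argmax(axis=1)  # index of the only class in the set
    y = y_true[single]

    def rate(cls):
        mask = y == cls
        return float(np.mean(y_pred[mask] == cls)) if mask.any() else float("nan")

    return rate(1), rate(0)  # (sensitivity, specificity)


# Toy data (hypothetical): five calibration compounds per class.
cal_scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.20, 0.35, 0.45, 0.55, 0.70]
cal_labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
test_scores = [[0.90, 0.12], [0.12, 0.90], [0.60, 0.40], [0.12, 0.40], [0.90, 0.90]]
y_true = [1, 0, 0, 1, 1]

p = mondrian_p_values(cal_scores, cal_labels, test_scores)
sets = prediction_sets(p, significance=0.2)
sensitivity, specificity = single_class_metrics(sets, y_true)
```

In this toy run the first three compounds receive a single class, the fourth receives both classes, and the fifth receives an empty set. Only the single-class predictions enter the sensitivity and specificity calculation, mirroring how the abstract reports its external validation figures.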

Keywords

0801 Artificial Intelligence and Image Processing, Generic Health Relevance

Journal Title

Toxicology Research

Journal ISSN

2045-452X
2045-4538

Volume

6

Publisher

Royal Society of Chemistry

Sponsorship

FS acknowledges the Swedish Pharmaceutical Society for financial support.