Show simple item record

dc.contributor.author: Trapotsi, Maria-Anna
dc.date.accessioned: 2022-04-12T12:03:03Z
dc.date.available: 2022-04-12T12:03:03Z
dc.date.submitted: 2021-12-24
dc.identifier.uri: https://www.repository.cam.ac.uk/handle/1810/336023
dc.description.abstract: Understanding a compound’s biological effects, such as its Mechanism of Action (MoA) and safety profile, is a challenging task in the drug discovery process. However, this understanding can facilitate drug discovery and provide an early warning of potential risks. It has been significantly advanced by progress in Machine Learning (ML) and bioinformatic approaches, and by the increasing deposition of high-throughput data in public databases. Different types of data can be used, and as the volume of these data increases, so too does their potential to deepen our understanding. Key questions therefore remain around which ML methodologies and which data types to use. The aim of this thesis was to address two questions: which data and methods to use for understanding compounds’ MoA, and how to explore the safety profile of new data modalities such as PROteolysis TArgeting Chimeras (PROTACs). In the first chapter, “Probabilistic Random Forest improves bioactivity predictions close to the classification threshold by taking into account experimental uncertainty”, a novel algorithm was evaluated and benchmarked. A limiting factor in target prediction for MoA understanding is the experimental variability in the bioactivity data used to train target prediction models. Comparing this novel algorithm, a modification of the long-established Random Forest (RF), with the classic RF identified a benefit in the prediction of compounds that are close to the classification threshold. The next chapter, “Comparison of Structural Chemical and Cell Morphology Information for Multitask Bioactivity Predictions”, provided insights into which type of compound information is more useful for target prediction across 224 targets. The comparison was performed using cell morphology information (in the form of CellProfiler features) from a Cell Painting assay and chemical structure information in the form of Extended Connectivity Fingerprints. It revealed that some targets, such as β-catenin, were better predicted by cell morphology information, while others, such as proteins belonging to the G-protein-Coupled Receptor 1 family, were better predicted by chemical structure information. The final chapter, “Mitochondrial Toxicity Prediction using Cell Painting Assay on a PROTACs dataset”, explored the successful profiling of a novel data modality (PROTACs) with the Cell Painting assay and evaluated whether this profiling can be used to understand the safety of those novel compounds. Cell morphology features (in the form of CellProfiler features) successfully predicted mitochondrial toxicity in a PROTACs dataset. This work resulted in the first ML model to predict PROTACs’ mitochondrial toxicity using Cell Painting-based features and expanded our knowledge of PROTACs’ safety profile prediction. In summary, the work described in this thesis has furthered the field of in-silico target deconvolution and PROTACs’ mitochondrial toxicity prediction. Firstly, it showed that there is a benefit to using Probabilistic Random Forest when there is a degree of experimental uncertainty in bioactivity data close to the classification threshold. In addition, it highlighted targets for which compounds’ cell morphology information was beneficial for target prediction and, finally, showed that PROTACs’ cell morphology information can be used for mitochondrial toxicity prediction.
dc.description.sponsorship: BBSRC and AstraZeneca
dc.rights: All Rights Reserved
dc.rights.uri: https://www.rioxx.net/licenses/all-rights-reserved/
dc.subject: mechanism of action
dc.subject: target prediction
dc.subject: cell painting
dc.subject: PROTAC
dc.subject: PROTACs
dc.title: Using Heterogeneous Information Sources for Understanding and Predicting Biological Effects of Compounds
dc.type: Thesis
dc.type.qualificationlevel: Doctoral
dc.type.qualificationname: Doctor of Philosophy (PhD)
dc.publisher.institution: University of Cambridge
dc.date.updated: 2022-04-11T15:30:40Z
dc.identifier.doi: 10.17863/CAM.83454
rioxxterms.licenseref.uri: https://www.rioxx.net/licenses/all-rights-reserved/
rioxxterms.type: Thesis
dc.publisher.college: Hughes Hall
pubs.funder-project-id: Biotechnology and Biological Sciences Research Council (1944644)
cam.supervisor: Bender, Andreas
cam.supervisor: Engkvist, Ola
cam.supervisor: Barrett, Ian
cam.depositDate: 2022-04-11
pubs.licence-identifier: apollo-deposit-licence-2-1
pubs.licence-display-name: Apollo Repository Deposit Licence Agreement

