
Understanding compound-induced histopathology in rat liver using gene expression network methods






Alexander-Dann, Benjamin


Drug discovery is currently a lengthy and costly pipeline: it takes between twelve and fifteen years and costs $1-2 billion (USD). Any compound failure therefore represents a sunk cost, which is exacerbated if the failure occurs later in the pipeline. Compound- and drug-induced liver injury is a significant cause of such failures. Current progress in tackling this is based on systems biology, and so falls within the field of toxicogenomics. Databases in the public domain are therefore crucial to progress. DrugMatrix and Open TG-GATEs were identified as containing large in vivo transcriptomics datasets with histopathological observations as endpoints, a proxy for toxicity. Given the size and chemical variety of the databases, data-driven methods were used to create novel histopathology signatures, which account for dependence between histopathology observations (Chapter 2). Six toxic groups were determined for DrugMatrix and 13 for Open TG-GATEs, and these were analysed with a view to enabling classification, namely what the gene expression profiles reveal when the histopathology phenotype is present. This led to the determination of gene-phenotype associations, both known and novel. An example of a novel association was the match of the histopathology signature of ‘glycogen accumulation, mixed infiltration and lymphocytic inflammatory cell infiltration’ to fructose metabolism, gluconeogenesis, and chemokine response pathways (Chapter 3). Concordance was found between histopathologically related toxicity groups across databases and their co-expression networks. From here, the co-expression network methods were applied to determine the concordance of gene expression across time (one day to four/five days), database (DrugMatrix and Open TG-GATEs), and toxicity group.
This identified underlying biological terms such as RNA transport, ribosome biogenesis and translation, as well as toxicity-specific terms (aminoacyl-tRNA synthesis in metabolic processes for the histopathological observations of glycogen accumulation, cellular infiltrate, hepatocellular necrosis and fatty change in liver). Crucially, this work determined that toxic group membership plays a more significant role in gene co-expression networks than the time point of the gene expression measurement (Chapter 4). In conclusion, data-driven clustering was performed to create histopathology signatures. Using these, the usefulness of transcriptomics data was determined both for classifying toxic state (gene expression data measured when the phenotype was present) and for determining how consistent it is over time scales. This work provides a framework for the comparison of co-expression networks for the deconvolution of gene expression data with respect to a phenotype.





Bender, Andreas


gene expression, toxicogenomics, drug development, predictive toxicology, computational chemistry, network methods


Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge
EPSRC (1827220)