Cheminformatics for genome-scale metabolic reconstructions
Genome-scale metabolic reconstructions are an important resource in the study of metabolism. They provide both a system-level and a component-level view of the biochemical transformations of metabolites. As more reconstructions are created, integrating and reasoning about their contents remains a challenge. This thesis focuses on the development of computational methods to allow on-demand comparison and alignment of metabolic reconstructions.
A novel method is introduced that utilises chemical structure representations to identify equivalent metabolites between reconstructions. Using a graph theoretic representation allows metabolites that have a non-exact match to be identified and reasoned about. A key advantage is that the method uses the contents of reconstructions directly and does not rely on the creation or use of a common reference.
To annotate reconstructions with chemical structure representations, an interactive desktop application is introduced. The application assists in the creation and curation of metabolic information using manual, semi-automated, and automated methods. Chemical structure representations can be retrieved, drawn, or generated to allow precise metabolite annotation.
Processing chemical information requires efficient and optimised algorithms. Several areas are addressed and implementations have been contributed to the Chemistry Development Kit. Rings are a fundamental property of chemical structures; therefore, multiple ring definitions and fast algorithms are explored. Conversion and standardisation between structure representations present a challenge. Efficient algorithms to determine aromaticity, assign a Kekulé form, and generate tautomers are detailed.
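As a small illustration of why rings lend themselves to graph algorithms, the number of independent rings in a structure can be computed from its molecular graph as bonds minus atoms plus connected components (the circuit rank). The sketch below is not the thesis's implementation and does not use the Chemistry Development Kit; the function name `circuit_rank` and the hand-written bond lists are assumptions made for illustration only.

```python
from collections import defaultdict


def circuit_rank(num_atoms, bonds):
    """Number of independent rings: bonds - atoms + connected components.

    `bonds` is a list of (atom_index, atom_index) pairs; this is a
    hypothetical input format, not a Chemistry Development Kit type.
    """
    adjacency = defaultdict(set)
    for a, b in bonds:
        adjacency[a].add(b)
        adjacency[b].add(a)

    # Count connected components with an iterative depth-first search.
    seen = set()
    components = 0
    for start in range(num_atoms):
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            atom = stack.pop()
            for neighbour in adjacency[atom]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)

    return len(bonds) - num_atoms + components


# Illustrative heavy-atom graphs, atoms indexed from 0.
propane = [(0, 1), (1, 2)]
benzene = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
naphthalene = benzene + [(4, 6), (6, 7), (7, 8), (8, 9), (9, 5)]

print(circuit_rank(3, propane))
print(circuit_rank(6, benzene))
print(circuit_rank(10, naphthalene))
```

The circuit rank gives only the count of independent rings. Choosing which rings to report (for example, a smallest set of rings) is where the differing ring definitions mentioned above come in.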
Many enzymes are selective and specific to stereochemistry. Methods for the identification, depiction, comparison, and description of stereochemistry are described.