Scholarly Works - Unilever Centre for Molecular Informatics

Permanent URI for this collection

https://www.repository.cam.ac.uk/handle/1810/739

This collection contains published papers and preprints, archived in accordance with the publishers' policies. We note the following categories (SHERPA-ROMEO and HARNAD):

ROMEO yellow; HARNAD paleGreen: Authors may archive pre-refereed material but not final documents
ROMEO blue; HARNAD brightGreen: Authors may archive post-publication documents
ROMEO green; HARNAD brightGreen: Authors may archive both of the above

We have examples of the first two.

ROMEO white; HARNAD grey: Authors may not archive anything. A typical grey publisher is the American Chemical Society (see quote in 1810/741).

We have no examples of these.

The policy and practice of Open Access are changing rapidly during 2004, and although we hope to update this space we cannot guarantee that we will. We have not knowingly infringed publishers' copyrights or agreements; please contact us if you think there is a problem.


Open Access
Reproducible Physical Science and the Declaratron
(Murray-Rust; Unilever Centre for Molecular Informatics, Chemistry, University of Cambridge, 2013-07-05) Murray-Rust, Peter; Murray-Rust, David
The Declaratron is a semantic engine for formalising mathematics and science in publications.
Open Access
Applications of the InChI in cheminformatics with the CDK and Bioclipse.
(Springer Science and Business Media LLC, 2013-03-13) Spjuth, Ola; Berg, Arvid; Adams, Samuel; Willighagen, Egon L
BACKGROUND: The InChI algorithms are written in C++ and not available as a Java library. Integration into software written in Java therefore requires a bridge between C and Java libraries, provided by the Java Native Interface (JNI) technology. RESULTS: We here describe how the InChI library is used in the Bioclipse workbench and the Chemistry Development Kit (CDK) cheminformatics library. To make this possible, a JNI bridge to the InChI library was developed, JNI-InChI, allowing Java software to access the InChI algorithms. By using this bridge, the CDK project packages the InChI binaries in a module and offers easy access from Java using the CDK API. The Bioclipse project packages and offers InChI as a dynamic OSGi bundle that can easily be used by any OSGi-compliant software, in addition to the regular Java Archive and Maven bundles. Bioclipse itself uses the InChI as a key component and calculates it on the fly when visualizing and editing chemical structures. We demonstrate the utility of InChI with various applications in CDK and Bioclipse, such as decision support for chemical liability assessment, tautomer generation, and knowledge aggregation using a linked data approach. CONCLUSIONS: These results show that the InChI library can be used in a variety of Java library dependency solutions, making the functionality easily accessible by Java software, such as in the CDK. The applications show various ways the InChI has been used in Bioclipse, to enrich its functionality.
Open Access
Quantifying the shifts in physicochemical property space introduced by the metabolism of small organic molecules
(Springer Science and Business Media LLC, 2013-03) Kirchmair, Johannes; Howlett, Andrew; Peironcely, Julio; Murrell, Daniel S; Williamson, Mark; Adams, Samuel E; Hankemeier, Thomas; van Buren, Leo; Duchateau, Guus; Klaffke, Werner; Glen, Robert C; Williamson, Mark [0000-0002-5295-7811]; Glen, Robert [0000-0003-1759-2914]
Open Access
In silico target prediction: identification of on- and off-targets for crop protection agents
(Springer Science and Business Media LLC, 2013-03) Chiddarwar, Rucha K; Bender, Andreas; Rohrer, Sebastian; Bender, Andreas [0000-0002-6683-7546]
Open Access
Revised classification of kinases based on bioactivity data: the importance of data density and choice of visualization
(Springer Science and Business Media LLC, 2013-03) Paricharak, Shardul; Klenka, Tom; Augustin, Martin; Patel, Umesh A; Bender, Andreas; Bender, Andreas [0000-0002-6683-7546]
Open Access
Relating GPCRs pharmacological space based on ligands chemical similarities
(Springer Science and Business Media LLC, 2013-01) Koutsoukas, Alexios; Torella, Rubben; Drakakis, George; Bender, Andreas; Glen, Robert C; Drakakis, George [0000-0002-6635-9273]
Open Access
Experimental validation of in silico target predictions on synergistic protein targets
(Springer Science and Business Media LLC, 2013-01) Cortes-Ciriano, Isidro; Koutsoukas, Alexios; Abian, Olga; Bender, Andreas; Velazquez-Campoy, Adrian; Cortes-Ciriano, Isidro [0000-0002-2036-494X]; Abian, Olga [0000-0001-5664-1729]; Velazquez-Campoy, Adrian [0000-0001-5702-4538]
Open Access
Using machine learning techniques for rationalising phenotypic readouts from a rat sleeping model
(Springer Science and Business Media LLC, 2013-01) Drakakis, Georgios; Koutsoukas, Alexios; Brewerton, Suzanne Clare; Evans, David DE; Bender, Andreas; Drakakis, Georgios [0000-0002-6635-9273]
Open Access
Chemogenomics approaches to rationalising compound action of traditional Chinese and Ayurvedic medicines
(Springer Science and Business Media LLC, 2013-03) Fauzi, Fazlin Mohd; Koutsoukas, Alexios; Lowe, Rob; Joshi, Kalpana; Fan, Tai-Ping; Bender, Andreas; Bender, Andreas [0000-0002-6683-7546]
Open Access
Semantic physical science.
(Springer Science and Business Media LLC, 2012-08-03) Murray-Rust, Peter; Rzepa, Henry S
The articles in this special issue arise from a workshop and symposium held in January 2012 ('Semantic Physical Science'). We invited people who shared our vision for the potential of the web to support chemical and related subjects. Other than the initial invitations, we have not exercised any control over the content of the contributed articles.
Open Access
Changing computational research. The challenges ahead.
(Springer Science and Business Media LLC, 2012-05-28) Neylon, Cameron; Aerts, Jan; Brown, C Titus; Coles, Simon J; Hatton, Les; Lemire, Daniel; Millman, K Jarrod; Murray-Rust, Peter; Perez, Fernando; Saunders, Neil; Shah, Nigam; Smith, Arfon; Varoquaux, Gaël; Willighagen, Egon
No abstract - Editorial
Open Access
Predicting the mechanism of phospholipidosis.
(Springer Science and Business Media LLC, 2012-01-26) Lowe, Robert; Mussa, Hamse Y; Nigsch, Florian; Glen, Robert C; Mitchell, John Bo; Glen, Robert [0000-0003-1759-2914]
The mechanism of phospholipidosis is still not well understood. Numerous different mechanisms have been proposed, varying from direct inhibition of the breakdown of phospholipids to the binding of a drug compound to the phospholipid, preventing breakdown. We have used a probabilistic method, the Parzen-Rosenblatt Window approach, to build a model from the ChEMBL dataset which can predict from a compound's structure both its primary pharmaceutical target and other targets with which it forms off-target, usually weaker, interactions. Using a small dataset of 182 phospholipidosis-inducing and non-inducing compounds, we predict their off-target activity against targets which could relate to phospholipidosis as a side-effect of a drug. We link these targets to specific mechanisms of inducing this lysosomal build-up of phospholipids in cells. Thus, we show that the induction of phospholipidosis is likely to occur by separate mechanisms when triggered by different cationic amphiphilic drugs. We find that both inhibition of phospholipase activity and enhanced cholesterol biosynthesis are likely to be important mechanisms. Furthermore, we provide evidence suggesting four specific protein targets. Sphingomyelin phosphodiesterase, phospholipase A2 and lysosomal phospholipase A1 are shown to be likely targets for the induction of phospholipidosis by inhibition of phospholipase activity, while lanosterol synthase is predicted to be associated with phospholipidosis being induced by enhanced cholesterol biosynthesis. This analysis provides the impetus for further experimental tests of these hypotheses.
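The Parzen-Rosenblatt window approach named in the abstract above can be illustrated with a minimal sketch. This is not the authors' model: the Gaussian kernel, the bandwidth `h`, the two-dimensional toy feature vectors and the labels `target_A` and `target_B` are all assumptions made for illustration, whereas the paper builds its model from compound structures in the ChEMBL dataset.

```python
import math
from collections import defaultdict

# Hypothetical toy data: (feature vector, target label). Real models would use
# molecular descriptors derived from compound structures instead.
TOY_TRAINING = [
    ((0.0, 0.0), "target_A"),
    ((0.2, 0.1), "target_A"),
    ((3.0, 3.0), "target_B"),
    ((3.1, 2.8), "target_B"),
]


def gaussian_kernel(x, y, h):
    """Isotropic Gaussian kernel of bandwidth h between two equal-length vectors."""
    dim = len(x)
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    normaliser = (2.0 * math.pi * h * h) ** (dim / 2.0)
    return math.exp(-squared_distance / (2.0 * h * h)) / normaliser


def parzen_scores(query, training, h=1.0):
    """Return {label: class prior * Parzen window density estimate at query}."""
    points_by_label = defaultdict(list)
    for features, label in training:
        points_by_label[label].append(features)

    scores = {}
    for label, points in points_by_label.items():
        # Density estimate: average of kernels centred on each training point.
        density = sum(gaussian_kernel(query, p, h) for p in points) / len(points)
        prior = len(points) / len(training)
        scores[label] = prior * density
    return scores


def parzen_predict(query, training, h=1.0):
    """Return the label with the highest prior-weighted density at query."""
    scores = parzen_scores(query, training, h)
    return max(scores, key=scores.get)
```

Ranking the labels by their scores, rather than taking only the maximum, gives an ordered list of candidate targets, which is the kind of output needed to look at weaker off-target interactions as well as the primary target.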
Open Access
CML: Evolution and design.
(Springer Science and Business Media LLC, 2011-10-14) Murray-Rust, Peter; Rzepa, Henry S
A retrospective view of the design and evolution of Chemical Markup Language (CML) is presented by its original authors.
Open Access
The semantics of Chemical Markup Language (CML): dictionaries and conventions
(2011-10-14) Murray-Rust, Peter; Townsend, Joe A; Adams, Sam E; Phadungsukanan, Weerapong; Thomas, Jens
The semantic architecture of CML consists of conventions, dictionaries and units. The conventions conform to a top-level specification and each convention can constrain compliant documents through machine-processing (validation). Dictionaries conform to a dictionary specification which also imposes machine validation on the dictionaries. Each dictionary can also be used to validate data in a CML document, and provide human-readable descriptions. An additional set of conventions and dictionaries is used to support scientific units. All conventions, dictionaries and dictionary elements are identifiable and addressable through unique URIs.
Open Access
Semantic science and its communication - a personal view.
(Springer Science and Business Media LLC, 2011-10-14) Murray-Rust, Peter
The articles in this special issue represent the culmination of about 15 years working with the potential of the web to support chemical and related subjects. The selection of papers arises from a symposium held in January 2011 ('Visions of a Semantic Molecular Future') which gave me an opportunity to invite many people who shared the same vision. I have asked them to contribute their papers and most have been able to do so. They cover a wide range of content, approaches and styles and apart from the selection of the speakers (and hence the authors) I have not exercised any control over the content.
Open Access
Open bibliography for science, technology, and medicine.
(Springer Science and Business Media LLC, 2011-10-14) Jones, Richard; Macgillivray, Mark; Murray-Rust, Peter; Pitman, Jim; Sefton, Peter; O'Steen, Ben; Waites, William
The concept of Open Bibliography in science, technology and medicine (STM) is introduced as a combination of Open Source tools, Open specifications and Open bibliographic data. An Openly searchable and navigable network of bibliographic information and associated knowledge representations, a Bibliographic Knowledge Network, across all branches of Science, Technology and Medicine, has been designed and initiated. For this large scale endeavour, the engagement and cooperation of the multiple stakeholders in STM publishing - authors, librarians, publishers and administrators - is sought.
Open Access
The Quixote project: Collaborative and Open Quantum Chemistry data management in the Internet age.
(Springer Science and Business Media LLC, 2011-10-14) Adams, Sam; de Castro, Pablo; Echenique, Pablo; Estrada, Jorge; Hanwell, Marcus D; Murray-Rust, Peter; Sherwood, Paul; Thomas, Jens; Townsend, Joe
Computational Quantum Chemistry has developed into a powerful, efficient, reliable and increasingly routine tool for exploring the structure and properties of small to medium sized molecules. Many thousands of calculations are performed every day, some offering results which approach experimental accuracy. However, in contrast to other disciplines, such as crystallography or bioinformatics, where standard formats and well-known, unified databases exist, this QC data is generally destined to remain locally held in files which are not designed to be machine-readable. Only a very small subset of these results will become accessible to the wider community through publication. In this paper we describe how the Quixote Project is developing the infrastructure required to convert output from a number of different molecular quantum chemistry packages to a common semantically rich, machine-readable format and to build repositories of QC results. Such an infrastructure offers benefits at many levels. The standardised representation of the results will facilitate software interoperability, for example making it easier for analysis tools to take data from different QC packages, and will also help with archival and deposition of results. The repository infrastructure, which is lightweight and built using Open software components, can be implemented at individual researcher, project, organisation or community level, offering the exciting possibility that in future many of these QC results can be made publicly available, to be searched and interpreted just as crystallography and bioinformatics results are today. Although we believe that quantum chemists will appreciate the contribution the Quixote infrastructure can make to the organisation and exchange of their results, we anticipate that greater rewards will come from enabling their results to be consumed by a wider community. As the repositories grow they will become a valuable source of chemical data for use by other disciplines in both research and education. The Quixote project is unconventional in that the infrastructure is being implemented in advance of a full definition of the data model which will eventually underpin it. We believe that a working system which offers real value to researchers based on tools and shared, searchable repositories will encourage early participation from a broader community, including both producers and consumers of data. In the early stages, searching and indexing can be performed on the chemical subject of the calculations, and well-defined calculation meta-data. The process of defining more specific quantum chemical definitions, adding them to dictionaries and extracting them consistently from the results of the various software packages can then proceed in an incremental manner, adding additional value at each stage. Not only will these results help to change the data management model in the field of Quantum Chemistry, but the methodology can be applied to other pressing problems related to data in computational and experimental science.
Open Access
Ami - The Chemist's Amanuensis
(2011-10-14) Brooks, Brian J; Thorn, Adam L; Smith, Matthew; Matthews, Peter; Chen, Shaoming; O'Steen, Ben; Adams, Sam E; Townsend, Joe A; Murray-Rust, Peter
The Ami project was a six-month Rapid Innovation project sponsored by JISC to explore the Virtual Research Environment space. The project brainstormed with chemists and decided to investigate ways to facilitate monitoring and collection of experimental data. A frequently encountered use case was identified in which the chemist reaches the end of an experiment but finds an unexpected result. The ability to replay events can significantly help make sense of how things progressed. The project therefore concentrated on collecting a variety of dimensions of ancillary data - data that would not normally be collected due to practicality constraints. There were three main areas of investigation: 1) Development of a monitoring tool using infrared and ultrasonic sensors; 2) Time-lapse motion video capture (for example, videoing 5 seconds in every 60); and 3) Activity-driven video monitoring of the fume cupboard environs. The Ami client application was developed to control these separate logging functions. The application builds up a timeline of the events in the experiment and around the fume cupboard. The videos and data logs can then be reviewed after the experiment in order to help the chemist determine the exact timings and conditions used. The project experimented with ways in which a Microsoft Kinect could be used in a laboratory setting. Investigations suggest that it would not be an ideal device for controlling a mouse, but it shows promise for uses such as manipulating virtual molecules.
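The time-lapse scheme given as an example in the abstract above (videoing 5 seconds in every 60) amounts to a simple duty cycle. The sketch below is only an illustration of that arithmetic; the function names and the default parameters are hypothetical and are not taken from the Ami client application.

```python
def is_recording(elapsed_s, record_s=5.0, period_s=60.0):
    """True while the camera should capture: the first record_s of each period."""
    return (elapsed_s % period_s) < record_s


def recording_windows(total_s, record_s=5.0, period_s=60.0):
    """List (start, end) capture windows in seconds for an experiment of total_s."""
    windows = []
    start = 0.0
    while start < total_s:
        # Clip the last window so it never runs past the end of the experiment.
        windows.append((start, min(start + record_s, total_s)))
        start += period_s
    return windows


# A 150-second experiment yields three 5-second clips, 15 seconds of video in all.
clips = recording_windows(150)
captured = sum(end - start for start, end in clips)
```

Keeping the schedule as a list of timestamped windows makes it straightforward to place each clip on the same timeline as the other logged events.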
Open Access
Open Data, Open Source and Open Standards in chemistry: The Blue Obelisk five years on
(2011-10-14) O'Boyle, Noel M; Guha, Rajarshi; Willighagen, Egon L; Adams, Samuel E; Alvarsson, Jonathan; Bradley, Jean-Claude; Filippov, Igor V; Hanson, Robert M; Hanwell, Marcus D; Hutchison, Geoffrey R; James, Craig A; Jeliazkova, Nina; Lang, Andrew SID; Langner, Karol M; Lonie, David C; Lowe, Daniel M; Pansanel, Jerome; Pavlov, Dmitry; Spjuth, Ola; Steinbeck, Christoph; Tenderholt, Adam L; Theisen, Kevin J; Murray-Rust, Peter
BACKGROUND: The Blue Obelisk movement was established in 2005 as a response to the lack of Open Data, Open Standards and Open Source (ODOSOS) in chemistry. It aims to make it easier to carry out chemistry research by promoting interoperability between chemistry software, encouraging cooperation between Open Source developers, and developing community resources and Open Standards. RESULTS: This contribution looks back on the work carried out by the Blue Obelisk in the past 5 years and surveys progress and remaining challenges in the areas of Open Data, Open Standards, and Open Source in chemistry. CONCLUSIONS: We show that the Blue Obelisk has been very successful in bringing together researchers and developers with common interests in ODOSOS, leading to development of many useful resources freely available to the chemistry community.
Open Access
OSCAR4: a flexible architecture for chemical text-mining
(2011-10-14) Jessop, David M; Adams, Sam E; Willighagen, Egon L; Hawizy, Lezan; Murray-Rust, Peter
The Open-Source Chemistry Analysis Routines (OSCAR) software, a toolkit for the recognition of named entities and data in chemistry publications, has been developed since 2002. Recent work has resulted in the separation of the core OSCAR functionality and its release as the OSCAR4 library. This library features a modular API (based on reduction of surface coupling) that permits client programmers to easily incorporate it into external applications. OSCAR4 offers a domain-independent architecture upon which chemistry specific text-mining tools can be built, and its development and usage are discussed.
