
Scholarly Works - Unilever Centre for Molecular Informatics


This contains published papers and preprints in accordance with the publishers' policies. We note the following categories (SHERPA-ROMEO and HARNAD):

ROMEO yellow; HARNAD paleGreen
Authors may archive pre-refereed material but not final documents
ROMEO blue; HARNAD brightGreen
Authors may archive post-publication documents
ROMEO green; HARNAD brightGreen
Authors may archive both of the above

We have examples of the first two.

ROMEO white; HARNAD grey
Authors may not archive anything. A typical grey publisher is the American Chemical Society (see quote in 1810/741).

We have no examples of these.

The policy and practice of Open Access are changing rapidly during 2004, and although we hope to keep this space updated we cannot guarantee it. We have not knowingly offended against the details of publishers' copyrights and agreements; please contact us if you think there is a problem.


Recent Submissions

  • Item (Open Access)
    Reproducible Physical Science and the Declaratron
    (Murray-Rust; Unilever Centre for Molecular Informatics, Chemistry, University of Cambridge, 2013-07-05) Murray-Rust, Peter; Murray-Rust, David
    The Declaratron is a semantic engine for formalising mathematics and science in publications
  • Item (Open Access)
    Applications of the InChI in cheminformatics with the CDK and Bioclipse.
    (Springer Science and Business Media LLC, 2013-03-13) Spjuth, Ola; Berg, Arvid; Adams, Samuel; Willighagen, Egon L
    BACKGROUND: The InChI algorithms are written in C++ and not available as a Java library. Integration into software written in Java therefore requires a bridge between C and Java libraries, provided by the Java Native Interface (JNI) technology. RESULTS: We here describe how the InChI library is used in the Bioclipse workbench and the Chemistry Development Kit (CDK) cheminformatics library. To make this possible, a JNI bridge to the InChI library was developed, JNI-InChI, allowing Java software to access the InChI algorithms. By using this bridge, the CDK project packages the InChI binaries in a module and offers easy access from Java using the CDK API. The Bioclipse project packages and offers InChI as a dynamic OSGi bundle that can easily be used by any OSGi-compliant software, in addition to the regular Java Archive and Maven bundles. Bioclipse itself uses the InChI as a key component and calculates it on the fly when visualizing and editing chemical structures. We demonstrate the utility of InChI with various applications in CDK and Bioclipse, such as decision support for chemical liability assessment, tautomer generation, and for knowledge aggregation using a linked data approach. CONCLUSIONS: These results show that the InChI library can be used in a variety of Java library dependency solutions, making the functionality easily accessible by Java software, such as in the CDK. The applications show various ways the InChI has been used in Bioclipse, to enrich its functionality.
  • Item (Open Access)
    Quantifying the shifts in physicochemical property space introduced by the metabolism of small organic molecules
    (Springer Science and Business Media LLC, 2013-03) Kirchmair, Johannes; Howlett, Andrew; Peironcely, Julio; Murrell, Daniel S; Williamson, Mark; Adams, Samuel E; Hankemeier, Thomas; van Buren, Leo; Duchateau, Guus; Klaffke, Werner; Glen, Robert C; Williamson, Mark [0000-0002-5295-7811]; Glen, Robert [0000-0003-1759-2914]
  • Item (Open Access)
    In silico target prediction: identification of on- and off-targets for crop protection agents
    (Springer Science and Business Media LLC, 2013-03) Chiddarwar, Rucha K; Bender, Andreas; Rohrer, Sebastian; Bender, Andreas [0000-0002-6683-7546]
  • Item (Open Access)
    Revised classification of kinases based on bioactivity data: the importance of data density and choice of visualization
    (Springer Science and Business Media LLC, 2013-03) Paricharak, Shardul; Klenka, Tom; Augustin, Martin; Patel, Umesh A; Bender, Andreas; Bender, Andreas [0000-0002-6683-7546]
  • Item (Open Access)
    Relating GPCRs pharmacological space based on ligands chemical similarities
    (Springer Science and Business Media LLC, 2013-01) Koutsoukas, Alexios; Torella, Rubben; Drakakis, George; Bender, Andreas; Glen, Robert C; Drakakis, George [0000-0002-6635-9273]
  • Item (Open Access)
    Experimental validation of in silico target predictions on synergistic protein targets
    (Springer Science and Business Media LLC, 2013-01) Cortes-Ciriano, Isidro; Koutsoukas, Alexios; Abian, Olga; Bender, Andreas; Velazquez-Campoy, Adrian; Cortes-Ciriano, Isidro [0000-0002-2036-494X]; Abian, Olga [0000-0001-5664-1729]; Velazquez-Campoy, Adrian [0000-0001-5702-4538]
  • Item (Open Access)
    Using machine learning techniques for rationalising phenotypic readouts from a rat sleeping model
    (Springer Science and Business Media LLC, 2013-01) Drakakis, Georgios; Koutsoukas, Alexios; Brewerton, Suzanne Clare; Evans, David DE; Bender, Andreas; Drakakis, Georgios [0000-0002-6635-9273]
  • Item (Open Access)
    Chemogenomics approaches to rationalising compound action of traditional Chinese and Ayurvedic medicines
    (Springer Science and Business Media LLC, 2013-03) Fauzi, Fazlin Mohd; Koutsoukas, Alexios; Lowe, Rob; Joshi, Kalpana; Fan, Tai-Ping; Bender, Andreas; Bender, Andreas [0000-0002-6683-7546]
  • Item (Open Access)
    Semantic physical science.
    (Springer Science and Business Media LLC, 2012-08-03) Murray-Rust, Peter; Rzepa, Henry S
    The articles in this special issue arise from a workshop and symposium held in January 2012 ('Semantic Physical Science'). We invited people who shared our vision for the potential of the web to support chemical and related subjects. Other than the initial invitations, we have not exercised any control over the content of the contributed articles.
  • Item (Open Access)
    Changing computational research. The challenges ahead.
    (Springer Science and Business Media LLC, 2012-05-28) Neylon, Cameron; Aerts, Jan; Brown, C Titus; Coles, Simon J; Hatton, Les; Lemire, Daniel; Millman, K Jarrod; Murray-Rust, Peter; Perez, Fernando; Saunders, Neil; Shah, Nigam; Smith, Arfon; Varoquaux, Gaël; Willighagen, Egon
    No abstract - Editorial
  • Item (Open Access)
    Predicting the mechanism of phospholipidosis.
    (Springer Science and Business Media LLC, 2012-01-26) Lowe, Robert; Mussa, Hamse Y; Nigsch, Florian; Glen, Robert C; Mitchell, John Bo; Glen, Robert [0000-0003-1759-2914]
    The mechanism of phospholipidosis is still not well understood. Numerous different mechanisms have been proposed, varying from direct inhibition of the breakdown of phospholipids to the binding of a drug compound to the phospholipid, preventing breakdown. We have used a probabilistic method, the Parzen-Rosenblatt Window approach, to build a model from the ChEMBL dataset which can predict from a compound's structure both its primary pharmaceutical target and other targets with which it forms off-target, usually weaker, interactions. Using a small dataset of 182 phospholipidosis-inducing and non-inducing compounds, we predict their off-target activity against targets which could relate to phospholipidosis as a side-effect of a drug. We link these targets to specific mechanisms of inducing this lysosomal build-up of phospholipids in cells. Thus, we show that the induction of phospholipidosis is likely to occur by separate mechanisms when triggered by different cationic amphiphilic drugs. We find that both inhibition of phospholipase activity and enhanced cholesterol biosynthesis are likely to be important mechanisms. Furthermore, we provide evidence suggesting four specific protein targets. Sphingomyelin phosphodiesterase, phospholipase A2 and lysosomal phospholipase A1 are shown to be likely targets for the induction of phospholipidosis by inhibition of phospholipase activity, while lanosterol synthase is predicted to be associated with phospholipidosis being induced by enhanced cholesterol biosynthesis. This analysis provides the impetus for further experimental tests of these hypotheses.
  • Item (Open Access)
    CML: Evolution and design.
    (Springer Science and Business Media LLC, 2011-10-14) Murray-Rust, Peter; Rzepa, Henry S
    A retrospective view of the design and evolution of Chemical Markup Language (CML) is presented by its original authors.
  • Item (Open Access)
    The semantics of Chemical Markup Language (CML): dictionaries and conventions
    (2011-10-14) Murray-Rust, Peter; Townsend, Joe A; Adams, Sam E; Phadungsukanan, Weerapong; Thomas, Jens
    The semantic architecture of CML consists of conventions, dictionaries and units. The conventions conform to a top-level specification and each convention can constrain compliant documents through machine-processing (validation). Dictionaries conform to a dictionary specification which also imposes machine validation on the dictionaries. Each dictionary can also be used to validate data in a CML document, and provide human-readable descriptions. An additional set of conventions and dictionaries is used to support scientific units. All conventions, dictionaries and dictionary elements are identifiable and addressable through unique URIs.
  • Item (Open Access)
    Semantic science and its communication - a personal view.
    (Springer Science and Business Media LLC, 2011-10-14) Murray-Rust, Peter
    The articles in this special issue represent the culmination of about 15 years working with the potential of the web to support chemical and related subjects. The selection of papers arises from a symposium held in January 2011 ('Visions of a Semantic Molecular Future') which gave me an opportunity to invite many people who shared the same vision. I have asked them to contribute their papers and most have been able to do so. They cover a wide range of content, approaches and styles and apart from the selection of the speakers (and hence the authors) I have not exercised any control over the content.
  • Item (Open Access)
    Open bibliography for science, technology, and medicine.
    (Springer Science and Business Media LLC, 2011-10-14) Jones, Richard; Macgillivray, Mark; Murray-Rust, Peter; Pitman, Jim; Sefton, Peter; O'Steen, Ben; Waites, William
    The concept of Open Bibliography in science, technology and medicine (STM) is introduced as a combination of Open Source tools, Open specifications and Open bibliographic data. An Openly searchable and navigable network of bibliographic information and associated knowledge representations, a Bibliographic Knowledge Network, across all branches of Science, Technology and Medicine, has been designed and initiated. For this large scale endeavour, the engagement and cooperation of the multiple stakeholders in STM publishing - authors, librarians, publishers and administrators - is sought.
  • Item (Open Access)
    The Quixote project: Collaborative and Open Quantum Chemistry data management in the Internet age.
    (Springer Science and Business Media LLC, 2011-10-14) Adams, Sam; de Castro, Pablo; Echenique, Pablo; Estrada, Jorge; Hanwell, Marcus D; Murray-Rust, Peter; Sherwood, Paul; Thomas, Jens; Townsend, Joe
    Computational Quantum Chemistry has developed into a powerful, efficient, reliable and increasingly routine tool for exploring the structure and properties of small to medium sized molecules. Many thousands of calculations are performed every day, some offering results which approach experimental accuracy. However, in contrast to other disciplines, such as crystallography or bioinformatics, where standard formats and well-known, unified databases exist, this QC data is generally destined to remain locally held in files which are not designed to be machine-readable. Only a very small subset of these results will become accessible to the wider community through publication. In this paper we describe how the Quixote Project is developing the infrastructure required to convert output from a number of different molecular quantum chemistry packages to a common semantically rich, machine-readable format and to build repositories of QC results. Such an infrastructure offers benefits at many levels. The standardised representation of the results will facilitate software interoperability, for example making it easier for analysis tools to take data from different QC packages, and will also help with archival and deposition of results. The repository infrastructure, which is lightweight and built using Open software components, can be implemented at individual researcher, project, organisation or community level, offering the exciting possibility that in future many of these QC results can be made publicly available, to be searched and interpreted just as crystallography and bioinformatics results are today. Although we believe that quantum chemists will appreciate the contribution the Quixote infrastructure can make to the organisation and exchange of their results, we anticipate that greater rewards will come from enabling their results to be consumed by a wider community. As the repositories grow they will become a valuable source of chemical data for use by other disciplines in both research and education. The Quixote project is unconventional in that the infrastructure is being implemented in advance of a full definition of the data model which will eventually underpin it. We believe that a working system which offers real value to researchers based on tools and shared, searchable repositories will encourage early participation from a broader community, including both producers and consumers of data. In the early stages, searching and indexing can be performed on the chemical subject of the calculations, and well defined calculation meta-data. The process of defining more specific quantum chemical definitions, adding them to dictionaries and extracting them consistently from the results of the various software packages can then proceed in an incremental manner, adding additional value at each stage. Not only will these results help to change the data management model in the field of Quantum Chemistry, but the methodology can be applied to other pressing problems related to data in computational and experimental science.
  • Item (Open Access)
    Ami - The Chemist's Amanuensis
    (2011-10-14) Brooks, Brian J; Thorn, Adam L; Smith, Matthew; Matthews, Peter; Chen, Shaoming; O'Steen, Ben; Adams, Sam E; Townsend, Joe A; Murray-Rust, Peter
    The Ami project was a six-month Rapid Innovation project sponsored by JISC to explore the Virtual Research Environment space. The project brainstormed with chemists and decided to investigate ways to facilitate monitoring and collection of experimental data. A frequently encountered use case was identified in which the chemist reaches the end of an experiment but finds an unexpected result. The ability to replay events can significantly help make sense of how things progressed. The project therefore concentrated on collecting a variety of dimensions of ancillary data - data that would not normally be collected due to practicality constraints. There were three main areas of investigation: 1) Development of a monitoring tool using infrared and ultrasonic sensors; 2) Time-lapse motion video capture (for example, videoing 5 seconds in every 60); and 3) Activity-driven video monitoring of the fume cupboard environs. The Ami client application was developed to control these separate logging functions. The application builds up a timeline of the events in the experiment and around the fume cupboard. The videos and data logs can then be reviewed after the experiment in order to help the chemist determine the exact timings and conditions used. The project experimented with ways in which a Microsoft Kinect could be used in a laboratory setting. Investigations suggest that it would not be an ideal device for controlling a mouse, but it shows promise for uses such as manipulating virtual molecules.
  • Item (Open Access)
    Open Data, Open Source and Open Standards in chemistry: The Blue Obelisk five years on
    (2011-10-14) O'Boyle, Noel M; Guha, Rajarshi; Willighagen, Egon L; Adams, Samuel E; Alvarsson, Jonathan; Bradley, Jean-Claude; Filippov, Igor V; Hanson, Robert M; Hanwell, Marcus D; Hutchison, Geoffrey R; James, Craig A; Jeliazkova, Nina; Lang, Andrew SID; Langner, Karol M; Lonie, David C; Lowe, Daniel M; Pansanel, Jerome; Pavlov, Dmitry; Spjuth, Ola; Steinbeck, Christoph; Tenderholt, Adam L; Theisen, Kevin J; Murray-Rust, Peter
    BACKGROUND: The Blue Obelisk movement was established in 2005 as a response to the lack of Open Data, Open Standards and Open Source (ODOSOS) in chemistry. It aims to make it easier to carry out chemistry research by promoting interoperability between chemistry software, encouraging cooperation between Open Source developers, and developing community resources and Open Standards. RESULTS: This contribution looks back on the work carried out by the Blue Obelisk in the past 5 years and surveys progress and remaining challenges in the areas of Open Data, Open Standards, and Open Source in chemistry. CONCLUSIONS: We show that the Blue Obelisk has been very successful in bringing together researchers and developers with common interests in ODOSOS, leading to development of many useful resources freely available to the chemistry community.
  • Item (Open Access)
    OSCAR4: a flexible architecture for chemical text-mining
    (2011-10-14) Jessop, David M; Adams, Sam E; Willighagen, Egon L; Hawizy, Lezan; Murray-Rust, Peter
    The Open-Source Chemistry Analysis Routines (OSCAR) software, a toolkit for the recognition of named entities and data in chemistry publications, has been developed since 2002. Recent work has resulted in the separation of the core OSCAR functionality and its release as the OSCAR4 library. This library features a modular API (based on reduction of surface coupling) that permits client programmers to easily incorporate it into external applications. OSCAR4 offers a domain-independent architecture upon which chemistry-specific text-mining tools can be built, and its development and usage are discussed.