Capturing chemistry in XML.

Townsend, JA 
Murray-Rust, P 

Chemical Markup Language (CML) is an XML language, defined by an XML Schema, that describes molecules, spectra, reactions, and computational chemistry. It can capture the chemistry in a wide range of current publications and is being adopted by many organizations.

We have developed tools for batch conversion of existing chemical documents, such as primary journal publications and theses, into conformant CML. The parser reads many text and molecular file formats and extracts the chemical concepts as CML fragments, which are then combined into a single XML file.

The process works well for methodology and analytical data in organic synthesis. The results are stored in an XML database, where they can be queried by molecular identity and by numeric quantities.
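
The two kinds of query mentioned above can be illustrated with a minimal sketch. It assumes a simplified CML document held in memory and searched with Python's standard library rather than a real XML database. The `ex:mpt` dictionary reference, the molecule ids, the sample values, and both function names are illustrative assumptions, not part of the tools described here.

```python
import xml.etree.ElementTree as ET

# CML namespace; the prefix "cml" is only a local alias for the queries below.
CML_NS = {"cml": "http://www.xml-cml.org/schema"}

# Illustrative sample only: a simplified CML document with two molecules,
# each carrying an InChI identifier and a hypothetical melting-point property.
SAMPLE = """<cml xmlns="http://www.xml-cml.org/schema">
  <molecule id="m1">
    <identifier convention="iupac:inchi"
                value="InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"/>
    <property dictRef="ex:mpt">
      <scalar dataType="xsd:double" units="units:celsius">-114.1</scalar>
    </property>
  </molecule>
  <molecule id="m2">
    <identifier convention="iupac:inchi"
                value="InChI=1S/C7H6O2/c8-7(9)6-4-2-1-3-5-6/h1-5H,(H,8,9)"/>
    <property dictRef="ex:mpt">
      <scalar dataType="xsd:double" units="units:celsius">122.4</scalar>
    </property>
  </molecule>
</cml>"""


def find_by_identifier(root, inchi):
    """Return ids of molecules whose identifier value equals the given InChI."""
    hits = []
    for mol in root.findall("cml:molecule", CML_NS):
        identifiers = mol.findall("cml:identifier", CML_NS)
        if any(ident.get("value") == inchi for ident in identifiers):
            hits.append(mol.get("id"))
    return hits


def find_by_property_range(root, dict_ref, low, high):
    """Return ids of molecules with a scalar property value in [low, high]."""
    hits = []
    for mol in root.findall("cml:molecule", CML_NS):
        for prop in mol.findall("cml:property", CML_NS):
            if prop.get("dictRef") != dict_ref:
                continue
            scalar = prop.find("cml:scalar", CML_NS)
            if scalar is not None and low <= float(scalar.text) <= high:
                hits.append(mol.get("id"))
    return hits


root = ET.fromstring(SAMPLE)
print(find_by_identifier(root, "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"))
print(find_by_property_range(root, "ex:mpt", 100.0, 150.0))
```

A production XML database would typically express the same conditions in XPath or XQuery; the loops here only show which parts of the CML each query inspects.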

Parsers can also capture the output of computational chemistry programs, extracting essentially all of the information in the log file. XML stylesheets can then be used to filter the results and display them interactively.
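
As a rough sketch of the log-file capture step, the example below converts `name = value units` lines into CML `property` elements. The log format, the `ex:` dictionary prefix, the sample values, and the function name `log_to_cml` are all hypothetical; real program output is far less regular, and this sketch captures only lines of that one shape.

```python
import re
import xml.etree.ElementTree as ET

CML_URI = "http://www.xml-cml.org/schema"

# Hypothetical log line shape: "<name> = <number> <units>".
LINE_RE = re.compile(
    r"^\s*(?P<name>[A-Za-z ]+?)\s*=\s*(?P<value>-?\d+(?:\.\d+)?)\s+(?P<units>\w+)\s*$"
)

# Made-up log text for illustration; not the output of any real program.
SAMPLE_LOG = """Hypothetical program banner
Total energy = -76.026765 hartree
Dipole moment = 1.85 debye
"""


def log_to_cml(log_text):
    """Wrap every recognised 'name = value units' line as a CML property."""
    ET.register_namespace("", CML_URI)  # serialise CML as the default namespace
    module = ET.Element(f"{{{CML_URI}}}module")
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # lines that do not fit the assumed shape are skipped
        dict_ref = "ex:" + match["name"].lower().replace(" ", "_")
        prop = ET.SubElement(module, f"{{{CML_URI}}}property", {"dictRef": dict_ref})
        scalar = ET.SubElement(
            prop,
            f"{{{CML_URI}}}scalar",
            {"dataType": "xsd:double", "units": "units:" + match["units"].lower()},
        )
        scalar.text = match["value"]
    return module


module = log_to_cml(SAMPLE_LOG)
print(ET.tostring(module, encoding="unicode"))
```

Once the values are held in CML, a stylesheet or any other XML tool can select and present them without knowing the original log format.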


ACS Spring Conference

