OSCAR4: a flexible architecture for chemical text-mining.


Authors
Jessop, David M 
Adams, Sam E 
Willighagen, Egon L 
Hawizy, Lezan 
Murray-Rust, Peter 
Abstract

The Open-Source Chemistry Analysis Routines (OSCAR) software, a toolkit for the recognition of named entities and data in chemistry publications, has been in development since 2002. Recent work has resulted in the separation of the core OSCAR functionality and its release as the OSCAR4 library. This library features a modular API (based on reduction of surface coupling) that permits client programmers to easily incorporate it into external applications. OSCAR4 offers a domain-independent architecture upon which chemistry-specific text-mining tools can be built, and its development and usage are discussed.

Keywords
SOURCE JAVA LIBRARY, DEVELOPMENT KIT CDK, INFORMATION, SMILES, NAMES
Journal Title
J Cheminform
Journal ISSN
1758-2946
Publisher
Springer Science and Business Media LLC
Sponsorship
We gratefully acknowledge OMII-UK, JISC (ChETA project) and EPSRC (Sciborg, Pathways to Impact awards) for funding.