Auto-generated materials database of Curie and Néel temperatures via semi-supervised relationship extraction.

Published version
Peer-reviewed

Authors

Court, Callum J 
Cole, Jacqueline M 

Abstract

Large auto-generated databases of magnetic materials properties have the potential for great utility in materials science research. This article presents an auto-generated database of 39,822 records containing chemical compounds and their associated Curie and Néel magnetic phase transition temperatures. The database was produced using natural language processing and semi-supervised quaternary relationship extraction, applied to a corpus of 68,078 chemistry and physics articles. Evaluation of the database shows an estimated overall precision of 73%. Therein, records processed with the text-mining toolkit, ChemDataExtractor, were assisted by a modified Snowball algorithm, whose original binary relationship extraction capabilities were extended to quaternary relationship extraction. Consequently, its machine learning component can now train with ≤ 500 seeds, rather than the 4,000 originally used. Data processed with the modified Snowball algorithm affords 82% precision. Database records are available in MongoDB, CSV and JSON formats, which can easily be read using Python, R, Java and MATLAB. This makes the database easy to query for tackling big-data materials science initiatives and provides a basis for magnetic materials discovery.
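As a rough illustration of how the CSV export could be read and queried from Python, the sketch below parses a few rows and selects Curie-temperature records above room temperature. The column names (`compound`, `type`, `value`, `units`) and the sample rows are placeholders assumed for this example; they are not taken from the database, so the header of the actual file should be checked first.

```python
import csv
import io

# Hypothetical sample rows standing in for the real CSV export.
# Column names and values are assumptions made for illustration only.
SAMPLE_CSV = """compound,type,value,units
ExampleA,Curie,300,K
ExampleB,Neel,120,K
ExampleC,Curie,650,K
"""


def load_records(handle):
    """Read records from an open CSV handle into a list of dicts."""
    records = []
    for row in csv.DictReader(handle):
        row["value"] = float(row["value"])  # temperatures as numbers
        records.append(row)
    return records


def filter_records(records, transition, min_kelvin):
    """Keep records of one transition type at or above a temperature."""
    return [
        r for r in records
        if r["type"] == transition and r["value"] >= min_kelvin
    ]


# With a real file, io.StringIO(SAMPLE_CSV) would be replaced by
# open("path/to/export.csv", newline="", encoding="utf-8").
records = load_records(io.StringIO(SAMPLE_CSV))
above_room = filter_records(records, "Curie", 293.0)
names = [r["compound"] for r in above_room]
print(names)
```

The same filtering logic would apply to the JSON export after loading it with the standard `json` module, again assuming the records carry equivalent fields.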

Keywords

0302 Inorganic Chemistry

Journal Title

Sci Data

Journal ISSN

2052-4463

Volume

5

Publisher

Springer Science and Business Media LLC

Sponsorship

Royal Commission for the Exhibition of 1851 (DF/05/14)
Engineering and Physical Sciences Research Council (EP/L015552/1)