BatteryBERT: A Pretrained Language Model for Battery Database Enhancement
Publication Date
2022-12-26
Journal Title
J Chem Inf Model
ISSN
1549-9596
Publisher
American Chemical Society (ACS)
Type
Article
This Version
AM
Citation
Huang, S., & Cole, J. M. (2022). BatteryBERT: A Pretrained Language Model for Battery Database Enhancement. J Chem Inf Model. https://doi.org/10.1021/acs.jcim.2c00035
Abstract
A great number of scientific papers are published every year in the field of battery research, which forms a huge textual data source. However, it is difficult to explore and retrieve useful information efficiently from these large unstructured sets of text. The Bidirectional Encoder Representations from Transformers (BERT) model, trained on a large data set in an unsupervised way, provides a route to process the scientific text automatically with minimal human effort. To this end, we realized six battery-related BERT models, namely, BatteryBERT, BatteryOnlyBERT, and BatterySciBERT, each of which consists of both cased and uncased models. They have been trained specifically on a corpus of battery research papers. The pretrained BatteryBERT models were then fine-tuned on downstream tasks, including battery paper classification and extractive question-answering for battery device component classification that distinguishes anode, cathode, and electrolyte materials. Our BatteryBERT models were found to outperform the original BERT models on the specific battery tasks. The fine-tuned BatteryBERT was then used to perform battery database enhancement. We also provide a website application for its interactive use and visualization.
Sponsorship
BASF/Royal Academy of Engineering, Christ's College, Cambridge, DOE (contract No. DE-AC02-06CH11357).
Funder references
Royal Academy of Engineering (RAEng) (RCSRF1819\7\10)
STFC (Unknown)
Embargo Lift Date
2023-05-09
Identifiers
External DOI: https://doi.org/10.1021/acs.jcim.2c00035
This record's URL: https://www.repository.cam.ac.uk/handle/1810/336285