Comparative dataset of experimental and computational attributes of UV/vis absorption spectra


Beard, Edward J. 
Vázquez-Mayagoitia, Álvaro 
Vishwanath, Venkatram 
Cole, Jacqueline M. 


Abstract: The ability to auto-generate databases of optical properties holds great prospects in data-driven materials discovery for optoelectronic applications. We present a cognate set of experimental and computational data that describes key features of optical absorption spectra. This includes an auto-generated database of 18,309 records of experimentally determined UV/vis absorption maxima, λmax, and associated extinction coefficients, ϵ, where present. This database was produced by running the text-mining toolkit ChemDataExtractor on 402,034 scientific documents. High-throughput electronic-structure calculations using fast (simplified Tamm-Dancoff approach) and traditional (time-dependent) density functional theory were executed to predict λmax and oscillator strengths, f (related to ϵ), for a subset of validated compounds. Paired quantities of these computational and experimental data show strong correlations in λmax, f and ϵ, laying the path for reliable in silico calculations of additional optical properties. The total dataset of 8,488 unique compounds, and a subset of 5,380 compounds with both experimental and computational data, are available in MongoDB, CSV and JSON formats. These can be queried using Python, R, Java, and MATLAB for data-driven optoelectronic materials discovery.
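As an illustration of how the paired experimental and computed λmax values might be compared from the CSV export, the following is a minimal Python sketch. The column names (`exp_lambda_max_nm`, `calc_lambda_max_nm`) and the four sample rows are hypothetical placeholders, not the dataset's actual schema or values; they would need to be replaced with the real headers and file.

```python
import csv
import io
import math

# Hypothetical stand-in for the dataset's CSV export.
# Column names and values are invented for illustration only.
SAMPLE_CSV = """compound,exp_lambda_max_nm,calc_lambda_max_nm
A,350,342
B,420,431
C,510,498
D,610,625
"""


def load_pairs(handle, exp_col, calc_col):
    """Read (experimental, computed) value pairs, skipping incomplete rows."""
    pairs = []
    for row in csv.DictReader(handle):
        try:
            pairs.append((float(row[exp_col]), float(row[calc_col])))
        except (KeyError, TypeError, ValueError):
            # Missing column, short row, or non-numeric entry: skip the row.
            continue
    return pairs


def pearson_r(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


# With the real export, replace io.StringIO(SAMPLE_CSV) with an opened file.
pairs = load_pairs(io.StringIO(SAMPLE_CSV), "exp_lambda_max_nm", "calc_lambda_max_nm")
r = pearson_r(pairs)
print(f"{len(pairs)} paired records, Pearson r = {r:.3f}")
```

The sketch uses only the standard library, so it carries no assumptions about which third-party packages are installed; the same comparison could be applied to f and ϵ by changing the two column names.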


Funder: US Department of Energy, Office of Science, Office of Basic Energy Sciences, DE-AC02-06CH11357


Keywords: Data Descriptor, /639/638/630, /639/301/1019, /639/638/298/398, /639/301/1034/1037

Journal Title: Scientific Data

Publisher: Nature Publishing Group UK
RCUK | Science and Technology Facilities Council (STFC) (ISIS Facility, ISIS Facility)
Royal Commission for the Exhibition of 1851 (2014 Design Fellowship)
Royal Academy of Engineering (RCSRF1819\7\10)