An open experimental database for exploring inorganic materials

Andriy Zakutayev, Nick Wunder, Marcus Schwarting, John D. Perkins, Robert White, Kristin Munch, William Tumas, Caleb Phillips

Research output: Contribution to journal (Article)

20 Citations (Scopus)

Abstract

The use of advanced machine learning algorithms in experimental materials science is limited by the lack of sufficiently large and diverse datasets amenable to data mining. If publicly open, such data resources would also enable materials research by scientists without access to expensive experimental equipment. Here, we report on our progress towards a publicly open High Throughput Experimental Materials (HTEM) Database (htem.nrel.gov). This database currently contains 140,000 sample entries, characterized by structural (100,000), synthetic (80,000), chemical (70,000), and optoelectronic (50,000) properties of inorganic thin film materials, grouped in >4,000 sample entries across >100 materials systems; more than half of these data are publicly available. This article shows how the HTEM database may enable scientists to explore materials through a web-based user interface and an application programming interface. This paper also describes an HTE approach to generating materials data, and discusses the laboratory information management system (LIMS) that underpins the HTEM database. Finally, this manuscript illustrates how advanced machine learning algorithms can be adopted for materials science problems using this open data resource.

Original language: English
Article number: 180053
Journal: Scientific data
Volume: 5
DOIs: https://doi.org/10.1038/sdata.2018.53
Publication status: Published - Apr 3 2018

Fingerprint

High Throughput
Materials Science
Throughput
Learning algorithms
Learning systems
Learning Algorithm
Machine Learning
Resources
Data base
Information Management
Optoelectronics
Browsing
Application programming interfaces (API)
Optoelectronic devices
Web-based
User Interface
User interfaces
Data mining

ASJC Scopus subject areas

  • Statistics and Probability
  • Information Systems
  • Education
  • Computer Science Applications
  • Statistics, Probability and Uncertainty
  • Library and Information Sciences

Cite this

Zakutayev, A., Wunder, N., Schwarting, M., Perkins, J. D., White, R., Munch, K., ... Phillips, C. (2018). An open experimental database for exploring inorganic materials. Scientific data, 5, [180053]. https://doi.org/10.1038/sdata.2018.53

An open experimental database for exploring inorganic materials. / Zakutayev, Andriy; Wunder, Nick; Schwarting, Marcus; Perkins, John D.; White, Robert; Munch, Kristin; Tumas, William; Phillips, Caleb.

In: Scientific data, Vol. 5, 180053, 03.04.2018.

Zakutayev, A, Wunder, N, Schwarting, M, Perkins, JD, White, R, Munch, K, Tumas, W & Phillips, C 2018, 'An open experimental database for exploring inorganic materials', Scientific data, vol. 5, 180053. https://doi.org/10.1038/sdata.2018.53
Zakutayev A, Wunder N, Schwarting M, Perkins JD, White R, Munch K et al. An open experimental database for exploring inorganic materials. Scientific data. 2018 Apr 3;5. 180053. https://doi.org/10.1038/sdata.2018.53
Zakutayev, Andriy ; Wunder, Nick ; Schwarting, Marcus ; Perkins, John D. ; White, Robert ; Munch, Kristin ; Tumas, William ; Phillips, Caleb. / An open experimental database for exploring inorganic materials. In: Scientific data. 2018 ; Vol. 5.
@article{81e6169c4d9a4b97bb2078099014bedc,
title = "An open experimental database for exploring inorganic materials",
abstract = "The use of advanced machine learning algorithms in experimental materials science is limited by the lack of sufficiently large and diverse datasets amenable to data mining. If publicly open, such data resources would also enable materials research by scientists without access to expensive experimental equipment. Here, we report on our progress towards a publicly open High Throughput Experimental Materials (HTEM) Database (htem.nrel.gov). This database currently contains 140,000 sample entries, characterized by structural (100,000), synthetic (80,000), chemical (70,000), and optoelectronic (50,000) properties of inorganic thin film materials, grouped in >4,000 sample entries across >100 materials systems; more than half of these data are publicly available. This article shows how the HTEM database may enable scientists to explore materials through a web-based user interface and an application programming interface. This paper also describes an HTE approach to generating materials data, and discusses the laboratory information management system (LIMS) that underpins the HTEM database. Finally, this manuscript illustrates how advanced machine learning algorithms can be adopted for materials science problems using this open data resource.",
author = "Andriy Zakutayev and Nick Wunder and Marcus Schwarting and Perkins, {John D.} and Robert White and Kristin Munch and William Tumas and Caleb Phillips",
year = "2018",
month = "4",
day = "3",
doi = "10.1038/sdata.2018.53",
language = "English",
volume = "5",
journal = "Scientific data",
issn = "2052-4463",
publisher = "Nature Publishing Group",

}

TY - JOUR
T1 - An open experimental database for exploring inorganic materials
AU - Zakutayev, Andriy
AU - Wunder, Nick
AU - Schwarting, Marcus
AU - Perkins, John D.
AU - White, Robert
AU - Munch, Kristin
AU - Tumas, William
AU - Phillips, Caleb
PY - 2018/4/3
Y1 - 2018/4/3
AB - The use of advanced machine learning algorithms in experimental materials science is limited by the lack of sufficiently large and diverse datasets amenable to data mining. If publicly open, such data resources would also enable materials research by scientists without access to expensive experimental equipment. Here, we report on our progress towards a publicly open High Throughput Experimental Materials (HTEM) Database (htem.nrel.gov). This database currently contains 140,000 sample entries, characterized by structural (100,000), synthetic (80,000), chemical (70,000), and optoelectronic (50,000) properties of inorganic thin film materials, grouped in >4,000 sample entries across >100 materials systems; more than half of these data are publicly available. This article shows how the HTEM database may enable scientists to explore materials through a web-based user interface and an application programming interface. This paper also describes an HTE approach to generating materials data, and discusses the laboratory information management system (LIMS) that underpins the HTEM database. Finally, this manuscript illustrates how advanced machine learning algorithms can be adopted for materials science problems using this open data resource.
UR - http://www.scopus.com/inward/record.url?scp=85044965649&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85044965649&partnerID=8YFLogxK
U2 - 10.1038/sdata.2018.53
DO - 10.1038/sdata.2018.53
M3 - Article
VL - 5
JO - Scientific data
JF - Scientific data
SN - 2052-4463
M1 - 180053
ER -