Synthesis Genome for Inorganic Materials: Case Oriented Proposal

Abstract

Successes in accelerated materials design, made possible in part through the Materials Genome Initiative, have shifted the bottleneck in materials development towards the synthesis of novel compounds. Existing databases do not contain information about the synthesis recipes necessary to produce compounds. As a result, much of the momentum and efficiency gained in the design process becomes gated by trial-and-error synthesis techniques. This delay in going from a promising materials concept to validation, optimization, and scale-up is a significant burden on the commercialization of novel materials. The proposed research will develop the framework to do for materials synthesis what modern computational methods have done for materials properties: build predictive tools for synthesis so that targeted compounds can be synthesized in a matter of days, rather than months or years. This work will advance techniques required for pattern identification within synthesis methods and their subsequent integration with first-principles approaches. To achieve these objectives, researchers will pursue an innovative approach that leverages documentation of compound synthesis compiled over decades of scientific work, using natural language processing (NLP) techniques to automatically extract synthesis methods from hundreds of thousands of peer-reviewed papers. The outcome will be an unparalleled data set of materials synthesis methods, to be made available to the materials community, which will be mined for patterns and relations to thermochemical data, ultimately allowing for rapid suggestion of synthesis methods for new compounds. This novel approach builds on established synthesis knowledge and combines it with modern data extraction, materials informatics, text mining, and machine learning techniques, and with the availability of high-throughput ab initio thermochemical data.
The novel integration of these different fields will provide a direct route towards more rational design of synthesis methods and thereby significantly accelerate the deployment and testing of new materials concepts. The extracted information will be mined using a novel combination of machine learning tools from the materials informatics community and natural language parsing tools (as the database contains semantic as well as quantitative information). Because [...] (described subsequently) leverages expertise from the NLP perspective and the target material classification leverages expertise from the materials perspective, there is significant intellectual merit in this interdisciplinary approach, a partnership not previously pursued to further materials design. The solution to many of today's challenges depends on being able to analyze large quantities of data (much of it in text) in order to overcome information overload and support effective decision-making. By developing and evaluating new methods of extracting structured information from unstructured text, as well as making use of this information to further materials development, we will enable more capable and accurate tools for mining, pattern analysis, and decision support. This work will provide infrastructure that enables advances in materials to improve, at a much faster rate, the applications for which they were designed. It will also serve as an additional example of using text extraction to further scientific progress, one that can be extended to other applications and fields. This proposal focuses on database development for several applications including metal alloys, solid state electrolytes and sodium ion batteries, and ultra-low resistivity candidates and metal insulator transitions candidate

Document Details

Document Type
DoD Grant Award
Publication Date
Apr 29, 2020
Source ID
N000142012280

Entities

People

  • Elsa A Olivetti

Organizations

  • Massachusetts Institute of Technology
  • Office of Naval Research
  • United States Navy

Tags

Readers

  • Computational Linguistics
  • Distributed Systems and Data Platform Development
  • Nanocomposite Materials Science

Technology Areas

  • AI & ML
  • Microelectronics