In recent years, researchers have produced a wealth of computational tools for designing new materials useful for a range of applications, from energy and electronics to aeronautics and civil engineering. But developing processes for producing those materials has continued to depend on a combination of experience, intuition, and manual literature reviews.
A team of researchers hopes to close this automation gap in materials science with a new artificial intelligence system that will pore through research papers to deduce recipes for producing particular materials. In computer science and engineering, computational materials scientists have made considerable progress on the question of what to make, that is, what kinds of materials can be designed based on desired properties.
The researchers envision a database containing materials recipes extracted from millions of papers. As a step toward that vision, they have developed a machine learning system that can analyze a research paper, determine which of its paragraphs contain materials recipes, and classify the words in those paragraphs according to their roles in the recipes: names of target materials, pieces of equipment, numeric quantities, operating conditions, descriptive adjectives, and the like.
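To make the idea of classifying words by their role concrete, here is a minimal sketch of what the output of such a classifier might look like for one sentence. The role names, the example sentence, and the `group_by_role` helper are all assumptions made for illustration; the source does not describe the researchers' actual label set or data format.

```python
from collections import defaultdict

# Hypothetical role labels, loosely following the categories named in the text.
ROLES = {"TARGET", "EQUIPMENT", "QUANTITY", "CONDITION", "DESCRIPTOR", "OTHER"}

# An invented recipe sentence, labeled by hand word by word for illustration:
# "Porous TiO2 was heated in a furnace under vacuum for 2 hours"
labeled_tokens = [
    ("Porous", "DESCRIPTOR"),
    ("TiO2", "TARGET"),
    ("was", "OTHER"),
    ("heated", "OTHER"),
    ("in", "OTHER"),
    ("a", "OTHER"),
    ("furnace", "EQUIPMENT"),
    ("under", "OTHER"),
    ("vacuum", "CONDITION"),
    ("for", "OTHER"),
    ("2", "QUANTITY"),
    ("hours", "OTHER"),
]


def group_by_role(tokens):
    """Collect words under their role, dropping words labeled OTHER."""
    grouped = defaultdict(list)
    for word, role in tokens:
        if role not in ROLES:
            raise ValueError(f"unknown role: {role}")
        if role != "OTHER":
            grouped[role].append(word)
    return dict(grouped)


recipe = group_by_role(labeled_tokens)
print(recipe)
```

Grouping the labeled words this way yields a small structured record per sentence, which is the kind of entry a recipe database could store.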
The researchers' paper also demonstrates that the system can analyze the extracted data to infer general characteristics of classes of materials, for example, the different physical forms they take when fabrication conditions vary. The researchers trained the system using a combination of supervised and unsupervised machine learning techniques.
“Supervised” means that the training data fed to the system is first annotated by humans; the system then tries to find correlations between the raw data and the annotations. “Unsupervised” means that the training data is unannotated, and the system instead learns to cluster data together according to structural similarities. Both approaches were needed here because materials recipe extraction is a new area of research.
The annotated data set available for this task is pretty small by machine learning standards, so the researchers used an algorithm to extend it. With the help of this algorithm, they were able to expand their training set, since the machine learning system could infer that a label attached to any given word was likely to apply to other words clustered with it.
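The step of spreading labels from annotated words to their cluster neighbors can be sketched as follows. This is only an illustration under stated assumptions: the clusters are written out by hand rather than learned, the words and labels are invented, and resolving a cluster by majority vote in the hypothetical `propagate_labels` function is a choice made for this example, not something the source attributes to the researchers.

```python
from collections import Counter


def propagate_labels(clusters, seed_labels):
    """Extend a small set of human labels to unlabeled words in the same cluster.

    clusters: iterable of sets of words (assumed to come from some
        unsupervised clustering step, not shown here).
    seed_labels: dict mapping a few words to human-assigned labels.
    """
    expanded = dict(seed_labels)
    for cluster in clusters:
        # Count the human labels present in this cluster.
        votes = Counter(seed_labels[w] for w in cluster if w in seed_labels)
        if not votes:
            continue  # no annotated word in this cluster, nothing to infer
        majority_label, _ = votes.most_common(1)[0]
        for word in cluster:
            # Keep human labels as they are; fill in only unlabeled words.
            expanded.setdefault(word, majority_label)
    return expanded


# Invented clusters and seed annotations for illustration.
clusters = [
    {"furnace", "autoclave", "oven"},
    {"TiO2", "ZnO", "BaTiO3"},
    {"stirred", "mixed"},
]
seed_labels = {"furnace": "EQUIPMENT", "TiO2": "TARGET"}

expanded = propagate_labels(clusters, seed_labels)
print(len(seed_labels), "labels expanded to", len(expanded))
```

Two hand-labeled words become six labeled words, while the third cluster stays unlabeled because it contains no annotated word to borrow a label from.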