
matminer is an open-source Python library for data mining and analysis in materials science. It aims to make state-of-the-art statistical and machine learning algorithms applicable to materials science data with just a *few* lines of code. It is still under development, but it is already **working code**.

Citing matminer

We are currently writing a paper on matminer and will update the citation information once it is submitted.

Example notebooks

A few examples demonstrating some of the features available in matminer have been created in the form of Jupyter notebooks:

(Note: the Jupyter (Binder) links below are recommended because GitHub does not render interactive JavaScript code or images.)

1. Get all experimentally measured band gaps of PbTe from Citrine's database: `Jupyter <http:"" repo="" hackingmaterials="" matminer="" notebooks="" example_notebooks="" get_citrine_experimental_bandgaps_pbte.ipynb="">`_ `Github <https:"" hackingmaterials="" matminer="" blob="" master="" example_notebooks="" get_citrine_experimental_bandgaps_pbte.ipynb="">`_

2. Compare and plot experimentally measured band gaps from Citrine with computed values from the Materials Project: `Jupyter <http:"" repo="" hackingmaterials="" matminer="" notebooks="" example_notebooks="" experiment_vs_computed_bandgap.ipynb="">`_ `Github <https:"" hackingmaterials="" matminer="" blob="" master="" example_notebooks="" experiment_vs_computed_bandgap.ipynb="">`_

3. Train and predict band gaps using matminer's tools to retrieve computed band gaps and descriptors from the Materials Project, and composition descriptors from pymatgen: `Jupyter <http:"" repo="" hackingmaterials="" matminer="" notebooks="" example_notebooks="" machine_learning_to_predict_bandgap.ipynb="">`_ `Github <https:"" hackingmaterials="" matminer="" blob="" master="" example_notebooks="" machine_learning_to_predict_bandgap.ipynb="">`_

4. Train and predict bulk moduli using matminer's tools to retrieve computed bulk moduli and descriptors from the Materials Project, and composition descriptors from pymatgen: `Jupyter <http:"" repo="" hackingmaterials="" matminer="" notebooks="" example_notebooks="" machine_learning_to_predict_bulkmodulus.ipynb="">`_ `Github <https:"" hackingmaterials="" matminer="" blob="" master="" example_notebooks="" machine_learning_to_predict_bulkmodulus.ipynb="">`_

You can also use the `Binder <http:""/>`_ service (in beta) to launch an interactive notebook upon a click. Click the button below to open the tree structure of this repository and navigate to the folder **example_notebooks** in the current working directory to use/edit the above notebooks right away! To open/run/edit other notebooks, go to "File->Open" within the page and navigate to the notebook of your choice.

.. image::


There are a couple of quick and easy ways to install matminer:

- **Quick install**

(Note: this may not install the latest changes to matminer. To install the version with the latest commits, skip to the next steps)

For a quick install of matminer, and all of its dependencies, simply run the command in a bash terminal:

.. code-block:: bash

    $ pip install matminer

or, to install matminer in your user $HOME folder, run the command:

.. code-block:: bash

    $ pip install matminer --user

One way to obtain :code:`pip`, if it is not already installed, is through :code:`conda`, which is useful when you work with many Python packages and want separate configuration settings and a separate environment for each. You can then install matminer and the packages it requires in an environment of their own. Some useful links are `here <https:"" eresearch-cookbook="" recipe="" 2014="" 11="" 20="" conda=""/>`_ and `here <http:"" docs="" using="" index.html="">`_.

- **Install in development mode**

To install the full and latest source of the matminer code in development mode, along with its important dependencies, clone the Git repository into a folder of your choosing by entering the following command:

.. code-block:: bash

    $ git clone

and then enter the cloned repository folder and install in development mode:

.. code-block:: bash

    $ cd matminer
    $ python setup.py develop

Depending on how many of the required dependencies were already installed on your system, you will see a few or many warnings, but everything should be installed successfully.
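
One quick way to confirm that either installation route worked is to check whether the package can be imported from the active environment. The snippet below is only a sketch using the Python standard library (Python 3.8 or later); the helper name ``describe_install`` is hypothetical and not part of matminer.

```python
import importlib.util
from importlib import metadata


def describe_install(package_name):
    """Return a short status string for an installed (or missing) package.

    Hypothetical helper for illustration; not part of matminer.
    """
    # find_spec returns None when a top-level package cannot be located
    if importlib.util.find_spec(package_name) is None:
        return f"{package_name} is not importable in this environment"
    try:
        version = metadata.version(package_name)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata (e.g. a stdlib module)
        version = "unknown (no distribution metadata found)"
    return f"{package_name} is importable, version {version}"


print(describe_install("matminer"))
```

If the message reports that matminer is not importable, the install most likely went into a different Python environment from the one running the check.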


Below is a general workflow that shows the different tools and utilities available within matminer, and how they can be combined with each other, as well as with external libraries, in your own materials data mining or analysis study.

.. image::
:align: center

Here's a brief description of the available tools (please find implementation examples in a dedicated section elsewhere in this document):

Data retrieval tools

- Retrieve data from the biggest materials databases, such as the Materials Project and Citrine's databases, in a Pandas dataframe format

The `MPDataRetrieval <https:"" hackingmaterials="" matminer="" blob="" master="" matminer="" data_retrieval="""">`_ and `CitrineDataRetrieval <https:"" hackingmaterials="" matminer="" blob="" master="" matminer="" data_retrieval="""">`_ classes can be used to retrieve data from the biggest open-source materials database collections of the `Materials Project <https:""/>`_ and `Citrine Informatics <https:""/>`_, respectively, in a `Pandas <http:""/>`_ dataframe format. These databases contain a variety of material properties, obtained in-house or from other external databases, that are either calculated, measured in experiments, or learned from trained algorithms. The :code:`get_dataframe` method of these classes performs the retrieval in four steps: it searches the respective database using user-specified filters (such as compound/material or property type), extracts the selected data in a JSON/dictionary format through the API, parses it, and outputs the result as a Pandas dataframe whose columns are the measured or calculated properties/features and whose rows are data points.

For example, to compare experimental and computed band gaps of Si, one can employ the following lines of code:

.. code-block:: python

    from matminer.data_retrieval.retrieve_Citrine import CitrineDataRetrieval
    from matminer.data_retrieval.retrieve_MP import MPDataRetrieval

    # Band gap records for Si from Citrine's database
    df_citrine = CitrineDataRetrieval().get_dataframe(formula='Si', property='band gap')
    # Computed band gap for Si from the Materials Project
    df_mp = MPDataRetrieval().get_dataframe(criteria='Si', properties=['band_gap'])

`MongoDataRetrieval <https:"" hackingmaterials="" matminer="" blob="" master="" matminer="" data_retrieval="""">`_ is another data retrieval tool. It parses any `MongoDB <https:""/>`_ collection (which follows a flexible JSON schema) into a Pandas dataframe with a format similar to the output of the data retrieval tools above. The arguments of its :code:`get_dataframe` method let you use MongoDB's rich and powerful query/aggregation syntax. More information on customizing queries can be found in the `MongoDB documentation <https:"" manual=""/>`_.
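
As a rough illustration of what parsing a flexible JSON collection into a table involves, the sketch below flattens nested documents into dotted column names. It is not matminer's implementation: ``flatten_document``, ``documents_to_table``, and the sample documents (with made-up placeholder values) are all hypothetical, and the real class returns a Pandas dataframe rather than plain lists.

```python
def flatten_document(doc, parent_key="", sep="."):
    """Flatten a nested dict into a single-level dict with dotted keys."""
    flat = {}
    for key, value in doc.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into sub-documents, extending the key prefix
            flat.update(flatten_document(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def documents_to_table(docs):
    """Return (columns, rows); fields missing from a document become None."""
    flat_docs = [flatten_document(d) for d in docs]
    columns = sorted({key for d in flat_docs for key in d})
    rows = [[d.get(col) for col in columns] for d in flat_docs]
    return columns, rows


# Placeholder documents with made-up values, for illustration only
docs = [
    {"formula": "material_A", "output": {"band_gap": 1.0, "energy": -5.0}},
    {"formula": "material_B", "output": {"band_gap": 2.0}},
]
columns, rows = documents_to_table(docs)
print(columns)
for row in rows:
    print(row)
```

Because documents in a collection need not share the same fields, the table takes the union of all keys as its columns and leaves gaps where a document has no value.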

Data descriptor tools

- Decorate the dataframe with composition, structural, and/or band structure descriptors/features

This module of the matminer library provides utilities that describe a material by its composition or structure and represent it in a numeric format, so that the result is readily usable as features in a data analysis study to predict a target value.

The :code:`get_pymatgen_descriptor` function encodes a material's composition using tabulated elemental properties from the `pymatgen <http:"" _modules="" pymatgen="" core="" periodic_table.html="">`_ library. About 50 attributes are available in pymatgen for most elements in the periodic table, including electronegativity, atomic number, atomic mass, sound velocity, and boiling point. The :code:`get_pymatgen_descriptor` function takes as input a material composition and the name of the desired property, and returns a list of floating point property values, one for each atom in that composition. This list can then be fed into a statistical function to obtain a single heuristic quantity representative of the entire composition. The following code block shows a few example descriptors that can be obtained for LiFePO\ :sub:`4`:

.. code-block:: python

    from matminer.descriptors.composition_features import get_pymatgen_descriptor
    import numpy as np

    avg_mass = np.mean(get_pymatgen_descriptor('LiFePO4', 'atomic_mass'))  # Average atomic mass
    std_num = np.std(get_pymatgen_descriptor('LiFePO4', 'Z'))  # Standard deviation of atomic numbers
    range_elect = max(get_pymatgen_descriptor('LiFePO4', 'X')) - \
        min(get_pymatgen_descriptor('LiFePO4', 'X'))  # Maximum difference in electronegativity
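
To make the reduction from a per-atom list to a single descriptor concrete, here is a dependency-free sketch of the same three statistics. It assumes the returned list holds one entry per atom in the formula unit (seven entries for LiFePO4), which is how the description above reads. The small ``ELEMENT_DATA`` table and the ``per_atom_values`` helper are hypothetical stand-ins for pymatgen's tabulated data, not matminer code.

```python
import statistics

# Standard atomic weights (abridged), atomic numbers, and Pauling
# electronegativities for the four elements in LiFePO4.
ELEMENT_DATA = {
    "Li": {"atomic_mass": 6.94, "Z": 3, "X": 0.98},
    "Fe": {"atomic_mass": 55.845, "Z": 26, "X": 1.83},
    "P": {"atomic_mass": 30.974, "Z": 15, "X": 2.19},
    "O": {"atomic_mass": 15.999, "Z": 8, "X": 3.44},
}


def per_atom_values(composition, prop):
    """Return one property value per atom (hypothetical helper).

    composition maps element symbol -> number of atoms in the formula unit.
    """
    values = []
    for element, count in composition.items():
        values.extend([ELEMENT_DATA[element][prop]] * count)
    return values


lifepo4 = {"Li": 1, "Fe": 1, "P": 1, "O": 4}

masses = per_atom_values(lifepo4, "atomic_mass")
avg_mass = statistics.mean(masses)  # average atomic mass
# Population standard deviation, matching numpy's default for np.std
std_num = statistics.pstdev(per_atom_values(lifepo4, "Z"))
electronegativities = per_atom_values(lifepo4, "X")
range_elect = max(electronegativities) - min(electronegativities)

print(avg_mass, std_num, range_elect)
```

Any other reduction (minimum, maximum, median, a weighted mean) can be swapped in the same way, since the per-atom list is an ordinary Python list.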

The function :code:`get_magpie_descriptor` operates in a similar way and obtains its data from the tables accumulated in the `Magpie repository <https:"" wolverton="" magpie="">`_, some of which are sourced from elemental data compiled by Mathematica (more information can be found `here <https:"" language="" ref="" elementdata.html="">`_). Some properties that don't overlap with the pymatgen library include heat capacity, enthalpy of fusion of elements at melting points, pseudopotential radii, etc.

Some other descriptors that can be obtained from matminer include:

#. Composition descriptors

   #. Cohesive energy
   #. Band center

#. Structural descriptors

   #. Packing fraction
   #. Volume per site
   #. Radial and electronic radial distribution functions

#. Band-structure descriptors

   #. Branch point energy
   #. Absolute band positions

#. Mechanical properties

   #. Thermal stress
   #. Fracture toughness
   #. Brittleness index
   #. Critical stress
   #. Bulk/elastic, rigid, and shear moduli
   #. Bulk modulus from coordination number
   #. Vickers hardness
   #. Lamé's first parameter
   #. P-wave modulus
   #. Sound velocity from elastic constants
   #. Steady-state and maximum allowed heat flow
   #. Strain energy release rate

#. Thermal conductivity models

   #. Cahill model
   #. Clarke model
   #. Callaway model
   #. Slack model
   #. Keyes model
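
Several of the mechanical descriptors listed above are linked by textbook relations for an isotropic, linear-elastic solid. The sketch below shows those relations for a given bulk modulus K and shear modulus G. It illustrates the underlying formulas only: the function names are hypothetical rather than matminer's API, and the input values are round numbers chosen for easy checking, not data for any real material.

```python
import math


def isotropic_elastic_quantities(bulk_modulus, shear_modulus):
    """Derive common elastic quantities from K and G (same units in and out)."""
    k, g = bulk_modulus, shear_modulus
    return {
        "youngs_modulus": 9 * k * g / (3 * k + g),
        "lame_first_parameter": k - 2 * g / 3,
        "p_wave_modulus": k + 4 * g / 3,
        "poisson_ratio": (3 * k - 2 * g) / (2 * (3 * k + g)),
    }


def sound_velocities(bulk_modulus, shear_modulus, density):
    """Longitudinal and transverse sound velocities in m/s.

    Expects moduli in Pa and density in kg/m^3.
    """
    p_wave_modulus = bulk_modulus + 4 * shear_modulus / 3
    v_longitudinal = math.sqrt(p_wave_modulus / density)
    v_transverse = math.sqrt(shear_modulus / density)
    return v_longitudinal, v_transverse


# Illustrative inputs: K = 100 GPa, G = 60 GPa, density = 5000 kg/m^3
props = isotropic_elastic_quantities(100.0, 60.0)
v_long, v_trans = sound_velocities(100e9, 60e9, 5000.0)
print(props)
print(v_long, v_trans)
```

For these inputs the relations give a Young's modulus of 150 GPa, a Lamé first parameter of 60 GPa, a P-wave modulus of 180 GPa, a Poisson ratio of 0.25, and a longitudinal velocity of 6000 m/s.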

Download Files

=====================  ======  =========  ============
File name              Size    File type  Upload date
=====================  ======  =========  ============
matminer-0.0.7.tar.gz  8.9 MB  Source     Dec 23, 2016
=====================  ======  =========  ============