pumml

Positive and Unlabeled Materials Machine Learning (pumml) is a Python package that uses semi-supervised positive and unlabeled (PU) machine learning to classify materials when data is incomplete and only examples of "positive" materials are available. For example, pumml was used to predict the "synthesizability" of bulk and 2D materials from "positive" examples of materials that have already been synthesized.

How to cite pumml

If you use pumml in your research, please cite the following work:

Nathan C. Frey, Jin Wang, Gabriel Iván Vega Bellido, Babak Anasori, Yury Gogotsi, and Vivek B. Shenoy. Prediction of Synthesis of 2D Metal Carbides and Nitrides (MXenes) and Their Precursors with Positive and Unlabeled Machine Learning. ACS Nano 2019 13 (3), 3031-3041.

Please also consider citing the original works that establish the underlying methodology of pumml:

Elkan, Charles, and Keith Noto. Learning classifiers from only positive and unlabeled data. Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2008.

Mordelet, F.; Vert, J.-P. A Bagging SVM to Learn from Positive and Unlabeled Examples. Pattern Recognit. Lett. 2014, 37, 201−209.

Getting pumml

The easiest way to get started with pumml is to create a virtual environment with Python 3.6 and then run pip install pumml.

Alternatively, create a virtual environment, clone this repository, and run python setup.py install in the root directory.

Using pumml

An example Jupyter notebook called example_nb.ipynb shows the basic functionality of the package.

About pumml

More information about using PU learning for materials synthesis prediction can be found in our publication: DOI: 10.1021/acsnano.8b08014 https://pubs.acs.org/doi/abs/10.1021/acsnano.8b08014

Helpful PU learning wrappers for scikit-learn can be found at: Alexandre Drouin, pu-learning, 2013, https://github.com/aldro61/pu-learning

In addition to our transductive bagging scheme with decision tree base classifiers, we recommend the robust ensemble of support vector machines (RESVM) method introduced by Claesen et al. RESVM is an alternative PU learning method that provides an excellent benchmark. It is implemented here: Marc Claesen, EnsembleSVM, 2014, https://github.com/claesenm/EnsembleSVM and a Python wrapper is available here: Marc Claesen, resvm, 2014, https://github.com/claesenm/resvm.
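
To make the idea of transductive bagging with decision tree base classifiers more concrete, here is a minimal sketch in the spirit of Mordelet and Vert. It is not pumml's implementation and does not use the pumml API (see example_nb.ipynb for that). It assumes NumPy and scikit-learn are installed; the function name pu_bagging_scores, its parameters, and the synthetic two-dimensional data are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def pu_bagging_scores(X_pos, X_unl, n_rounds=100, seed=0):
    """Average out-of-bag positive-class scores for each unlabeled sample.

    Hypothetical helper for illustration only; not part of pumml.
    """
    rng = np.random.default_rng(seed)
    n_pos, n_unl = len(X_pos), len(X_unl)
    score_sum = np.zeros(n_unl)
    oob_count = np.zeros(n_unl)
    # Labels: 1 for known positives, 0 for provisional negatives.
    y_train = np.concatenate([np.ones(n_pos), np.zeros(n_pos)])

    for _ in range(n_rounds):
        # Bootstrap unlabeled samples and treat them as negatives for this round.
        idx = rng.choice(n_unl, size=n_pos, replace=True)
        X_train = np.vstack([X_pos, X_unl[idx]])
        clf = DecisionTreeClassifier(random_state=int(rng.integers(2**31 - 1)))
        clf.fit(X_train, y_train)

        # Score only the unlabeled samples left out of this bootstrap draw.
        oob = np.setdiff1d(np.arange(n_unl), idx)
        if oob.size == 0:
            continue
        score_sum[oob] += clf.predict_proba(X_unl[oob])[:, 1]
        oob_count[oob] += 1

    # Samples that were never out-of-bag keep a score of 0.
    return score_sum / np.maximum(oob_count, 1)


# Synthetic demo data (assumption): positives cluster near (+2, +2),
# true negatives near (-2, -2).
rng = np.random.default_rng(42)
X_pos = rng.normal(loc=2.0, scale=1.0, size=(50, 2))
hidden_pos = rng.normal(loc=2.0, scale=1.0, size=(50, 2))
true_neg = rng.normal(loc=-2.0, scale=1.0, size=(150, 2))
X_unl = np.vstack([hidden_pos, true_neg])  # first 50 rows are hidden positives

scores = pu_bagging_scores(X_pos, X_unl)
```

Each unlabeled sample ends up with a score between 0 and 1; ranking the unlabeled set by this score gives a list of candidates that most resemble the known positives. Scores are averaged only over rounds in which a sample was left out of training, so a sample is never scored by a tree that saw it labeled as negative.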

License

This code is made available under the MIT License.
