Generative materials benchmarking metrics, inspired by CDVAE.
NOTE: This is a WIP repository (as of 2022-06-21) being developed in parallel with xtal2png. Feedback and contributions welcome!
matbench-genmetrics
This repository provides standardized benchmarks for generative models of crystal structures. Each benchmark has a fixed dataset, a predefined split, and an associated metric that defines what counts as best.
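As a rough illustration of what a metric for a generative model can look like, the sketch below computes two set-based scores over hashable stand-ins for structures: the fraction of generated items that are distinct (uniqueness) and the fraction not seen during training (novelty). The function names and the use of plain strings are assumptions made for illustration. This is not the package's metric implementation, which would need structure matching rather than string equality.

```python
def uniqueness(generated):
    """Fraction of generated items that are distinct from one another."""
    if not generated:
        return 0.0
    return len(set(generated)) / len(generated)


def novelty(generated, training):
    """Fraction of generated items that do not appear in the training set."""
    if not generated:
        return 0.0
    seen = set(training)
    return sum(item not in seen for item in generated) / len(generated)


# Strings stand in for crystal structures (hypothetical toy data).
train = ["NaCl", "MgO", "SiO2"]
gen = ["NaCl", "TiO2", "TiO2", "ZnS"]

print(uniqueness(gen))      # 3 distinct items out of 4
print(novelty(gen, train))  # 3 of 4 items are absent from train
```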
Getting Started
Installation
Create a conda environment with the matbench-genmetrics package installed from the conda-forge channel, then activate the environment.
conda create --name matbench-genmetrics --channel conda-forge python==3.9.* matbench-genmetrics
conda activate matbench-genmetrics
NOTE: It doesn't have to be Python 3.9; you can remove python==3.9.* altogether or change it to, e.g., python==3.8.*. See Advanced Installation.
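After activating the environment, one way to confirm that the installation worked is to check that the modules are importable. The sketch below uses only the standard library; the helper name is_installed is hypothetical, and the two module names are the ones imported under Basic Usage.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


for name in ("matbench_genmetrics", "mp_time_split"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If either module is reported as missing, the active interpreter is probably not the one from the conda environment.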
Basic Usage
from mp_time_split.utils.gen import DummyGenerator
from matbench_genmetrics.core import MPTSMetrics

mptm = MPTSMetrics(dummy=False)
for fold in mptm.folds:
    # Training and validation data for this fold
    train_val_inputs = mptm.get_train_and_val_data(fold)

    # Fit the generator, then generate candidate structures
    dg = DummyGenerator()
    dg.fit(train_val_inputs)
    gen_structures = dg.gen(n=10000)

    # Record the generated structures for this fold
    mptm.record(fold, gen_structures)

print(mptm.recorded_metrics)
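The loop above calls only fit and gen on the generator, so a custom model can presumably be swapped in for DummyGenerator by exposing the same two methods. The class below is a hypothetical stand-in that resamples its training inputs with replacement. Its name, the seed parameter, and the toy string inputs are assumptions, and whether MPTSMetrics.record accepts its output depends on passing real structures rather than strings.

```python
import random


class ResamplingGenerator:
    """Hypothetical generator that resamples its training inputs with replacement."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self._pool = []

    def fit(self, inputs):
        # Store the training inputs; a real model would learn from them.
        self._pool = list(inputs)
        return self

    def gen(self, n=100):
        # Draw n items with replacement from the stored pool.
        if not self._pool:
            raise ValueError("call fit() with at least one input before gen()")
        return [self._rng.choice(self._pool) for _ in range(n)]


# Toy strings stand in for the structures returned by get_train_and_val_data.
rg = ResamplingGenerator(seed=42)
rg.fit(["structure_a", "structure_b", "structure_c"])
samples = rg.gen(n=5)
print(len(samples))
```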
Advanced Installation
In order to set up the necessary environment:
- review and uncomment what you need in environment.yml and create an environment matbench-genmetrics with the help of conda:
  conda env create -f environment.yml
- activate the new environment with:
  conda activate matbench-genmetrics
NOTE: The conda environment will have matbench-genmetrics installed in editable mode. Some changes, e.g. in setup.cfg, might require you to run pip install -e . again.
Optional and needed only once after git clone:
- install several pre-commit git hooks with:
  pre-commit install # You might also want to run `pre-commit autoupdate`
  and check out the configuration under .pre-commit-config.yaml. The -n, --no-verify flag of git commit can be used to deactivate pre-commit hooks temporarily.
- install nbstripout git hooks to remove the output cells of committed notebooks with:
  nbstripout --install --attributes notebooks/.gitattributes
  This is useful to avoid large diffs due to plots in your notebooks. A simple nbstripout --uninstall will revert these changes.
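To make concrete what removing output cells means, the sketch below clears the outputs and execution counts of code cells in an in-memory notebook dictionary. It is a simplified illustration using the standard library, not nbstripout's implementation; the function name strip_outputs and the sample notebook are hypothetical.

```python
import copy
import json


def strip_outputs(notebook):
    """Return a copy of a notebook dict with code-cell outputs cleared."""
    stripped = copy.deepcopy(notebook)
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return stripped


# Minimal in-memory notebook (hypothetical content) in the nbformat 4 layout.
nb = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "source": ["print('hi')"],
            "execution_count": 3,
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["hi\n"]}
            ],
        },
    ],
}

clean = strip_outputs(nb)
print(json.dumps(clean["cells"][1], indent=2))
```

Because plots are stored as large encoded blobs inside the outputs list, clearing that list is what keeps diffs small.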
Then take a look into the scripts and notebooks folders.
Dependency Management & Reproducibility
- Always keep your abstract (unpinned) dependencies updated in environment.yml and eventually in setup.cfg if you want to ship and install your package via pip later on.
- Create concrete dependencies as environment.lock.yml for the exact reproduction of your environment with:
  conda env export -n matbench-genmetrics -f environment.lock.yml
  For multi-OS development, consider using --no-builds during the export.
- Update your current environment with respect to a new environment.lock.yml using:
  conda env update -f environment.lock.yml --prune
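A quick sanity check after exporting is to confirm that every abstract dependency also appears in the lock file. The sketch below does this with a deliberately naive line-based parser rather than a YAML library, so it will misread channel-qualified entries such as channel::package. The function names and the inlined file contents, including the version and build strings, are hypothetical.

```python
import re


def dependency_names(env_text):
    """Collect package names listed under `dependencies:` in a conda env file."""
    names = set()
    in_dependencies = False
    for line in env_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        # A top-level key (no indentation, not a list item) starts a new section.
        if not line[0].isspace() and not stripped.startswith("-"):
            in_dependencies = stripped == "dependencies:"
            continue
        if in_dependencies:
            match = re.match(r"-\s+([A-Za-z0-9_.\-]+)", stripped)
            if match:
                names.add(match.group(1).lower())
    return names


def missing_from_lock(abstract_text, lock_text):
    """Names declared in the abstract file but absent from the lock file."""
    return sorted(dependency_names(abstract_text) - dependency_names(lock_text))


# Hypothetical file contents, inlined so the example runs without conda.
abstract = """\
name: matbench-genmetrics
channels:
  - conda-forge
dependencies:
  - python>=3.8
  - numpy
  - pip
"""
lock = """\
name: matbench-genmetrics
channels:
  - conda-forge
dependencies:
  - python=3.9.13=build_0
  - pip=22.1.2=build_0
prefix: /path/to/env
"""

print(missing_from_lock(abstract, lock))
```

A non-empty result suggests the lock file was exported before the abstract dependencies were last updated.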
Project Organization
├── AUTHORS.md <- List of developers and maintainers.
├── CHANGELOG.md <- Changelog to keep track of new features and fixes.
├── CONTRIBUTING.md <- Guidelines for contributing to this project.
├── Dockerfile <- Build a docker container with `docker build .`.
├── LICENSE.txt <- License as chosen on the command-line.
├── README.md <- The top-level README for developers.
├── configs <- Directory for configurations of model & application.
├── data
│ ├── external <- Data from third party sources.
│ ├── interim <- Intermediate data that has been transformed.
│ ├── processed <- The final, canonical data sets for modeling.
│ └── raw <- The original, immutable data dump.
├── docs <- Directory for Sphinx documentation in rst or md.
├── environment.yml <- The conda environment file for reproducibility.
├── models <- Trained and serialized models, model predictions,
│ or model summaries.
├── notebooks <- Jupyter notebooks. Naming convention is a number (for
│ ordering), the creator's initials and a description,
│ e.g. `1.0-fw-initial-data-exploration`.
├── pyproject.toml <- Build configuration. Don't change! Use `pip install -e .`
│ to install for development or `tox -e build` to build.
├── references <- Data dictionaries, manuals, and all other materials.
├── reports <- Generated analysis as HTML, PDF, LaTeX, etc.
│ └── figures <- Generated plots and figures for reports.
├── scripts <- Analysis and production scripts which import the
│ actual matbench_genmetrics package, e.g. train_model.
├── setup.cfg <- Declarative configuration of your project.
├── setup.py <- [DEPRECATED] Use `python setup.py develop` to install for
│ development or `python setup.py bdist_wheel` to build.
├── src
│ └── matbench_genmetrics <- Actual Python package where the main functionality goes.
├── tests <- Unit tests which can be run with `pytest`.
├── .coveragerc <- Configuration for coverage reports of unit tests.
├── .isort.cfg <- Configuration for git hook that sorts imports.
└── .pre-commit-config.yaml <- Configuration of pre-commit git hooks.
Note
This project has been set up using PyScaffold 4.2.2.post1.dev2+ge50b5e1 and the dsproject extension 0.7.2.post1.dev2+geb5d6b6.
Hashes for matbench-genmetrics-0.2.0.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 231a4e127f9f9be8295eb9c4b863d07d453f4ed35ae5f58e34ef2c28b820917f
MD5 | 3c04fc8af72b2b791429a05a87d48187
BLAKE2b-256 | 9f651c3f5bbe20f784cf7c7b8e4f883a73daa0e0464636c16a35a2d355c0f8e6
Hashes for matbench_genmetrics-0.2.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 288c3f548149edc78a66797dc3093b82f4d8c1bc8e12fb58602ee37f330200a8
MD5 | 68af27a0db44b3af6bb86a35e49ee2a6
BLAKE2b-256 | 70cec3ff1fd3a9b8befd714031aa16e86b0630459437f1516aaa62afcc18ff8e
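A downloaded file can be checked against the published SHA256 digest with the standard library. The sketch below reads the file in chunks and compares hex digests; the helper names and the local path are assumptions, so adjust the path to wherever the archive was saved.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published value, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# The path is an assumption: point it at the downloaded source distribution.
archive = Path("matbench-genmetrics-0.2.0.tar.gz")
expected = "231a4e127f9f9be8295eb9c4b863d07d453f4ed35ae5f58e34ef2c28b820917f"
if archive.exists():
    print("match" if matches_published_hash(archive, expected) else "MISMATCH")
else:
    print(f"{archive} not found; download it first")
```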