nomad-ml-workflows
A NOMAD plugin for managing ML workflows. Currently, it provides an action to export a large number of entries from the NOMAD database as tabular data files. Other ML-workflow-related actions and schemas will be added in the future.
📦 Installation
You can install the plugin using pip:
pip install "nomad-ml-workflows @ git+https://github.com/FAIRmat-NFDI/nomad-ml-workflows.git"
However, to fully utilize the plugin, you need to add it to your NOMAD instance as described below.
✨ Features
- Export a large number of NOMAD entries as tabular data files (CSV, Parquet) using NOMAD Actions. Once the action is triggered, it will:
- Search entries based on user-defined criteria.
- Optionally include or exclude data fields from the entries.
- Package the entries into tabular data files like CSV or Parquet (or as JSON).
- Export the files to a specified Project (previously known as an Upload) in NOMAD.
These can then be downloaded from the NOMAD web interface for local use.
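The packaging step above can be pictured with a small standalone sketch. It is not the plugin's implementation: the helper names (`flatten`, `select_fields`, `to_csv`), the glob-style include/exclude patterns, and the sample field names are all assumptions made for illustration. It shows one way nested entry data could be flattened into dotted column names, filtered, and written as CSV.

```python
import csv
import io
from fnmatch import fnmatchcase


def flatten(entry, prefix=''):
    """Flatten nested dicts into one dict keyed by dotted paths."""
    flat = {}
    for key, value in entry.items():
        path = f'{prefix}.{key}' if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def select_fields(row, include=('*',), exclude=()):
    """Keep columns that match an include pattern and no exclude pattern."""
    return {
        key: value
        for key, value in row.items()
        if any(fnmatchcase(key, pattern) for pattern in include)
        and not any(fnmatchcase(key, pattern) for pattern in exclude)
    }


def to_csv(entries, include=('*',), exclude=()):
    """Return the entries as CSV text, one row per entry."""
    rows = [select_fields(flatten(e), include, exclude) for e in entries]
    # Union of all column names, in first-seen order.
    columns = list(dict.fromkeys(key for row in rows for key in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval='')
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample entries; real NOMAD entries have a richer schema.
entries = [
    {'entry_id': 'a1', 'results': {'material': {'formula': 'H2O'}, 'n_atoms': 3}},
    {'entry_id': 'b2', 'results': {'material': {'formula': 'NaCl'}}},
]
csv_text = to_csv(entries, exclude=('results.n_atoms',))
print(csv_text)
```

Entries that lack a column are written with an empty cell, which keeps rows aligned when the searched entries do not all share the same fields.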
⚙️ Configuration
The Export Entries action can be configured using the following parameters in
the nomad.yaml configuration file of your NOMAD Oasis instance:
plugins:
  entry_points:
    options:
      nomad_ml_workflows.actions:export_entries:
        search_batch_timeout: 7200
        # Timeout (in seconds) for each search batch in the Export Entries
        # action. Set this accordingly to time out longer searches.
        max_entries_export_limit: 100000
        # Maximum number of entries that can be exported in a single
        # Export Entries action.
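To make the two parameters more concrete, here is a minimal sketch of a batched search loop with a per-batch timeout and an overall entry cap. It is an illustration only: `collect_entries` and `fetch_batch` are hypothetical names, and how the plugin actually applies the limits (for example, whether it truncates or rejects a request above the cap) is not stated here. This sketch truncates.

```python
from concurrent.futures import ThreadPoolExecutor


def collect_entries(fetch_batch, max_entries=100000, batch_timeout=7200):
    """Collect entries batch by batch.

    fetch_batch(cursor) is a hypothetical callable returning
    (entries, next_cursor), with next_cursor set to None after the last batch.
    """
    collected = []
    cursor = None
    with ThreadPoolExecutor(max_workers=1) as pool:
        while len(collected) < max_entries:
            future = pool.submit(fetch_batch, cursor)
            # Raises a timeout error if the batch takes longer than
            # batch_timeout seconds (cf. search_batch_timeout).
            batch, cursor = future.result(timeout=batch_timeout)
            # Never keep more than the cap (cf. max_entries_export_limit).
            collected.extend(batch[: max_entries - len(collected)])
            if cursor is None:
                break
    return collected


def fake_fetch(cursor):
    """Stand-in search that serves the integers 0..9 in batches of four."""
    start = cursor or 0
    stop = min(start + 4, 10)
    next_cursor = stop if stop < 10 else None
    return list(range(start, stop)), next_cursor


exported = collect_entries(fake_fetch, max_entries=6, batch_timeout=5)
print(exported)
```

In this sketch a larger `max_entries` simply lets the loop run until the search is exhausted, while a smaller `batch_timeout` makes slow batches fail earlier.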
🚀 Adding this plugin to NOMAD
Currently, NOMAD has two distinct flavors that are relevant depending on your role as a user:
- A NOMAD Oasis: any user with a NOMAD Oasis instance.
- Local NOMAD installation and the source code of NOMAD: internal developers.
Adding this plugin in your NOMAD Oasis
Read the NOMAD plugin documentation for all details on how to deploy the plugin on your NOMAD instance.
Adding this plugin in your local NOMAD installation and the source code of NOMAD
We now recommend using the dedicated nomad-distro-dev repository to simplify the process. Please refer to that repository for detailed instructions.
🛠️ Development
If you want to develop this plugin locally, clone the project and create a virtual environment in the plugin folder (you can use Python 3.10, 3.11, or 3.12):
git clone https://github.com/FAIRmat-NFDI/nomad-ml-workflows.git
cd nomad-ml-workflows
python3.11 -m venv .pyenv
. .pyenv/bin/activate
Make sure to have pip upgraded:
pip install --upgrade pip
We recommend installing uv for fast pip installation of the packages:
pip install uv
Install the plugin in editable mode together with its development dependencies:
uv pip install -e '.[dev]'
Run linting and auto-formatting
We use Ruff for linting and formatting the code. Ruff auto-formatting is also part of the GitHub workflow actions. You can run both checks locally:
ruff check .
ruff format . --check
Debugging
For interactive debugging of the tests, use pytest with the --pdb flag. We recommend using an IDE for debugging, e.g., VSCode. If that is the case, add the following snippet to your .vscode/launch.json:
{
  "configurations": [
    {
      "name": "<descriptive tag>",
      "type": "debugpy",
      "request": "launch",
      "cwd": "${workspaceFolder}",
      "program": "${workspaceFolder}/.pyenv/bin/pytest",
      "justMyCode": true,
      "env": {
        "_PYTEST_RAISE": "1"
      },
      "args": [
        "-sv",
        "--pdb",
        "<path-to-plugin-tests>"
      ]
    }
  ]
}
where <path-to-plugin-tests> must be changed to the local path to the test module to be debugged.
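The `_PYTEST_RAISE` variable in the configuration above is, as far as we know, not read by pytest itself. It is a common convention in which `conftest.py` checks the variable and, when it is set, re-raises test failures so that the IDE debugger stops at the original exception. Whether this repository's `conftest.py` already contains such hooks is not confirmed here; a minimal sketch of the convention looks like this:

```python
import os

import pytest

# Only install the hooks when the launch configuration sets _PYTEST_RAISE.
if os.getenv('_PYTEST_RAISE', '0') != '0':

    @pytest.hookimpl(tryfirst=True)
    def pytest_exception_interact(call):
        # Re-raise the exception from the failing test so the debugger
        # can break at the point of failure.
        raise call.excinfo.value

    @pytest.hookimpl(tryfirst=True)
    def pytest_internalerror(excinfo):
        # Same for errors raised inside pytest itself.
        raise excinfo.value
```

With the variable unset (for example, in CI), the hooks are not defined and pytest reports failures as usual.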
The settings configuration file .vscode/settings.json automatically applies the linting and formatting upon saving the modified file.
Documentation on GitHub Pages
To view the documentation locally, install the related packages using:
uv pip install -r requirements_docs.txt
Run the documentation server:
mkdocs serve
👥 Main contributors
| Name | E-mail |
|---|---|
| Sarthak Kapoor | sarthak.kapoor@physik.hu-berlin.de |
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.