Tabular classification benchmarking toolkit for model selection, repeated stratified cross-validation, final model export, and artifact-based inference.

MELITE: Multi-model Evaluation and Learning for Inference-ready Tabular Experiments

MELITE is a pre-stable Python toolkit for tabular classification benchmarking, model selection, repeated stratified cross-validation, final model export, and artifact-based inference.

MELITE is tabular at the modeling level. The learning algorithms consume numeric X and y arrays, so the feature matrix may come from PCA, UMAP, fingerprints, descriptors, clinical variables, experimental measurements, industrial features, or manually selected numeric features.

Project Identity

Project: MELITE
PyPI distribution: melite
Import package: melite
CLI: melite
Version: 0.2.3
License: LGPL-3.0-or-later
Status: alpha / pre-stable

Documentation

The live documentation is published at:

https://nanobiostructuresrg.github.io/melite/

Installation

From PyPI:

python -m pip install melite

For local development:

git clone https://github.com/NanoBiostructuresRG/melite.git
cd melite
python -m pip install -e .

For development and documentation tools:

python -m pip install -e ".[dev]"
python -m pip install -e ".[docs]"

Quick Start

Run a fast smoke benchmark with the bundled synthetic example dataset:

melite run --smoke --config examples/example_config.toml

Export a selected model artifact:

melite export --row 0 --csv examples/output/results.csv --outdir examples/output/

Run artifact-based inference:

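import numpy as np
from melite import predict

# Load the feature matrix stored under the "X" key of the example .npz file.
X_new = np.load("examples/sample_PCA70.npz")["X"]
# Run inference with a model artifact written by `melite export`.
result = predict("examples/output/Model_SVC_sample_pca70.pkl", X_new)
print(result["predictions"])
print(result["probabilities"])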

Scope

MELITE does	MELITE does not
Accept prepared `X` and `y` arrays.	Generate fingerprints.
Benchmark SVC, Random Forest, XGBoost, and opt-in experimental stacking classifiers.	Process SMILES.
Select the best row by F1-macro.	Generate PCA or UMAP reductions from raw data.
Export a final retrained `.pkl` model.	Act as a general AutoML framework.
Run artifact-based inference through `predict()`.	Promise a stable 1.0 API yet.
Handle any numeric tabular matrix.	Generate or validate domain-specific descriptors.
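
The benchmarking and selection idea in the table above can be illustrated with plain scikit-learn. This is a minimal sketch, not MELITE's internal code: the synthetic data, the two candidate models, their hyperparameters, and the split and repeat counts are all assumptions chosen to keep the example small.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a prepared numeric X / y pair (assumption).
X, y = make_classification(n_samples=120, n_features=10, random_state=0)

# Illustrative candidates only; not MELITE's actual model settings.
candidates = {
    "svc": make_pipeline(StandardScaler(), SVC()),
    "rf": RandomForestClassifier(n_estimators=50, random_state=0),
}

# 3 stratified splits repeated 2 times gives 6 scores per candidate.
cv = RepeatedStratifiedKFold(n_splits=3, n_repeats=2, random_state=0)

# Mean macro-averaged F1 per candidate.
mean_f1 = {
    name: float(cross_val_score(model, X, y, cv=cv, scoring="f1_macro").mean())
    for name, model in candidates.items()
}

# Pick the candidate with the highest mean F1-macro.
best = max(mean_f1, key=mean_f1.get)
print(best, mean_f1)
```

Macro averaging weights every class equally, which is why it is a common selection metric when classes are imbalanced.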

Datasets are registered as concrete tabular matrix candidates under [datasets.<dataset_id>]. The dataset_id is user-defined and is used in results.csv, figures, and exported model filenames.

[datasets.morgan_r2_2048]
path = "data/morgan_r2_2048.npz"
label_path = "raw/labels.npy"
family = "fingerprints"
method = "Morgan"
variant = "r2_2048"

[datasets.rdkit_descriptors]
path = "data/rdkit_descriptors.npz"
label_path = "raw/labels.npy"
family = "descriptors"
method = "RDKit"

[datasets.pca85]
path = "data/PCA85.npz"
label_path = "raw/labels.npy"
family = "dimensionality"
method = "PCA"
level = 85

Each registered dataset must define path and label_path. Optional metadata fields are family, method, variant, level, and description; they are reported for traceability and do not drive special-case model execution. Registered datasets are loaded strictly: missing files, missing X, non-2D or non-numeric X, length mismatches, and embedded y mismatches fail the run. Legacy [benchmark].reduction_types and levels configs are still accepted and are normalized into equivalent dataset entries such as PCA70 and UMAP90.

Model families are controlled by [models].active:

[models]
active = ["svc", "rf", "xgb"]

Remove a key to skip that family during training. Valid keys are svc, rf, xgb, and the experimental stack. Stacking is opt-in; add "stack" to active to evaluate an sklearn StackingClassifier alongside the default families.

Standalone SVC is trained and exported as a StandardScaler -> SVC sklearn pipeline because SVM/kernel-based methods are sensitive to feature scale. Random Forest and XGBoost are tree-based models and remain unscaled by default.

Experimental stacking uses stack_method="predict_proba" with a scaled probabilistic SVC base estimator, unscaled RF/XGBoost base estimators, and a logistic regression final estimator. Its internal stacking CV uses the configured split count and random state with one repeat to satisfy sklearn's out-of-fold prediction requirements.

Final exports remain .pkl artifacts serialized with joblib; Optuna and MLflow are not part of v0.2.3.
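
The scaling, stacking, and export arrangement described above can be approximated with scikit-learn and joblib as follows. This is a sketch under stated assumptions rather than MELITE's implementation: the XGBoost base estimator is left out to avoid an extra dependency, the data is synthetic, the hyperparameters and split count are illustrative, and the file name `Model_stack_demo.pkl` is made up for the example.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic binary classification data (assumption).
X, y = make_classification(n_samples=120, n_features=10, random_state=0)

# Scaled probabilistic SVC: the scaler is fitted inside the pipeline,
# so it is refit on each training fold and saved with the model.
scaled_svc = make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))

stack = StackingClassifier(
    estimators=[
        ("svc", scaled_svc),
        # Tree-based estimator, left unscaled.
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ],
    final_estimator=LogisticRegression(),
    stack_method="predict_proba",
    # A single stratified pass: each sample gets exactly one out-of-fold prediction.
    cv=StratifiedKFold(n_splits=3, shuffle=True, random_state=0),
)
stack.fit(X, y)

# Round-trip the fitted model through a joblib-serialized .pkl file.
with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "Model_stack_demo.pkl"
    joblib.dump(stack, artifact)
    restored = joblib.load(artifact)

proba = restored.predict_proba(X[:5])
print(proba.shape)  # (5, 2): five samples, two classes
```

A non-repeated splitter is used for the inner CV because scikit-learn builds the final estimator's training features from out-of-fold predictions, which requires every sample to appear in exactly one test fold.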

CLI

melite --help
melite run --help
melite export --help
melite --version

Common commands:

melite run
melite run --smoke
melite run --config my_config.toml
melite export --row 0
melite export --config my_config.toml --row 0
melite export --row 0 --force

Public API

from melite import Config
from melite import load_datasets
from melite import plot_cv_distributions
from melite import predict
from melite import __version__

Modules not listed above are importable directly but are not part of the public contract and may change before 1.0.

Input Format

raw/labels.npy          <- target vector y, shape (n_samples,)
data/morgan_r2_2048.npz <- required key: X, optional key: y
data/rdkit_descriptors.npz
data/PCA85.npz
data/UMAP90.npz

Each .npz file must contain an X array. If an embedded y array is present, MELITE validates it against the configured label_path.
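
The checks described for registered datasets can be sketched with NumPy alone. The function `load_dataset` below is a hypothetical illustration, not MELITE's loader; the error messages and the demo arrays are assumptions.

```python
import tempfile
from pathlib import Path

import numpy as np


def load_dataset(npz_path, label_path):
    """Hypothetical strict loader: return (X, y) or raise on invalid input."""
    # np.load raises FileNotFoundError if either file is missing.
    y = np.load(label_path)
    with np.load(npz_path) as archive:
        if "X" not in archive.files:
            raise ValueError(f"{npz_path}: missing required key 'X'")
        X = archive["X"]
        embedded_y = archive["y"] if "y" in archive.files else None

    if X.ndim != 2:
        raise ValueError(f"{npz_path}: X must be 2D, got {X.ndim}D")
    if not np.issubdtype(X.dtype, np.number):
        raise ValueError(f"{npz_path}: X must be numeric, got {X.dtype}")
    if X.shape[0] != y.shape[0]:
        raise ValueError(f"{npz_path}: X has {X.shape[0]} rows, y has {y.shape[0]}")
    if embedded_y is not None and not np.array_equal(embedded_y, y):
        raise ValueError(f"{npz_path}: embedded y differs from {label_path}")
    return X, y


# Demo with throwaway files so the sketch runs on its own.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    X_demo = np.random.default_rng(0).normal(size=(6, 3))
    y_demo = np.array([0, 1, 0, 1, 0, 1])
    np.save(root / "labels.npy", y_demo)
    np.savez(root / "demo.npz", X=X_demo, y=y_demo)
    X, y = load_dataset(root / "demo.npz", root / "labels.npy")

print(X.shape, y.shape)  # (6, 3) (6,)
```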

Outputs

output/
|-- results.txt
|-- results.csv
|-- Model_<model>_<dataset>.pkl
`-- figures/
    `-- <model>_<dataset>.png

Local inputs and generated artifacts such as raw/, data/, output/, .pkl, and .joblib files are intentionally ignored by Git.

Validation

The current dev/v0.2.3 branch is expected to pass the following checks:

python -m pytest tests/ -v --basetemp=.review_pytest_tmp -o cache_dir=.review_pytest_cache
mkdocs build --strict
python -m build --no-isolation
python -m twine check dist/*
python scripts/smoke_install_wheel.py
melite --help
melite run --help
melite export --help
melite --version

Citation

If you use MELITE in your research, please cite it using the metadata in CITATION.cff.

Contreras-Torres, F. F., & Murrieta, A. C. (2026). MELITE: Multi-model Evaluation and Learning for Inference-ready Tabular Experiments. Zenodo. https://doi.org/10.5281/zenodo.20382752

Authors

Developed by Flavio F. Contreras-Torres, Tecnologico de Monterrey.

Co-author: Ana C. Murrieta, Tecnologico de Monterrey.

License

This project is licensed under the terms of the GNU Lesser General Public License v3.0 or later.

SPDX identifier: LGPL-3.0-or-later

Release history

0.2.3 (this version): Jun 1, 2026
0.2.2: May 28, 2026
0.2.1: May 28, 2026
0.2.0: May 27, 2026
0.1.11: May 26, 2026

Download files

Source Distribution

melite-0.2.3.tar.gz (57.7 kB)

Uploaded Jun 1, 2026 Source

Built Distribution

melite-0.2.3-py3-none-any.whl (45.8 kB)

Uploaded Jun 1, 2026 Python 3

File details

Details for the file melite-0.2.3.tar.gz.

File metadata

Download URL: melite-0.2.3.tar.gz
Upload date: Jun 1, 2026
Size: 57.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for melite-0.2.3.tar.gz
Algorithm	Hash digest
SHA256	`5051689ed71fa0711d197f6f18db1b181a9d6275c0a8ef18b6605546f22c107e`
MD5	`07f93b2fddab4b4464757107cc0c38d0`
BLAKE2b-256	`c72fc226be75575f164248d17c6ff95f97a77154b92d13d681738fc6ae5a04d1`
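
A downloaded file can be checked against the published SHA256 digest with the standard library. The sketch below is illustrative: the helper `sha256_of` is hypothetical, and the demo hashes a throwaway file so that it runs without the actual download. To verify the real archive, call `sha256_of` on the downloaded path and compare the result with `EXPECTED`.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Published digest for melite-0.2.3.tar.gz, copied from the table above.
EXPECTED = "5051689ed71fa0711d197f6f18db1b181a9d6275c0a8ef18b6605546f22c107e"

# Demo on a temporary file; replace with the path of the downloaded archive.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"melite")
    demo_digest = sha256_of(sample)

# The chunked digest matches hashing the same bytes in one call.
print(demo_digest == hashlib.sha256(b"melite").hexdigest())  # True
```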

Provenance

The following attestation bundles were made for melite-0.2.3.tar.gz:

Publisher: publish-to-pypi.yml on NanoBiostructuresRG/melite

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: melite-0.2.3.tar.gz
- Subject digest: 5051689ed71fa0711d197f6f18db1b181a9d6275c0a8ef18b6605546f22c107e
- Sigstore transparency entry: 1695282056
- Sigstore integration time: Jun 1, 2026

Source repository:
- Permalink: NanoBiostructuresRG/melite@9c11ebded2f92d43de10f990d829185651e4810f
- Branch / Tag: refs/tags/v0.2.3
- Owner: https://github.com/NanoBiostructuresRG
- Access: public

Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-to-pypi.yml@9c11ebded2f92d43de10f990d829185651e4810f
- Trigger Event: workflow_dispatch

File details

Details for the file melite-0.2.3-py3-none-any.whl.

File metadata

Download URL: melite-0.2.3-py3-none-any.whl
Upload date: Jun 1, 2026
Size: 45.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for melite-0.2.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`4a3c65da916d5d902dbcd5f4ee0698524285974dd20d188e05d0c524e58850a8`
MD5	`0fbc5d01d2c58fbf35b00f90fdc936a8`
BLAKE2b-256	`b74fb2fd909f27923d4ff3dde8dd098713c4037d5b191ad8749e4b2c6d5003c0`

Provenance

The following attestation bundles were made for melite-0.2.3-py3-none-any.whl:

Publisher: publish-to-pypi.yml on NanoBiostructuresRG/melite

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: melite-0.2.3-py3-none-any.whl
- Subject digest: 4a3c65da916d5d902dbcd5f4ee0698524285974dd20d188e05d0c524e58850a8
- Sigstore transparency entry: 1695282151
- Sigstore integration time: Jun 1, 2026

Source repository:
- Permalink: NanoBiostructuresRG/melite@9c11ebded2f92d43de10f990d829185651e4810f
- Branch / Tag: refs/tags/v0.2.3
- Owner: https://github.com/NanoBiostructuresRG
- Access: public

Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-to-pypi.yml@9c11ebded2f92d43de10f990d829185651e4810f
- Trigger Event: workflow_dispatch
