Machine Automated Modelling and Utility Toolkit

MAMUT

Machine Automated Modelling and Utility Toolkit for tabular classification.

Overview

MAMUT is a Python toolkit that automates model selection and evaluation for classification tasks on tabular data. It bundles preprocessing, Optuna-driven hyperparameter optimization, model comparison, and reporting into a single workflow built on scikit-learn and XGBoost.

Key Features

  • End-to-end preprocessing: missing values, categorical encoding, skew correction, scaling, outlier filtering, imbalance handling (SMOTE/undersampling/SMOTETomek), optional feature selection, and PCA.
  • Model search across common classifiers (LogisticRegression, RandomForestClassifier, SVC, XGBClassifier, MLPClassifier, GaussianNB, KNeighborsClassifier).
  • Hyperparameter optimization with Optuna (TPE/Bayesian or random search).
  • Report generation via evaluate() with metrics, plots, and SHAP explanations.
  • Saved artifacts: fit() stores fitted models; evaluate() writes an HTML report and plots to disk.
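To give a feel for what the preprocessing features above involve, here is a minimal sketch written with plain scikit-learn. It is not MAMUT's internal pipeline: the toy data, column names, and choice of steps (median imputation, scaling, one-hot encoding, PCA, logistic regression) are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy frame (hypothetical data) with one missing value and one categorical column.
X_toy = pd.DataFrame(
    {
        "age": [22.0, 35.0, np.nan, 41.0, 29.0, 52.0],
        "income": [30.0, 52.0, 47.0, 80.0, 39.0, 95.0],
        "city": ["A", "B", "A", "C", "B", "C"],
    }
)
y_toy = pd.Series([0, 0, 0, 1, 0, 1])

# Numeric columns: fill missing values with the median, then standardize.
numeric_steps = Pipeline(
    [
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]
)

# Route numeric and categorical columns through separate transformers.
# sparse_threshold=0.0 forces a dense array so PCA can consume it.
preprocess = ColumnTransformer(
    [
        ("num", numeric_steps, ["age", "income"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),
    ],
    sparse_threshold=0.0,
)

# Full candidate pipeline: preprocessing, dimensionality reduction, classifier.
candidate = Pipeline(
    [
        ("preprocess", preprocess),
        ("pca", PCA(n_components=2)),
        ("clf", LogisticRegression(max_iter=1000)),
    ]
)
candidate.fit(X_toy, y_toy)
toy_preds = candidate.predict(X_toy)
```

A toolkit like MAMUT assembles and tunes pipelines of this general shape, so you do not have to write them by hand.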

Installation

Python 3.12 is the target runtime (see .python-version).

From PyPI:

pip install mamut

From source:

pip install -e .

For development with Poetry:

poetry install

Quickstart

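from sklearn.datasets import load_iris
from mamut.wrapper import Mamut

# Load features as a DataFrame and the target as a Series.
X, y = load_iris(as_frame=True, return_X_y=True)

# Run 5 optimization iterations using Bayesian (TPE) search.
mamut = Mamut(n_iterations=5, optimization_method="bayes")
mamut.fit(X, y)

# Predict class labels and class probabilities.
preds = mamut.predict(X)
proba = mamut.predict_proba(X)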
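The outputs of predict() and predict_proba() can be scored with standard scikit-learn metrics. The sketch below uses a hypothetical helper, summarize(), and a plain LogisticRegression as a stand-in so that it runs without MAMUT installed. It assumes only that the model object exposes predict and predict_proba, as the Quickstart shows for Mamut.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, balanced_accuracy_score, roc_auc_score


def summarize(model, X, y):
    """Hypothetical helper: score any model exposing predict/predict_proba."""
    preds = model.predict(X)
    proba = model.predict_proba(X)
    return {
        "accuracy": accuracy_score(y, preds),
        "balanced_accuracy": balanced_accuracy_score(y, preds),
        # One-vs-rest AUC, needed because iris has three classes.
        "roc_auc_ovr": roc_auc_score(y, proba, multi_class="ovr"),
    }


X, y = load_iris(as_frame=True, return_X_y=True)

# Stand-in for a fitted Mamut instance (assumption made for this sketch).
stand_in = LogisticRegression(max_iter=1000).fit(X, y)
scores = summarize(stand_in, X, y)
```

Scores computed on the same rows used for fitting are optimistic; held-out data gives a fairer estimate.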

Configuration Notes

  • With preprocessing enabled (default), pass X as a pandas DataFrame and y as a Series.
  • Targets must be categorical (float targets raise a ValueError).
  • fit() performs a stratified 80/20 train/test split controlled by random_state.
  • Select the optimization strategy with optimization_method="bayes" or "random_search".
  • Control the search budget with n_iterations.
  • Exclude models by class name (e.g., exclude_models=["SVC"]).
  • Preprocessing options are passed directly into Mamut(...) (e.g., pca=True, feature_selection=True, num_imputation="knn").
  • score_metric expects one of: accuracy, precision, recall, f1, balanced_accuracy, jaccard, roc_auc_score.
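Two of the notes above, the categorical-target requirement and the stratified 80/20 split, can be illustrated with pandas and scikit-learn alone. This is a sketch, not MAMUT's own code: to_class_labels is a hypothetical helper, and the train_test_split call only demonstrates what a stratified 80/20 split looks like, not how fit() implements it.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(as_frame=True, return_X_y=True)

# Simulate class labels that arrived as floats (e.g. after a CSV round trip).
y_float = y.astype("float64")


def to_class_labels(target: pd.Series) -> pd.Series:
    """Hypothetical helper: cast whole-number float labels to integers."""
    if pd.api.types.is_float_dtype(target):
        if target.isna().any():
            raise ValueError("target contains missing values")
        if not (target % 1 == 0).all():
            raise ValueError("target looks continuous, not categorical")
        return target.astype("int64")
    return target


y_clean = to_class_labels(y_float)

# Stratified 80/20 split: each class keeps its share in both partitions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y_clean, test_size=0.2, stratify=y_clean, random_state=42
)

# Class counts in the held-out 20 percent.
test_counts = y_test.value_counts().sort_index().tolist()
```

With 150 iris rows and 50 rows per class, stratification leaves 120 training rows and 30 test rows, 10 per class.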

Outputs and Reports

  • mamut.best_model_: best performing pipeline after fit.
  • mamut.training_summary_: per-model scores and timings.
  • mamut.optuna_studies_: Optuna studies keyed by model name.
  • mamut.evaluate(): writes an HTML report to ./mamut_report/report_<timestamp>.html and stores plots in ./mamut_report/plots/.
  • mamut.save_best_model(path): writes the best model as a .joblib file into path, which must be an existing directory.
  • fit() saves all fitted models to ./fitted_models/<timestamp>/ as .joblib files.
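Because the saved models are .joblib files, they can presumably be reloaded with joblib.load. The sketch below shows the round trip with a stand-in scikit-learn estimator. The file name best_model.joblib is hypothetical, since the names MAMUT actually writes are not documented here.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(as_frame=True, return_X_y=True)

# Stand-in for a fitted pipeline (assumption made for this sketch).
stand_in = LogisticRegression(max_iter=1000).fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "best_model.joblib"  # hypothetical file name
    joblib.dump(stand_in, model_path)   # persist the fitted estimator
    restored = joblib.load(model_path)  # reload it from disk

# The restored estimator should reproduce the original predictions.
same_predictions = bool((restored.predict(X) == stand_in.predict(X)).all())
```

joblib files are pickle-based, so load them only from sources you trust, and preferably with the same library versions used to save them.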

Development

poetry run pytest
poetry run pre-commit run --all-files
make -C docs html

Examples and Docs

License

MIT. See LICENSE.

