Machine Automated Modelling and Utility Toolkit

MAMUT

Machine Automated Modelling and Utility Toolkit for tabular classification.

Overview

MAMUT is a Python toolkit that automates model selection and evaluation for classification tasks on tabular data. It bundles preprocessing, Optuna-driven hyperparameter optimization, model comparison, and reporting into a single workflow built on scikit-learn and XGBoost.

Key Features

  • End-to-end preprocessing: missing values, categorical encoding, skew correction, scaling, outlier filtering, imbalance handling (SMOTE/undersampling/SMOTETomek), optional feature selection, and PCA.
  • Model search across common classifiers (LogisticRegression, RandomForestClassifier, SVC, XGBClassifier, MLPClassifier, GaussianNB, KNeighborsClassifier).
  • Hyperparameter optimization with Optuna (TPE/Bayesian or random search).
  • Report generation via evaluate() with metrics, plots, and SHAP explanations.
  • Saved artifacts: fit() stores fitted models; evaluate() writes an HTML report and plots to disk.

Installation

Python 3.12 is the target runtime (see .python-version).

From PyPI:

pip install mamut

From source (run from the repository root):

pip install -e .

For development with Poetry:

poetry install
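
After any of the installs above, one way to confirm that the package is visible to the interpreter is to query its distribution metadata. The sketch below uses only the standard library. The helper name `installed_version` is hypothetical, and the distribution name `mamut` is taken from the pip command above.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="mamut"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Python 3.12 is the stated target runtime; other versions are untested here.
    print("Python:", ".".join(map(str, sys.version_info[:3])))
    print("mamut:", installed_version() or "not installed")
```

If the second line prints "not installed", the active interpreter is probably not the environment the package was installed into.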

Quickstart

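from sklearn.datasets import load_iris
from mamut.wrapper import Mamut

# Load the features as a DataFrame and the target as a Series.
X, y = load_iris(as_frame=True, return_X_y=True)

# Search budget of 5 iterations with Bayesian (TPE) optimization.
mamut = Mamut(n_iterations=5, optimization_method="bayes")
mamut.fit(X, y)

# Predict class labels and class probabilities.
preds = mamut.predict(X)
proba = mamut.predict_proba(X)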

Configuration Notes

  • With preprocessing enabled (default), pass X as a pandas DataFrame and y as a Series.
  • Targets must be categorical (float targets raise a ValueError).
  • fit() performs a stratified 80/20 train/test split controlled by random_state.
  • Select the optimization strategy with optimization_method="bayes" or "random_search".
  • Control the search budget with n_iterations.
  • Exclude models by class name (e.g., exclude_models=["SVC"]).
  • Preprocessing options are passed directly into Mamut(...) (e.g., pca=True, feature_selection=True, num_imputation="knn").
  • score_metric expects one of: accuracy, precision, recall, f1, balanced_accuracy, jaccard, roc_auc_score.
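
The options above can be collected into a single dictionary and checked before a long search is started. This is a minimal sketch rather than part of MAMUT: `check_config` and `build_mamut` are hypothetical helpers, the allowed values are copied from the notes above, and the specific combination of options is only an illustration.

```python
# Allowed values as listed in the configuration notes.
SCORE_METRICS = {
    "accuracy", "precision", "recall", "f1",
    "balanced_accuracy", "jaccard", "roc_auc_score",
}
OPTIMIZATION_METHODS = {"bayes", "random_search"}


def check_config(config):
    """Hypothetical pre-flight check: reject values the notes do not list."""
    method = config.get("optimization_method")
    if method is not None and method not in OPTIMIZATION_METHODS:
        raise ValueError(f"unknown optimization_method: {method!r}")
    metric = config.get("score_metric")
    if metric is not None and metric not in SCORE_METRICS:
        raise ValueError(f"unknown score_metric: {metric!r}")
    return config


def build_mamut(config):
    """Create a Mamut instance; imported lazily so the check runs without MAMUT."""
    from mamut.wrapper import Mamut
    return Mamut(**config)


config = check_config({
    "n_iterations": 20,
    "optimization_method": "random_search",
    "score_metric": "f1",
    "exclude_models": ["SVC"],      # skip models by class name
    "pca": True,                    # preprocessing options go in directly
    "feature_selection": True,
    "num_imputation": "knn",
})
```

Calling `build_mamut(config)` requires MAMUT to be installed; the resulting object is then fitted as in the Quickstart.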

Outputs and Reports

  • mamut.best_model_: the best-performing pipeline, available after fit().
  • mamut.training_summary_: per-model scores and timings.
  • mamut.optuna_studies_: Optuna studies keyed by model name.
  • mamut.evaluate(): writes an HTML report to ./mamut_report/report_<timestamp>.html and stores plots in ./mamut_report/plots/.
  • mamut.save_best_model(path): writes the best model as a .joblib file into path, which must be an existing directory.
  • fit() saves all fitted models to ./fitted_models/<timestamp>/ as .joblib files.
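
Because fit() writes each run to its own timestamped directory, locating the files from the most recent run is a common follow-up step. The sketch below is a hypothetical helper (`latest_fitted_models` is not part of MAMUT). It picks the newest run directory by modification time, since the exact timestamp format is not stated here.

```python
from pathlib import Path


def latest_fitted_models(root="fitted_models"):
    """Return the .joblib files in the most recently modified run directory."""
    root = Path(root)
    if not root.is_dir():
        return []
    run_dirs = [p for p in root.iterdir() if p.is_dir()]
    if not run_dirs:
        return []
    # Assumption: the newest run is the directory with the latest mtime.
    latest = max(run_dirs, key=lambda p: p.stat().st_mtime)
    return sorted(latest.glob("*.joblib"))
```

Each returned path can then be passed to `joblib.load`, assuming the files are ordinary joblib dumps of the fitted models.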

Development

poetry run pytest
poetry run pre-commit run --all-files
make -C docs html

Examples and Docs

License

MIT. See LICENSE.

