MLArena
An algorithm-agnostic machine learning toolkit for model training, diagnostics and optimization.
Publications
Read about the concepts and methodologies behind MLArena through these articles:
- Algorithm-Agnostic Model Building with MLflow - Published in Towards Data Science
  A foundational guide demonstrating how to build algorithm-agnostic ML pipelines using mlflow.pyfunc. The article explores creating generic model wrappers, encapsulating preprocessing logic, and leveraging MLflow's unified model representation for seamless algorithm transitions.
- Explainable Generic ML Pipeline with MLflow - Published in Towards Data Science
  An advanced implementation guide that extends the generic ML pipeline with more sophisticated preprocessing and SHAP-based model explanations. The article demonstrates how to build a production-ready pipeline that supports both classification and regression tasks, handles feature preprocessing, and provides interpretable model insights while maintaining algorithm agnosticism.
Installation
The package is under rapid development (see the CHANGELOG for details), so installing a pinned version is highly recommended. For example:
pip install mlarena==0.1.9
If you are using the package on a Databricks ML cluster with Databricks Runtime (DBR) >= 15.2, you can try installing without dependencies (experimental feature):
pip install mlarena --no-deps
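Because pinning is recommended, it can be useful to confirm at runtime that the environment actually holds the expected release. The following is a minimal sketch using only the standard library's importlib.metadata; the helper name has_pinned_version is hypothetical and is not part of MLArena.

```python
from importlib.metadata import PackageNotFoundError, version


def has_pinned_version(package: str, expected: str) -> bool:
    """Return True only if `package` is installed at exactly `expected`."""
    try:
        return version(package) == expected
    except PackageNotFoundError:
        # The package is not installed in this environment.
        return False


if __name__ == "__main__":
    # True when mlarena 0.1.9 is installed, False otherwise.
    print(has_pinned_version("mlarena", "0.1.9"))
```

A check like this can sit at the top of a notebook so that a version mismatch is noticed before any experiment runs.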
Usage Example
- For a quick start with a basic example, see 1.basic_usage.ipynb.
- For more advanced examples, see 2.advanced_usage.ipynb.
- For visualization utilities, see 3.utils_plot.ipynb.
- For handling common challenges in machine learning, see 4.ml_discussions.ipynb.
Visual Examples:
Model Performance Analysis
Explainable ML
A one-liner creates global and local explanations based on SHAP that work across various classification and regression algorithms.
Hyperparameter Optimization
Parallel coordinates plot for hyperparameter search space diagnostics.
Features
- Algorithm Agnostic ML Pipeline:
  - End-to-end workflow from preprocessing to deployment
  - Model-agnostic design that works with any scikit-learn compatible model, making it easy to experiment with and swap between algorithms
  - Support for both classification and regression tasks
  - Early stopping and validation set support
  - MLflow integration for experiment tracking and deployment
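The model-agnostic idea can be illustrated with a small standard-library sketch: a wrapper owns the preprocessing and delegates to any object that exposes scikit-learn style fit and predict methods, so swapping algorithms changes one argument. The class names GenericPipeline, MeanRegressor and MedianRegressor are hypothetical stand-ins for illustration, not MLArena's API.

```python
from statistics import mean, median, pstdev


class MeanRegressor:
    """Toy estimator: always predicts the mean of the training targets."""

    def fit(self, X, y):
        self.value_ = mean(y)
        return self

    def predict(self, X):
        return [self.value_ for _ in X]


class MedianRegressor:
    """Toy estimator: always predicts the median of the training targets."""

    def fit(self, X, y):
        self.value_ = median(y)
        return self

    def predict(self, X):
        return [self.value_ for _ in X]


class GenericPipeline:
    """Hypothetical wrapper: standardise features, then delegate to any estimator."""

    def __init__(self, model):
        self.model = model  # any object with fit(X, y) and predict(X)

    def _scale(self, X):
        return [
            [(v - m) / s for v, m, s in zip(row, self.means_, self.stds_)]
            for row in X
        ]

    def fit(self, X, y):
        columns = list(zip(*X))
        self.means_ = [mean(col) for col in columns]
        # Fall back to 1.0 for constant columns to avoid division by zero.
        self.stds_ = [pstdev(col) or 1.0 for col in columns]
        self.model.fit(self._scale(X), y)
        return self

    def predict(self, X):
        return self.model.predict(self._scale(X))


X = [[1.0], [2.0], [3.0], [10.0]]
y = [1.0, 2.0, 3.0, 10.0]
for model in (MeanRegressor(), MedianRegressor()):
    pipeline = GenericPipeline(model).fit(X, y)
    print(type(model).__name__, pipeline.predict([[5.0]]))
```

The preprocessing is learned once inside fit and reused in predict, which is what lets the caller change the algorithm without touching the surrounding code.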
- Intelligent Preprocessing:
  - Automated feature type detection and handling
  - Smart encoding recommendations based on feature cardinality and rare categories
  - Target encoding with visualization to support smoothing parameter selection
  - Tunable drop options to optimize one-hot encoding based on model type (tree vs. linear) and feature type (binary vs. multi-category)
  - Missing value handling with configurable strategies
  - Feature selection recommendations with mutual information analysis
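To see why the smoothing parameter matters for target encoding, here is a minimal sketch of one common formulation, in which each category's mean is blended with the global mean in proportion to the category's sample count. The formula and the function name smoothed_target_encoding are assumptions for illustration; MLArena's own implementation may differ.

```python
from collections import defaultdict
from statistics import mean


def smoothed_target_encoding(categories, targets, smoothing=10.0):
    """Map each category to a blend of its own target mean and the global mean.

    encoded = (sum_of_targets + smoothing * global_mean) / (count + smoothing)
    """
    global_mean = mean(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    return {
        cat: (sums[cat] + smoothing * global_mean) / (counts[cat] + smoothing)
        for cat in counts
    }


categories = ["a", "a", "a", "a", "b"]
targets = [1, 1, 1, 0, 0]
for m in (0.0, 5.0, 50.0):
    print(m, smoothed_target_encoding(categories, targets, smoothing=m))
```

With no smoothing, the single-row category "b" is encoded purely from its one observation; as the smoothing value grows, rare categories are pulled toward the global mean, which is the trade-off a smoothing visualization is meant to expose.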
- Advanced Model Evaluation:
  - Comprehensive metrics for both classification and regression
  - Diagnostic visualization of model performance
  - Threshold analysis for classification tasks
  - SHAP-based model explanations (global and local)
  - Cross-validation with variance penalty
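Threshold analysis for a binary classifier amounts to recomputing the confusion counts at several probability cut-offs and comparing the resulting metrics. The sketch below does this in plain Python; the function names threshold_report and best_threshold are hypothetical, and choosing the threshold by F1 is just one possible criterion.

```python
def threshold_report(y_true, y_prob, thresholds):
    """Return precision, recall and F1 for each decision threshold."""
    rows = []
    for t in thresholds:
        y_pred = [1 if p >= t else 0 for p in y_prob]
        tp = sum(1 for a, b in zip(y_true, y_pred) if a == 1 and b == 1)
        fp = sum(1 for a, b in zip(y_true, y_pred) if a == 0 and b == 1)
        fn = sum(1 for a, b in zip(y_true, y_pred) if a == 1 and b == 0)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        rows.append(
            {"threshold": t, "precision": precision, "recall": recall, "f1": f1}
        )
    return rows


def best_threshold(rows):
    """Pick the threshold with the highest F1 (first one wins on ties)."""
    return max(rows, key=lambda r: r["f1"])["threshold"]


y_true = [0, 0, 1, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8, 0.9]
report = threshold_report(y_true, y_prob, thresholds=[0.3, 0.5, 0.7])
for row in report:
    print(row)
print("best by F1:", best_threshold(report))
```

In practice the selection criterion depends on the relative cost of false positives and false negatives, so F1 can be replaced by whichever metric fits the task.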
- Hyperparameter Optimization:
  - Bayesian optimization with Hyperopt
  - Cross-validation based tuning
  - Parallel coordinates visualization for search space analysis
  - Early stopping to prevent overfitting
  - Variance penalty to ensure stable solutions
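A variance penalty can be sketched as scoring each candidate by its mean cross-validation score minus a multiple of the spread across folds, so that a configuration with a high but erratic score does not automatically win. The exact penalty form used by MLArena is not stated here; the formula below (mean minus penalty times population standard deviation) and the names penalized_score and pick_most_stable are assumptions for illustration.

```python
from statistics import mean, pstdev


def penalized_score(fold_scores, penalty=1.0):
    """Mean CV score minus `penalty` times the standard deviation across folds."""
    return mean(fold_scores) - penalty * pstdev(fold_scores)


def pick_most_stable(candidates, penalty=1.0):
    """Return the candidate name with the highest penalized score."""
    return max(
        candidates, key=lambda name: penalized_score(candidates[name], penalty)
    )


# Hypothetical per-fold scores for two hyperparameter settings (higher is better).
candidates = {
    "deep_trees": [0.95, 0.70, 0.93, 0.72],
    "shallow_trees": [0.80, 0.81, 0.79, 0.80],
}
print("no penalty:  ", pick_most_stable(candidates, penalty=0.0))
print("with penalty:", pick_most_stable(candidates, penalty=1.0))
```

Without the penalty the higher-mean but unstable setting is selected; with it, the setting whose folds agree with each other is preferred.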
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is licensed under the MIT License - see the LICENSE file for details.
File details
Details for the file mlarena-0.1.9.tar.gz.
File metadata
- Download URL: mlarena-0.1.9.tar.gz
- Upload date:
- Size: 21.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.2 CPython/3.11.12 Linux/6.8.0-1021-azure
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3bf0cc22ec9b9488fc9d33ad52fa96bdb7d3cddd336ed1f1ba1d00ebd4acfc05 |
| MD5 | f4c9535556d1ec3f8752cfb7c733d8c7 |
| BLAKE2b-256 | 9f3c80720a4574070305dc77e68ac8c4cd91ef36ca1d9c7c03aca8c427587c77 |
File details
Details for the file mlarena-0.1.9-py3-none-any.whl.
File metadata
- Download URL: mlarena-0.1.9-py3-none-any.whl
- Upload date:
- Size: 21.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.2 CPython/3.11.12 Linux/6.8.0-1021-azure
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 80e58e994e3b9874f3ec59aa5db91589488f20b784a0b6284b8cffc426795f26 |
| MD5 | 36b7faf8f86c07735250b52be18301d7 |
| BLAKE2b-256 | e590b1472b49ed29b21b7fd886e223a11b868f0b0f3df3a915be147c4f7e6ac2 |