automlbench

A Python package for automated ML model benchmarking and comparison

Automated Machine Learning Benchmarking Library

📌 AutoMLBench is a Python library designed to automate the machine learning pipeline, including:

- Data loading
- Preprocessing
- Model training
- Evaluation
- Performance visualization

Installation

Install the released package from PyPI:

pip install automlbench

Then make sure the dependencies it relies on are installed:

pip install pandas scikit-learn numpy matplotlib xgboost lightgbm catboost imbalanced-learn
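
To confirm that the dependencies listed above are importable, a small standard-library check can help. This is a minimal sketch, not part of AutoMLBench: the helper name `missing_modules` is hypothetical, and the list uses the import names of the packages (`sklearn` for scikit-learn, `imblearn` for imbalanced-learn).

```python
from importlib.util import find_spec

# Import names for the packages in the pip command above; scikit-learn
# imports as "sklearn" and imbalanced-learn as "imblearn".
REQUIRED_MODULES = [
    "pandas", "sklearn", "numpy", "matplotlib",
    "xgboost", "lightgbm", "catboost", "imblearn",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found (hypothetical helper)."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing:", ", ".join(missing) if missing else "none")
```

Any name reported as missing can be installed with the corresponding `pip install` package name.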

For local development, clone the repository and install it in editable mode:

git clone https://github.com/your-repo/AutoMLBench.git
cd AutoMLBench
pip install -e .

Modules Overview

AutoMLBench consists of several modules, each handling a specific part of the ML pipeline.

Module	Functionality
`data_loader.py`	Loads data from multiple file formats (`CSV`, `Excel`, `JSON`, `Parquet`, `HDF5`).
`preprocessing.py`	Handles missing values, feature scaling, and categorical encoding.
`models.py`	Provides predefined machine learning models (Random Forest, XGBoost, LightGBM, etc.).
`model_train.py`	Trains multiple models with class balancing and metric evaluation.
`hyperparameter_tuning.py`	Uses `GridSearchCV` for hyperparameter optimization.
`evaluation.py`	Computes performance metrics (Accuracy, Precision, Recall, F1-Score, AUC-ROC).
`visualization.py`	Generates performance comparison plots.
`utils.py`	Provides logging and execution time utilities.
`__init__.py`	Exposes core functionalities for easy import.
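
The table describes `utils.py` as providing logging and execution time utilities, but its API is not shown here. As an illustration of the general idea only, the sketch below times a function with a decorator and logs the result. The names `timed`, `slow_sum`, and the logger name are hypothetical and are not taken from the package.

```python
import functools
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("automlbench_sketch")  # hypothetical logger name


def timed(func):
    """Log how long the wrapped function takes to run."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs whether the call succeeds or raises.
            elapsed = time.perf_counter() - start
            logger.info("%s finished in %.4f s", func.__name__, elapsed)
    return wrapper


@timed
def slow_sum(n):
    """Example workload: sum the integers below n."""
    return sum(range(n))
```

Calling `slow_sum(1_000_000)` returns the sum as usual and emits one log line with the elapsed time.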

Usage Guide

1️⃣ Load the Dataset

The example below reads the Titanic dataset from a URL with pandas. The package also exposes `load_data` for loading files in the supported formats.

import pandas as pd

# Load the Titanic dataset
url = "https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv"
df = pd.read_csv(url)
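
The signature of `load_data` is not documented here. One common way to support several file formats is to dispatch on the file extension to the matching pandas reader. The sketch below shows that pattern under stated assumptions: the function name `load_by_extension` and the extension mapping are hypothetical, and this is not the package's implementation.

```python
from pathlib import Path

import pandas as pd

# Hypothetical mapping from file extension to pandas reader.
# read_excel, read_parquet and read_hdf need optional engines
# (for example openpyxl, pyarrow, and tables) at call time.
READERS = {
    ".csv": pd.read_csv,
    ".xlsx": pd.read_excel,
    ".json": pd.read_json,
    ".parquet": pd.read_parquet,
    ".h5": pd.read_hdf,
}


def load_by_extension(path):
    """Load a file into a DataFrame, choosing the reader from its extension."""
    suffix = Path(path).suffix.lower()
    try:
        reader = READERS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file format: {suffix!r}") from None
    return reader(path)
```

With this sketch, `load_by_extension("titanic.csv")` would call `pd.read_csv`, while an unknown extension raises a `ValueError` instead of failing inside pandas.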

2️⃣ Preprocess the Data

from automlbench import preprocess_data

# Define the target column
target_column = "Survived"

# Preprocess dataset (handles missing values, encoding, scaling)
X_train, X_test, y_train, y_test = preprocess_data(df, target_column)
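
The internals of `preprocess_data` are not shown in this guide. For readers who want to see what the three steps named in the comment (missing values, encoding, scaling) can look like, here is a minimal scikit-learn sketch. The tiny dataset, the column names, and the imputation strategies are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Tiny made-up dataset with one missing value per feature column.
df = pd.DataFrame({
    "age": [22.0, 38.0, np.nan, 35.0, 54.0, 2.0, 27.0, 14.0],
    "sex": ["male", "female", "female", np.nan, "male", "male", "female", "female"],
    "survived": [0, 1, 1, 1, 0, 0, 1, 1],
})
X, y = df.drop(columns="survived"), df["survived"]

# Split first so the transformers are fitted on training data only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
categorical = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])
preprocessor = ColumnTransformer([
    ("num", numeric, ["age"]),
    ("cat", categorical, ["sex"]),
])

X_train_t = preprocessor.fit_transform(X_train)
X_test_t = preprocessor.transform(X_test)
print(X_train_t.shape, X_test_t.shape)
```

Fitting the preprocessor on the training split and only transforming the test split keeps test-set statistics out of the scaler and imputers.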

3️⃣ Train Machine Learning Models

from automlbench import get_models, train_models

# Get the predefined models (a dict keyed by model name, used again in steps 4 and 6)
models = get_models()

# Train all models
results = train_models(X_train, X_test, y_train, y_test)

# Display model results
print(results)
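
The structure of `results` is not specified above. As a rough picture of what benchmarking several models involves, the sketch below fits two scikit-learn classifiers on synthetic data and collects their scores in a dict keyed by model name. The model choices, the synthetic data, and the result layout are assumptions and do not describe what `train_models` returns.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data, used only for illustration.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

candidates = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Fit every candidate on the same split and score it on the held-out data.
results = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    results[name] = {
        "Accuracy": accuracy_score(y_test, predictions),
        "F1-Score": f1_score(y_test, predictions),
    }

for name, scores in results.items():
    print(name, scores)
```

Using one shared train/test split for all candidates is what makes the resulting scores comparable.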

4️⃣ Evaluate Model Performance

from automlbench import evaluate_model

# Evaluate a specific model (e.g., Random Forest)
rf_model = models["Random Forest"].fit(X_train, y_train)
metrics = evaluate_model(rf_model, X_test, y_test)

print(metrics)
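
For reference, the classification metrics listed in the modules table follow directly from the confusion counts. The sketch below computes accuracy, precision, recall, and F1 for a binary problem in plain Python. The function name `binary_metrics` and the sample labels are hypothetical, and this is not the code behind `evaluate_model`.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"Accuracy": accuracy, "Precision": precision, "Recall": recall, "F1-Score": f1}


# Made-up labels: 3 true positives, 3 true negatives, 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(binary_metrics(y_true, y_pred))
```

With these labels every metric comes out to 0.75, since 6 of 8 predictions are correct and both precision and recall are 3/4.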

5️⃣ Visualize Model Performance

from automlbench import plot_performance

# Plot model comparison for multiple classification metrics
plot_performance(results, metrics=["Accuracy", "Precision", "Recall", "F1-Score"])
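
The plot that `plot_performance` produces is not shown here. A grouped bar chart is one common way to compare models across metrics, and the sketch below draws one with matplotlib. The helper name `plot_grouped_bars`, the model names, and the scores are placeholders for illustration only, not benchmark results or the package's output.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Placeholder scores for illustration only; these are not benchmark results.
results = {
    "Model A": {"Accuracy": 0.80, "F1-Score": 0.75},
    "Model B": {"Accuracy": 0.85, "F1-Score": 0.78},
}
metrics = ["Accuracy", "F1-Score"]


def plot_grouped_bars(results, metrics):
    """Draw one group of bars per model, with one bar per metric."""
    model_names = list(results)
    bar_width = 0.8 / len(metrics)
    fig, ax = plt.subplots(figsize=(6, 4))
    for i, metric in enumerate(metrics):
        positions = [j + i * bar_width for j in range(len(model_names))]
        heights = [results[name][metric] for name in model_names]
        ax.bar(positions, heights, width=bar_width, label=metric)
    # Center each tick label under its group of bars.
    centers = [j + bar_width * (len(metrics) - 1) / 2 for j in range(len(model_names))]
    ax.set_xticks(centers)
    ax.set_xticklabels(model_names)
    ax.set_ylim(0, 1)
    ax.set_ylabel("Score")
    ax.legend()
    return fig, ax


fig, ax = plot_grouped_bars(results, metrics)
```

In an interactive session, drop the `matplotlib.use("Agg")` line and call `plt.show()`, or save the chart with `fig.savefig(...)`.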

6️⃣ Hyperparameter Tuning (Optional)

If you want to fine-tune a model:

from automlbench import tune_hyperparameters

# Define hyperparameter grid
param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 10, 20]
}

# Tune the Random Forest model
best_model, best_params = tune_hyperparameters(models["Random Forest"], param_grid, X_train, y_train)

print(f"Best Model: {best_model}")
print(f"Best Parameters: {best_params}")
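
According to the modules table, `hyperparameter_tuning.py` uses `GridSearchCV`. For comparison, the sketch below runs an equivalent grid search directly with scikit-learn on synthetic data. The fold count, scoring choice, and smaller grid are assumptions made to keep the example fast, and may differ from what `tune_hyperparameters` does internally.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data, used only for illustration.
X, y = make_classification(n_samples=150, n_features=6, random_state=0)

# A reduced grid so the search finishes quickly.
param_grid = {
    "n_estimators": [20, 50],
    "max_depth": [None, 5],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,                # assumed number of cross-validation folds
    scoring="accuracy",  # assumed selection metric
)
search.fit(X, y)

# best_estimator_ is refitted on all of X, y with the winning parameters.
best_model, best_params = search.best_estimator_, search.best_params_
print(best_params, round(search.best_score_, 3))
```

The search fits one model per parameter combination and fold, so the cost grows with the product of the grid sizes times `cv`.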

Example Workflow

Here's a full end-to-end pipeline using AutoMLBench:

import pandas as pd
from automlbench import (
    preprocess_data, get_models, train_models, evaluate_model, plot_performance
)

# Load dataset
url = "https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv"
df = pd.read_csv(url)

# Preprocess data
X_train, X_test, y_train, y_test = preprocess_data(df, "Survived")

# Train models
results = train_models(X_train, X_test, y_train, y_test)

# Refit each predefined model and print its evaluation metrics
for model_name, model in get_models().items():
    model.fit(X_train, y_train)
    print(f"{model_name} Metrics:", evaluate_model(model, X_test, y_test))

# Plot performance
plot_performance(results)

Troubleshooting

Common Issues & Fixes

❌ ImportError: cannot import name 'train_models'

✔ Fix: Ensure `train_models` is imported in the package's `__init__.py`:

from .model_train import train_models

❌ ModuleNotFoundError: No module named 'automlbench'

✔ Fix: From the repository root, reinstall the package in editable mode:

pip install -e .

❌ ValueError: The target variable must contain at least two classes

✔ Fix: Ensure the dataset has at least two unique classes in the target column.
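
A quick check before training can catch this early. The sketch below counts the distinct labels in a target column and raises a similar error when fewer than two are present. The helper name `check_target_classes` is hypothetical and not part of AutoMLBench.

```python
def check_target_classes(labels, min_classes=2):
    """Return the number of distinct labels, or raise if there are too few.

    None entries are ignored; other missing-value markers are not handled here.
    """
    classes = {label for label in labels if label is not None}
    if len(classes) < min_classes:
        raise ValueError(
            f"Target has {len(classes)} unique class(es); "
            f"at least {min_classes} are required."
        )
    return len(classes)


# Example: a target column with two classes passes the check.
print(check_target_classes([0, 1, 1, 0, None]))
```

For a pandas column, `check_target_classes(df["Survived"].dropna())` applies the same check after dropping missing values.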

Future Improvements

✅ Ensemble Model Support
✅ Feature Selection Methods
✅ AutoML Integration (e.g., with Optuna, Hyperopt)
✅ Support for Regression Models

Contributing

We welcome contributions! To contribute:

1. Fork the repository
2. Create a new branch (feature-branch)
3. Make changes and test
4. Submit a pull request (PR)

License

AutoMLBench is released under the MIT License.

Release history

Version	Release date
0.1.5.1	Mar 5, 2025
0.1.5	Mar 5, 2025
0.1.4	Mar 5, 2025
0.1.3 (this version)	Mar 4, 2025
0.1.2	Mar 4, 2025
0.1.1	Mar 4, 2025
0.1.0	Mar 4, 2025

Download files

Source Distribution

automlbench-0.1.3.tar.gz (9.6 kB)

Uploaded Mar 4, 2025 Source

Built Distribution

automlbench-0.1.3-py3-none-any.whl (9.8 kB)

Uploaded Mar 4, 2025 Python 3

File details

Details for the file automlbench-0.1.3.tar.gz.

File metadata

Download URL: automlbench-0.1.3.tar.gz
Upload date: Mar 4, 2025
Size: 9.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.8.7

File hashes

Hashes for automlbench-0.1.3.tar.gz
Algorithm	Hash digest
SHA256	`0573e361a2829e0d5071afcbcc1ffd906722a0c19e2edb481d240e9f9d4cf058`
MD5	`25063eeeb6ccb2de6cfbaa5e173912f6`
BLAKE2b-256	`097eb5a432a16f6639169f87ac4c7540b1c00cf335c7583706eb0b163de3d782`

File details

Details for the file automlbench-0.1.3-py3-none-any.whl.

File metadata

Download URL: automlbench-0.1.3-py3-none-any.whl
Upload date: Mar 4, 2025
Size: 9.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.8.7

File hashes

Hashes for automlbench-0.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d1a0ed635825817e62f3c0863cd64d35605252bb8db43baffb0f298afc624fc0`
MD5	`ad6fa8952b4cf67836b50ba9d3157957`
BLAKE2b-256	`a6f0f8ce5079177d913190bc2121153760b656e7699b3cfc02129afb92ba1201`
