A Python package for automated ML model benchmarking and comparison

Project description

AutoMLBench – Automated ML Model Benchmarking Library

📌 Overview

AutoMLBench is a Python package designed to automate the training, comparison, and evaluation of multiple machine learning models for classification, regression, and clustering tasks. It simplifies the preprocessing, model selection, and performance analysis process for both beginners and advanced users.

🚀 Features

Automated model benchmarking – Compare multiple models with minimal effort.
Flexible preprocessing – Choose between automatic or manual feature engineering.
Performance visualization – Generate insightful plots for model comparison.
Customizable feature handling – Supports missing value imputation, scaling, and encoding.
Multi-model training – Supports Random Forest, Gradient Boosting, XGBoost, LightGBM, CatBoost, and more.


📥 Installation

1️⃣ Install AutoMLBench from source

git clone https://github.com/AnnNaserNabil/AutoMLBench.git
cd AutoMLBench
pip install -r requirements.txt

2️⃣ Install AutoMLBench as a package (Optional)

pip install -e .

🛠️ Usage Guide

🔹 1. Load Dataset

You can load a dataset from a CSV file or a Pandas DataFrame.

from automlbench.data_loader import load_data

data = load_data("data.csv")  # Load from CSV file

🔹 2. Preprocess Data

AutoMLBench lets you either preprocess data automatically or engineer features manually.

from automlbench.preprocess import preprocess_data

# Automatic Preprocessing
X, y = preprocess_data(data, target_column="label", auto=True)

# Manual Feature Engineering
manual_features = data.copy()
manual_features["new_feature"] = manual_features["existing_feature"] ** 2
X, y = preprocess_data(data, target_column="label", auto=False, manual_features=manual_features)

🔹 3. Split Data

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

🔹 4. Train Multiple Models

AutoMLBench supports training and evaluating multiple machine learning models in one call.

from automlbench.model_train import train_models

results = train_models(X_train, y_train, X_test, y_test)

🔹 5. Compare Models

from automlbench.model_compare import compare_models

compare_models(results)
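The exact structure of `results` is not documented here; assuming it maps model names to metric dictionaries (an assumption, not the documented return type of `train_models`), a hand-rolled version of the comparison that `compare_models` automates might look like:

```python
# Sketch of a model-comparison step. The `results` structure below is
# an assumption for illustration, not AutoMLBench's actual return type.
results = {
    "RandomForest": {"accuracy": 0.91, "f1": 0.89},
    "LogisticRegression": {"accuracy": 0.86, "f1": 0.84},
    "KNN": {"accuracy": 0.83, "f1": 0.80},
}

def rank_models(results, metric="accuracy"):
    """Return (name, score) pairs sorted best-first by the chosen metric."""
    return sorted(
        ((name, scores[metric]) for name, scores in results.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )

for name, score in rank_models(results):
    print(f"{name:20s} {score:.2f}")
```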

🔹 6. Visualize Performance

from automlbench.visualization import plot_results

plot_results(results)

📌 Supported Models

AutoMLBench supports the following machine learning models out of the box:

  • Tree-based models: Random Forest, Gradient Boosting, XGBoost, LightGBM, CatBoost, Extra Trees, AdaBoost, Decision Tree
  • Linear models: Logistic Regression
  • Support Vector Machines: SVC
  • Instance-based learning: K-Nearest Neighbors (KNN)
  • Naive Bayes Classifier
  • Neural Networks: Multi-layer Perceptron (MLPClassifier)
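Whatever the estimator, benchmarking boils down to the same fit/predict/score loop over a list of models. A toy sketch of that loop, using two hypothetical stand-in models (not the real estimators above) so it runs without any ML dependencies:

```python
# Toy illustration of the benchmark loop AutoMLBench automates.
# MajorityClass and ThresholdOnFirstFeature are made-up stand-ins.
class MajorityClass:
    """Always predicts the most common training label."""
    def fit(self, X, y):
        self.label = max(set(y), key=y.count)
    def predict(self, X):
        return [self.label] * len(X)

class ThresholdOnFirstFeature:
    """Predicts 1 when the first feature exceeds the training mean."""
    def fit(self, X, y):
        self.cut = sum(row[0] for row in X) / len(X)
    def predict(self, X):
        return [1 if row[0] > self.cut else 0 for row in X]

def benchmark(models, X_train, y_train, X_test, y_test):
    """Fit each model and record its test accuracy, keyed by class name."""
    scores = {}
    for model in models:
        model.fit(X_train, y_train)
        preds = model.predict(X_test)
        correct = sum(p == t for p, t in zip(preds, y_test))
        scores[type(model).__name__] = correct / len(y_test)
    return scores

X_train = [[0.1], [0.2], [0.8], [0.9]]
y_train = [0, 0, 0, 1]
X_test = [[0.15], [0.85]]
y_test = [0, 1]
print(benchmark([MajorityClass(), ThresholdOnFirstFeature()],
                X_train, y_train, X_test, y_test))
# → {'MajorityClass': 0.5, 'ThresholdOnFirstFeature': 1.0}
```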

🔥 Example: Using AutoMLBench with the California Housing Dataset

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.datasets import fetch_california_housing
from automlbench.preprocess import preprocess_data
from automlbench.model_train import train_models
from automlbench.model_compare import compare_models
from automlbench.visualization import plot_results

# Load dataset
data = fetch_california_housing()
california_df = pd.DataFrame(data.data, columns=data.feature_names)
california_df['target'] = data.target  # House price target variable

# Preprocess data
X, y = preprocess_data(california_df, target_column='target', auto=True)

# Split dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train models
results = train_models(X_train, y_train, X_test, y_test)

# Compare models
compare_models(results)

# Visualize results
plot_results(results)

🧪 Running Tests

To ensure everything is working correctly, run the test suite:

pytest tests/
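New tests added under `tests/` follow the usual pytest conventions (plain `test_*` functions with bare asserts). For example, a self-contained check of CSV loading, using a stdlib stand-in for `load_data` since its exact return type is not shown here, might look like:

```python
# tests/test_loading.py — illustrative only; load_rows is a stdlib
# stand-in, not automlbench's actual load_data function.
import csv
import os
import tempfile

def load_rows(path):
    """Read a CSV into a list of dicts (stand-in for load_data)."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def test_load_rows_reads_every_row():
    # Write a tiny CSV fixture, then assert it round-trips intact.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".csv", delete=False, newline=""
    ) as f:
        f.write("a,b,label\n1,2,0\n3,4,1\n")
        path = f.name
    try:
        rows = load_rows(path)
        assert len(rows) == 2
        assert rows[0]["label"] == "0"
    finally:
        os.remove(path)
```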

🔄 Future Enhancements

  • Deep learning support (TensorFlow & PyTorch models)
  • Custom metric selection (AUC, precision, recall, RMSE, etc.)
  • Hyperparameter tuning integration
  • Web-based UI for interactive benchmarking

📜 Conclusion

AutoMLBench is a powerful tool for benchmarking machine learning models effortlessly. It automates model training, comparison, and visualization while providing flexibility for advanced feature engineering.

🚀 Try AutoMLBench now and simplify your ML workflows! 🚀



Download files

Download the file for your platform.

Source Distribution

automlbench-0.1.0.tar.gz (5.2 kB)

Uploaded Source

Built Distribution


AutoMLBench-0.1.0-py3-none-any.whl (6.1 kB)

Uploaded Python 3

File details

Details for the file automlbench-0.1.0.tar.gz.

File metadata

  • Download URL: automlbench-0.1.0.tar.gz
  • Upload date:
  • Size: 5.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.8.7

File hashes

Hashes for automlbench-0.1.0.tar.gz
Algorithm Hash digest
SHA256 3b67dfae503ffd5a555ac35e8277f8b83403e422cc469b7cdb4f3dcdac9a8ae0
MD5 3a727dfbfdf508a51ffa6c98960cb93d
BLAKE2b-256 a8989a711ac961cf4125777fe9236847a15b4b98034371e053e96b79aa40ff63

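The published hashes can be checked locally before installing. A stdlib sketch (the expected digest below is this release's SHA256 from the table above; always verify against the published value):

```python
# Verify a downloaded release file against its published SHA256.
import hashlib

def sha256_of(path, chunk_size=1 << 16):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

expected = "3b67dfae503ffd5a555ac35e8277f8b83403e422cc469b7cdb4f3dcdac9a8ae0"
# print(sha256_of("automlbench-0.1.0.tar.gz") == expected)
```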

File details

Details for the file AutoMLBench-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: AutoMLBench-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 6.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.8.7

File hashes

Hashes for AutoMLBench-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 dbc778feb8c44705ecf30079c4d5465a7e29c9ed17f0dd793cecd1f827432e6c
MD5 953a5a45dffc62c6ed354f2d2fde1b53
BLAKE2b-256 7f861ad679d98cfe1165134934b6324d53953c84f78db0e1206f9e0d36196c97

