Python 3.11+ compatible version of auto-sklearn

Auto-Sklearn2

A Python 3.11+ compatible version of auto-sklearn for automated machine learning.

Overview

Auto-Sklearn2 is a lightweight alternative to the popular auto-sklearn package that runs on Python 3.11 and later. It provides automated machine learning without depending on ConfigSpace, which currently has compatibility issues with newer Python versions and NumPy 2.0.

Features

  • Python 3.11+ Compatible: Works with Python 3.11, 3.12, and 3.13
  • Automated Machine Learning: Automatically selects the best model and preprocessing pipeline
  • Classification and Regression: Supports both classification and regression tasks
  • Time-Limited Optimization: Set a time budget for model selection
  • Extensive Model Selection: Includes over 15 classification models and 20 regression models from scikit-learn
  • Multiple Preprocessors: Includes StandardScaler, MinMaxScaler, and RobustScaler
  • Cross-Validation: Uses cross-validation for model evaluation
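
The preprocessing and cross-validation features listed above can be illustrated with plain scikit-learn. The sketch below is not Auto-Sklearn2's internal code; it only assumes that each candidate is a scaler paired with a model and scored by mean cross-validation accuracy. The helper name `score_candidates` and the two example models are illustrative choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, RobustScaler, StandardScaler
from sklearn.tree import DecisionTreeClassifier


def score_candidates(X, y, cv=5):
    """Return {"Scaler+Model": mean CV accuracy} for every scaler/model pair.

    Hypothetical helper for illustration; not part of the auto_sklearn2 API.
    """
    scalers = [StandardScaler(), MinMaxScaler(), RobustScaler()]
    models = [
        LogisticRegression(max_iter=1000),
        DecisionTreeClassifier(random_state=42),
    ]
    results = {}
    for scaler in scalers:
        for model in models:
            # cross_val_score clones the pipeline for each fold, so the
            # scaler and model instances can be reused across candidates.
            pipeline = make_pipeline(scaler, model)
            name = f"{type(scaler).__name__}+{type(model).__name__}"
            results[name] = cross_val_score(pipeline, X, y, cv=cv).mean()
    return results


X, y = load_breast_cancer(return_X_y=True)
scores = score_candidates(X, y)
best_name = max(scores, key=scores.get)
print(f"Best pipeline: {best_name} ({scores[best_name]:.4f})")
```

Putting the scaler inside the pipeline means it is refit on the training portion of every fold, so no statistics from the validation fold leak into preprocessing.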

Installation

pip install auto-sklearn2
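
After installing, a quick environment check can confirm that the interpreter meets the 3.11+ requirement and that the distribution is visible to Python. This is a small sketch using only the standard library; the function name `environment_report` is hypothetical.

```python
import sys
from importlib import metadata


def environment_report(package="auto-sklearn2"):
    """Report whether Python is 3.11+ and which package version is installed."""
    python_ok = sys.version_info >= (3, 11)
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        installed = None
    return {"python_ok": python_ok, "installed_version": installed}


print(environment_report())
```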

Quick Start for Classification

from auto_sklearn2 import AutoSklearnClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Load data
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and fit the auto-sklearn classifier
auto_sklearn = AutoSklearnClassifier(time_limit=120, random_state=42)
auto_sklearn.fit(X_train, y_train)

# Make predictions
y_pred = auto_sklearn.predict(X_test)

# Get the best model details
print(f"Best model: {auto_sklearn.best_params}")
print(f"Accuracy: {auto_sklearn.score(X_test, y_test):.4f}")

# Show the performance of every evaluated model
for model_name, score in auto_sklearn.get_models_performance().items():
    print(f"{model_name}: {score:.4f}")

Quick Start for Regression

from auto_sklearn2 import AutoSklearnRegressor
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, mean_squared_error
import numpy as np

# Load data
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and fit the auto-sklearn regressor
auto_sklearn = AutoSklearnRegressor(time_limit=120, random_state=42)
auto_sklearn.fit(X_train, y_train)

# Make predictions
y_pred = auto_sklearn.predict(X_test)

# Calculate metrics
r2 = r2_score(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))

print(f"Best model: {auto_sklearn.best_params}")
print(f"R² Score: {r2:.4f}")
print(f"Root Mean Squared Error: {rmse:.4f}")

# Show the performance of every evaluated model
for model_name, score in auto_sklearn.get_models_performance().items():
    print(f"{model_name}: {score:.4f}")

Available Models

Classification Models

  • RandomForestClassifier
  • GradientBoostingClassifier
  • LogisticRegression
  • SVC and LinearSVC
  • KNeighborsClassifier
  • MLPClassifier
  • DecisionTreeClassifier
  • AdaBoostClassifier
  • ExtraTreesClassifier
  • BaggingClassifier
  • SGDClassifier
  • GaussianNB, BernoulliNB, MultinomialNB
  • QuadraticDiscriminantAnalysis
  • LinearDiscriminantAnalysis
  • And more...

Regression Models

  • RandomForestRegressor
  • GradientBoostingRegressor
  • LinearRegression
  • Ridge, Lasso, ElasticNet
  • SVR and LinearSVR
  • KNeighborsRegressor
  • MLPRegressor
  • DecisionTreeRegressor
  • AdaBoostRegressor
  • ExtraTreesRegressor
  • BaggingRegressor
  • SGDRegressor
  • HuberRegressor
  • PoissonRegressor
  • GammaRegressor
  • TweedieRegressor
  • RANSACRegressor
  • KernelRidge
  • PLSRegression
  • And more...

Differences from auto-sklearn

Auto-Sklearn2 is a simplified version of auto-sklearn with the following differences:

  1. No ConfigSpace Dependency: Uses scikit-learn's built-in models and preprocessing methods
  2. Python 3.11+ Compatible: Works with the latest Python versions
  3. No Meta-Learning: Does not use meta-learning to warm-start the optimization
  4. No Ensemble Building: Does not build ensembles of models
  5. Simpler Hyperparameter Optimization: Compares candidates by cross-validation score instead of searching with Bayesian optimization
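
To make the contrast with auto-sklearn concrete, the sketch below shows one way a search without meta-learning, ensembling, or Bayesian optimization could look: candidates are evaluated in a fixed order until a time budget runs out, and the single best scorer is returned. This is an assumption about the general approach, not the package's actual implementation; `select_best` and the demo candidates are hypothetical.

```python
import time
from typing import Callable, Dict, Optional, Tuple


def select_best(
    candidates: Dict[str, Callable[[], float]],
    time_limit: float,
) -> Tuple[Optional[str], Dict[str, float]]:
    """Evaluate candidates in order until the time budget is spent.

    Each candidate is a zero-argument callable returning a score where
    higher is better (for example, a mean cross-validation score).
    Returns the name of the best candidate and all collected scores.
    """
    deadline = time.monotonic() + time_limit
    scores: Dict[str, float] = {}
    for name, evaluate in candidates.items():
        if time.monotonic() >= deadline:
            break  # budget exhausted: remaining candidates are skipped
        scores[name] = evaluate()
    # No ensemble: a single winner is chosen, or None if nothing ran.
    best = max(scores, key=scores.get) if scores else None
    return best, scores


# Stand-in evaluators with fixed scores, for demonstration only.
demo = {
    "model_a": lambda: 0.91,
    "model_b": lambda: 0.95,
    "model_c": lambda: 0.89,
}
best, scores = select_best(demo, time_limit=5.0)
print(best, scores)
```

The budget is checked only between candidates, so an evaluation that has already started is allowed to finish. In practice each callable could wrap a call to scikit-learn's `cross_val_score` for one pipeline.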

License

BSD 3-Clause License (same as auto-sklearn)

Citation

If you use Auto-Sklearn2 in a scientific publication, please cite the original auto-sklearn paper:

@inproceedings{feurer-neurips15a,
    title     = {Efficient and Robust Automated Machine Learning},
    author    = {Feurer, Matthias and Klein, Aaron and Eggensperger, Katharina and
                 Springenberg, Jost and Blum, Manuel and Hutter, Frank},
    booktitle = {Advances in Neural Information Processing Systems 28},
    pages     = {2962--2970},
    year      = {2015}
}

Acknowledgements

This package is inspired by and based on the original auto-sklearn package.
