Auto-Sklearn2
A Python 3.11+ compatible version of auto-sklearn for automated machine learning.
Overview
Auto-Sklearn2 is a lightweight, Python 3.11+ compatible alternative to the popular auto-sklearn package. It provides automated machine learning capabilities without the dependency on ConfigSpace, which currently has compatibility issues with newer Python versions and NumPy 2.0.
Features
- Python 3.11+ Compatible: Works with Python 3.11, 3.12, and 3.13
- Automated Machine Learning: Automatically selects the best model and preprocessing pipeline
- Classification and Regression: Supports both classification and regression tasks
- Time-Limited Optimization: Set a time budget for model selection
- Extensive Model Selection: Includes more than 15 classification models and more than 20 regression models from scikit-learn
- Multiple Preprocessors: Includes StandardScaler, MinMaxScaler, and RobustScaler
- Cross-Validation: Uses cross-validation for model evaluation
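The time-limited, cross-validated selection described above can be pictured with plain scikit-learn. The sketch below is only an illustration under stated assumptions: `select_best_pipeline` is a hypothetical helper, the two candidate models are an arbitrary subset, and nothing here is taken from Auto-Sklearn2's actual implementation.

```python
import time
from itertools import product

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, RobustScaler, StandardScaler
from sklearn.tree import DecisionTreeClassifier


def select_best_pipeline(X, y, time_limit=30.0, cv=3):
    """Hypothetical helper: score scaler/model pairs until the time budget is spent."""
    scalers = [StandardScaler(), MinMaxScaler(), RobustScaler()]
    models = [LogisticRegression(max_iter=1000), DecisionTreeClassifier(random_state=0)]
    deadline = time.monotonic() + time_limit
    results = {}
    for scaler, model in product(scalers, models):
        # Always evaluate at least one candidate, then respect the deadline.
        if results and time.monotonic() > deadline:
            break
        pipeline = make_pipeline(scaler, model)
        name = f"{type(scaler).__name__}+{type(model).__name__}"
        # Mean cross-validated score (accuracy for classifiers by default).
        results[name] = cross_val_score(pipeline, X, y, cv=cv).mean()
    best_name = max(results, key=results.get)
    return best_name, results


X, y = load_breast_cancer(return_X_y=True)
best_name, results = select_best_pipeline(X, y, time_limit=30.0)
print(f"Best candidate: {best_name} ({results[best_name]:.4f})")
```

The deadline is only checked between candidates, so a single slow fit can still overrun the budget; a stricter limit would need per-fit timeouts.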
Installation
pip install auto-sklearn2
Quick Start for Classification
from auto_sklearn2 import AutoSklearnClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
# Load data
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create and fit the auto-sklearn classifier
auto_sklearn = AutoSklearnClassifier(time_limit=120, random_state=42)
auto_sklearn.fit(X_train, y_train)
# Make predictions
y_pred = auto_sklearn.predict(X_test)
# Get the best model details
print(f"Best model: {auto_sklearn.best_params}")
print(f"Accuracy: {auto_sklearn.score(X_test, y_test):.4f}")
# Show all models performance
for model_name, score in auto_sklearn.get_models_performance().items():
    print(f"{model_name}: {score:.4f}")
Quick Start for Regression
from auto_sklearn2 import AutoSklearnRegressor
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, mean_squared_error
import numpy as np
# Load data
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create and fit the auto-sklearn regressor
auto_sklearn = AutoSklearnRegressor(time_limit=120, random_state=42)
auto_sklearn.fit(X_train, y_train)
# Make predictions
y_pred = auto_sklearn.predict(X_test)
# Calculate metrics
r2 = r2_score(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print(f"Best model: {auto_sklearn.best_params}")
print(f"R² Score: {r2:.4f}")
print(f"Root Mean Squared Error: {rmse:.4f}")
# Show all models performance
for model_name, score in auto_sklearn.get_models_performance().items():
    print(f"{model_name}: {score:.4f}")
Available Models
Classification Models
- RandomForestClassifier
- GradientBoostingClassifier
- LogisticRegression
- SVC and LinearSVC
- KNeighborsClassifier
- MLPClassifier
- DecisionTreeClassifier
- AdaBoostClassifier
- ExtraTreesClassifier
- BaggingClassifier
- SGDClassifier
- GaussianNB, BernoulliNB, MultinomialNB
- QuadraticDiscriminantAnalysis
- LinearDiscriminantAnalysis
- And more...
Regression Models
- RandomForestRegressor
- GradientBoostingRegressor
- LinearRegression
- Ridge, Lasso, ElasticNet
- SVR and LinearSVR
- KNeighborsRegressor
- MLPRegressor
- DecisionTreeRegressor
- AdaBoostRegressor
- ExtraTreesRegressor
- BaggingRegressor
- SGDRegressor
- HuberRegressor
- PoissonRegressor
- GammaRegressor
- TweedieRegressor
- RANSACRegressor
- KernelRidge
- PLSRegression
- And more...
Differences from auto-sklearn
Auto-Sklearn2 is a simplified version of auto-sklearn with the following differences:
- No ConfigSpace Dependency: Uses scikit-learn's built-in models and preprocessing methods
- Python 3.11+ Compatible: Works with the latest Python versions
- No Meta-Learning: Does not use meta-learning to warm-start the optimization
- No Ensemble Building: Does not build ensembles of models
- Simpler Hyperparameter Optimization: Compares candidates by their cross-validation scores instead of searching configurations with Bayesian optimization
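This README does not say how, or whether, individual hyperparameters are tuned beyond cross-validated scoring. For readers who want a non-Bayesian tuning step on top of a chosen model family, scikit-learn's `RandomizedSearchCV` is one option. The sketch below uses scikit-learn directly; the model, parameter ranges, and iteration count are arbitrary assumptions, and it does not reflect Auto-Sklearn2's internals.

```python
from scipy.stats import randint
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = load_breast_cancer(return_X_y=True)

# Sample 5 random configurations and score each with 3-fold cross-validation.
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions={
        "n_estimators": randint(20, 100),  # illustrative range, not a recommendation
        "max_depth": randint(2, 10),
    },
    n_iter=5,
    cv=3,
    random_state=42,
)
search.fit(X, y)

print(f"Best parameters: {search.best_params_}")
print(f"Best cross-validated accuracy: {search.best_score_:.4f}")
```

Random search samples configurations independently, whereas Bayesian optimization uses earlier results to decide what to try next; the former is simpler and needs no extra dependencies.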
License
BSD 3-Clause License (same as auto-sklearn)
Citation
If you use Auto-Sklearn2 in a scientific publication, please cite the original auto-sklearn paper:
@inproceedings{feurer-neurips15a,
title = {Efficient and Robust Automated Machine Learning},
author = {Feurer, Matthias and Klein, Aaron and Eggensperger, Katharina and
Springenberg, Jost and Blum, Manuel and Hutter, Frank},
booktitle = {Advances in Neural Information Processing Systems 28},
pages = {2962--2970},
year = {2015}
}
Acknowledgements
This package is inspired by and based on the original auto-sklearn package.