A Python AutoML tool for fast exploration and experimentation of supervised machine learning pipelines.
Automated Tool for Optimized Modelling
Automated Tool for Optimized Modelling (ATOM) is a Python package designed for fast exploration and experimentation of supervised machine learning tasks. With just a few lines of code, you can perform basic data cleaning steps, select features, and compare the performance of multiple models on a given dataset. ATOM gives quick insight into which algorithms perform best for the task at hand and indicates whether an ML solution is feasible. This package supports binary classification, multiclass classification, and regression tasks.
|NOTE: A data scientist with domain knowledge can outperform ATOM if they apply use-case-specific feature engineering or data cleaning steps!|
Possible steps taken by the ATOM pipeline:
- Data cleaning
  - Handle missing values
  - Encode categorical features
  - Balance the dataset
  - Remove outliers
- Perform feature selection
  - Remove features with too high collinearity
  - Remove features with too low variance
  - Select best features according to a chosen strategy
- Fit all selected models (either directly or via successive halving)
  - Select hyperparameters using a Bayesian optimization approach
  - Perform bagging to assess the robustness of the model
- Analyze the results using the provided plotting functions!
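Successive halving, mentioned in the model fitting step above, can be sketched without any library. This is a minimal illustration of the general technique, not ATOM's implementation: the names `successive_halving` and `toy_evaluate`, the quality table, and the noise model are all hypothetical. Each round scores every remaining candidate on an equal share of the data, then keeps the better half, so the survivors are trained on more rows in the next round.

```python
import random


def successive_halving(candidates, evaluate, n_samples):
    """Return the surviving candidate and the per-round history.

    candidates: list of model names (placeholders in this sketch).
    evaluate:   callable(name, n_rows) -> score, higher is better.
    n_samples:  total number of training rows available.
    """
    remaining = list(candidates)
    history = []
    while len(remaining) > 1:
        # Each remaining model gets an equal share of the data, so the
        # share grows as models are eliminated.
        n_rows = n_samples // len(remaining)
        scores = {name: evaluate(name, n_rows) for name in remaining}
        history.append((n_rows, scores))
        # Keep the better half (at least one model).
        keep = max(1, len(remaining) // 2)
        remaining = sorted(remaining, key=scores.get, reverse=True)[:keep]
    return remaining[0], history


# Toy stand-in for "fit on n_rows and score" (an assumption for this
# example): a fixed quality per model plus noise that shrinks as more
# rows are used.
QUALITY = {"LR": 0.80, "LDA": 0.78, "XGB": 0.90, "lSVM": 0.75}
_rng = random.Random(0)


def toy_evaluate(name, n_rows):
    return QUALITY[name] + _rng.gauss(0, 0.1 / n_rows)


winner, history = successive_halving(list(QUALITY), toy_evaluate, n_samples=1000)
print("winner:", winner)
for n_rows, scores in history:
    print(n_rows, "rows per model:", sorted(scores))
```

With four candidates and 1000 rows, the first round uses 250 rows per model and the second round uses 500 rows for the two survivors.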
Install ATOM easily using pip:
pip install atom-ml
|NOTE: Since atom was already taken, the name of the package in PyPI is atom-ml!|
Call the ATOMClassifier or ATOMRegressor class and provide the data you want to use:
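from sklearn.datasets import load_breast_cancer
from atom import ATOMClassifier
# Load a binary classification dataset as a feature matrix X and target y
X, y = load_breast_cancer(return_X_y=True)
# Create the ATOM instance for a classification task
atom = ATOMClassifier(X, y, log='auto', n_jobs=2, verbose=2)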
ATOM has multiple data cleaning methods to help you prepare the data for modelling:
atom.impute(strat_num='knn', strat_cat='most_frequent', max_frac_rows=0.1)
atom.encode(max_onehot=10, frac_to_other=0.05)
atom.outliers(max_sigma=4)
atom.balance(oversample=0.8, n_neighbors=15)
atom.feature_selection(strategy='univariate', solver='chi2', max_features=0.9)
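To make one of these steps concrete, here is a rough sketch of sigma-based outlier removal, which is what a parameter named `max_sigma` suggests. It assumes the rule "drop a row when any feature lies more than `max_sigma` standard deviations from its column mean". That rule and the helper name `drop_outliers` are assumptions for illustration, not ATOM's documented behavior.

```python
from statistics import mean, pstdev


def drop_outliers(rows, max_sigma=4.0):
    """Return the rows in which every value has |z-score| <= max_sigma.

    rows is a list of equal-length numeric lists (one list per sample).
    """
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]

    def is_inlier(row):
        for value, mu, sigma in zip(row, means, stds):
            # A constant column (sigma == 0) cannot contain outliers.
            if sigma > 0 and abs(value - mu) / sigma > max_sigma:
                return False
        return True

    return [row for row in rows if is_inlier(row)]


# 30 ordinary samples plus one row with an extreme first feature.
data = [[float(i % 5), 10.0] for i in range(30)] + [[500.0, 10.0]]
cleaned = drop_outliers(data, max_sigma=4)
print(len(data), "rows before,", len(cleaned), "rows after")
```

Note that the mean and standard deviation are computed with the outliers still in the data, so a single extreme value in a very small dataset may inflate the standard deviation enough to escape the cutoff.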
Run the pipeline with different models:
atom.pipeline(models=['LR', 'LDA', 'XGB', 'lSVM'], metric='f1', max_iter=10, max_time=1000, init_points=3, cv=4, bagging=10)
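The `bagging=10` argument relates to the robustness check listed in the pipeline steps. As a rough illustration of that idea (not ATOM's internals), the sketch below refits a model on bootstrap resamples of the training set and reports the mean and spread of the test score. The toy threshold classifier, the synthetic data, and all helper names are made up for this example.

```python
import random
from statistics import mean, stdev


def fit_threshold(xs, ys):
    """Toy 1-D classifier: threshold halfway between the two class means."""
    mean0 = mean(x for x, y in zip(xs, ys) if y == 0)
    mean1 = mean(x for x, y in zip(xs, ys) if y == 1)
    return (mean0 + mean1) / 2


def accuracy(threshold, xs, ys):
    predictions = [1 if x > threshold else 0 for x in xs]
    return mean(int(p == y) for p, y in zip(predictions, ys))


def bagging_scores(train, test, n_rounds, seed=0):
    """Score on `test` after fitting on bootstrap resamples of `train`."""
    rng = random.Random(seed)
    xs_test, ys_test = zip(*test)
    scores = []
    for _ in range(n_rounds):
        # Bootstrap: draw len(train) rows with replacement, redrawing in
        # the unlikely case that one class is missing from the sample.
        while True:
            sample = rng.choices(train, k=len(train))
            if len({y for _, y in sample}) == 2:
                break
        xs, ys = zip(*sample)
        scores.append(accuracy(fit_threshold(xs, ys), xs_test, ys_test))
    return scores


# Synthetic data: class 0 centered at 0.0, class 1 centered at 2.0.
rng = random.Random(42)
data = [(rng.gauss(2.0 * label, 1.0), label) for label in (0, 1) for _ in range(100)]
rng.shuffle(data)
train, test = data[:150], data[150:]

scores = bagging_scores(train, test, n_rounds=10)
print(f"accuracy: {mean(scores):.3f} +/- {stdev(scores):.3f}")
```

A small spread across the resamples suggests the model's score does not depend heavily on which training rows it happened to see.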
Make plots and analyze results:
atom.plot_bagging(filename='bagging_results.png')
atom.lSVM.plot_probabilities(figsize=(9, 6))
atom.lda.plot_confusion_matrix(normalize=True)
For further information about ATOM, please see the project documentation.