A Python AutoML tool for fast exploration and experimentation of supervised machine learning pipelines.


Automated Tool for Optimized Modelling

Author: tvdboom

Project status: Active | Python: 3.6, 3.7, 3.8 | License: MIT


Automated Tool for Optimized Modelling (ATOM) is a Python package designed for fast exploration and experimentation of supervised machine learning tasks. With just a few lines of code, you can perform basic data cleaning steps, select features, and compare the performance of multiple models on a given dataset. ATOM gives quick insight into which algorithms perform best for the task at hand and indicates whether an ML solution is feasible. The package supports binary classification, multiclass classification, and regression tasks.

NOTE: A data scientist with domain knowledge can outperform ATOM by applying use-case-specific feature engineering or data cleaning steps!

Possible steps taken by the ATOM pipeline:

  1. Data Cleaning
    • Handle missing values
    • Encode categorical features
    • Balance the dataset
    • Remove outliers
  2. Perform feature selection
    • Remove features with too high collinearity
    • Remove features with too low variance
    • Select best features according to a chosen strategy
  3. Fit all selected models (either directly or via successive halving)
    • Select hyperparameters using a Bayesian Optimization approach
    • Perform bagging to assess the robustness of the model
  4. Analyze the results using the provided plotting functions!
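
Successive halving, mentioned in step 3, can be sketched in plain Python as follows. This is an illustration of the general technique, not ATOM's own code: the function names, the equal split of training rows among surviving models, and the toy scoring function are all assumptions made for the example.

```python
def successive_halving(names, score, n_total):
    """Return (winner, history) for a simple successive-halving run.

    names:   list of candidate model names
    score:   callable (name, n_rows) -> float, higher is better (hypothetical)
    n_total: number of rows in the full training set
    """
    survivors = list(names)
    history = []
    while True:
        # Assumption: every surviving model trains on an equal share of the rows,
        # so the share grows as models are eliminated.
        n_rows = n_total // len(survivors)
        ranked = sorted(survivors, key=lambda m: score(m, n_rows), reverse=True)
        history.append((n_rows, ranked))
        if len(ranked) == 1:
            return ranked[0], history
        # Keep the better half for the next round.
        survivors = ranked[: max(1, len(ranked) // 2)]


# Toy stand-in for "train the model and compute a metric" (made-up numbers).
TOY_CEILING = {"LR": 0.90, "LDA": 0.85, "XGB": 0.95, "lSVM": 0.88}


def toy_score(name, n_rows):
    # The score rises toward the model's ceiling as more rows are used.
    return TOY_CEILING[name] * n_rows / (n_rows + 100)


winner, history = successive_halving(["LR", "LDA", "XGB", "lSVM"], toy_score, 800)
for n_rows, ranked in history:
    print(n_rows, ranked)
```

The appeal of the approach is that weak candidates are discarded after a cheap fit on a small subset, so most of the training time goes to the models that are still in contention.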



Install ATOM easily using pip.

NOTE: Since the name atom was already taken, the package is published on PyPI as atom-ml!
	pip install atom-ml
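
Because the distribution name (atom-ml) differs from the import name (atom), one way to confirm the install is to query the distribution metadata rather than guessing from the import. The sketch below uses only the standard library (importlib.metadata, available from Python 3.8); the helper name installed_version is hypothetical and not part of ATOM.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Look up the PyPI distribution name, not the import name.
print(installed_version("atom-ml"))
```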


Call the ATOMClassifier or ATOMRegressor class and provide the data you want to use:

from sklearn.datasets import load_breast_cancer
from atom import ATOMClassifier

# return_X_y=True makes the loader return the (features, target) pair
X, y = load_breast_cancer(return_X_y=True)
atom = ATOMClassifier(X, y, log='auto', n_jobs=2, verbose=2)

ATOM has multiple data cleaning methods to help you prepare the data for modelling:

atom.impute(strat_num='knn', strat_cat='most_frequent', min_frac_rows=0.7)
atom.encode(max_onehot=10, frac_to_other=0.05)
atom.balance(oversample=0.8, n_neighbors=15)
atom.feature_selection(strategy='univariate', solver='chi2', n_features=0.9)
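
The two simplest filters named in the pipeline overview, dropping low-variance features and dropping highly collinear ones, can be illustrated without any dependencies. The sketch below is not ATOM's implementation: the function names, the thresholds, and the rule of keeping the first feature of a correlated pair are assumptions chosen for the example.

```python
from statistics import mean, pvariance


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def filter_features(columns, min_variance=1e-3, max_corr=0.98):
    """Return the names of the columns that survive both filters.

    columns: dict mapping feature name -> list of numbers.
    min_variance must be > 0 so constant columns never reach pearson().
    """
    kept = []
    for name, values in columns.items():
        if pvariance(values) < min_variance:
            continue  # nearly constant: carries almost no information
        if any(abs(pearson(values, columns[k])) > max_corr for k in kept):
            continue  # almost a copy of a feature that was already kept
        kept.append(name)
    return kept


demo = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10],   # exactly 2 * a, so perfectly collinear with a
    "c": [7, 7, 7, 7, 7],    # constant, so zero variance
    "d": [5, 1, 4, 2, 3],
}
print(filter_features(demo))  # ['a', 'd']
```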

Run the pipeline with different models:

atom.pipeline(models=['LR', 'LDA', 'XGB', 'lSVM'],

Make plots and analyze results:

atom.lSVM.plot_probabilities(figsize=(9, 6))


For further information about ATOM, please see the project documentation.


Files for atom-ml, version 3.3.0: atom-ml-3.3.0.tar.gz (60.2 kB, source distribution)
