Provide an input CSV and a target field to predict, and generate a model plus the code to run it.
Project description
Give an input CSV file and a target field you want to predict to automl-gs, and get a trained high-performing machine learning or deep learning model plus native code pipelines allowing you to integrate that model into any prediction workflow. No black box: you can see exactly how the data is processed, how the model is constructed, and you can make tweaks as necessary.
automl-gs is an AutoML tool which, unlike Microsoft’s [NNI](https://github.com/Microsoft/nni), Uber’s [Ludwig](https://github.com/uber/ludwig), and [TPOT](https://github.com/EpistasisLab/tpot), offers a zero code/model definition interface to getting an optimized model and data transformation pipeline in multiple popular ML/DL frameworks, with minimal Python dependencies (pandas + scikit-learn + your framework of choice). automl-gs is designed for citizen data scientists and engineers without a deep statistical background under the philosophy that you don’t need to know any modern data preprocessing and machine learning engineering techniques to create a powerful prediction workflow.
Nowadays, the cost of computing many different models and hyperparameters is much lower than the opportunity cost of a data scientist’s time. automl-gs is a Python 3 module designed to abstract away the common approaches to transforming tabular data, architecting machine learning/deep learning models, and performing random hyperparameter searches to identify the best-performing model. This allows data scientists and researchers to better utilize their time on model performance optimization.
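As a rough illustration of what a random hyperparameter search involves, here is a minimal sketch using only the standard library. It is not automl-gs's implementation: the search space, the `toy_score` function (a stand-in for training and validating a real model), and every function name are assumptions made for the example. Each trial is written to a CSV log as soon as it finishes, so stopping early loses nothing.

```python
import csv
import io
import random

# Hypothetical search space; the hyperparameters automl-gs explores may differ.
SEARCH_SPACE = {
    "learning_rate": [0.001, 0.01, 0.1],
    "hidden_units": [32, 64, 128],
    "dropout": [0.0, 0.25, 0.5],
}


def sample_config(rng):
    """Draw one value per hyperparameter, uniformly at random."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def toy_score(config):
    """Stand-in for training a model and returning a validation metric."""
    return (
        1.0
        - abs(config["learning_rate"] - 0.01)
        - abs(config["hidden_units"] - 64) / 1000
        - abs(config["dropout"] - 0.25)
    )


def random_search(num_trials, seed=0, log=None):
    """Run `num_trials` random trials and return the best (config, score)."""
    rng = random.Random(seed)
    writer = None
    best_config, best_score = None, float("-inf")
    for trial in range(num_trials):
        config = sample_config(rng)
        score = toy_score(config)
        if log is not None:
            if writer is None:
                fields = ["trial", *config, "score"]
                writer = csv.DictWriter(log, fieldnames=fields)
                writer.writeheader()
            # Write and flush after every trial so partial results survive.
            writer.writerow({"trial": trial, **config, "score": score})
            log.flush()
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


log = io.StringIO()  # an open file handle would work the same way
best_config, best_score = random_search(20, seed=0, log=log)
print(best_config, best_score)
```

Swapping `toy_score` for a function that actually trains and evaluates a model turns this into a usable, if basic, search loop.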
- Generates native Python code; no platform lock-in, and no need to use automl-gs after the model script is created.
- Train model configurations super-fast for free using a TPU in Google Colaboratory.
- Handles messy datasets that normally require manual intervention, such as datetime/categorical encoding and spaced/parenthesized column names.
- Each part of the generated model pipeline is its own function with docstrings, making it much easier to integrate into production workflows.
- Extremely detailed metrics reporting for every trial stored in a tidy CSV, allowing you to identify and visualize model strengths and weaknesses.
- Correct serialization of data pipeline encoders on disk (i.e. no pickled Python objects!)
- Retrain the generated model on new data without making any code/pipeline changes.
- Quit the hyperparameter search at any time, as the results are saved after each trial.
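To make two of these points concrete (cleaning awkward column names and storing encoders without pickling), here is a minimal standard-library sketch. It is an illustration under assumptions, not the code automl-gs generates: the function names, the `_encoder.json` file naming, and the choice of `-1` for unseen categories are all hypothetical.

```python
import json
import os
import re
import tempfile


def clean_column_name(name):
    """Lowercase a name, drop parentheses, and turn whitespace into underscores."""
    name = re.sub(r"[()]", "", name.strip().lower())
    return re.sub(r"\s+", "_", name)


def fit_categorical_encoder(values):
    """Map each distinct category to an integer index, in sorted order."""
    return {category: index for index, category in enumerate(sorted(set(values)))}


def save_encoder(encoder, path):
    """Write the encoder as plain JSON: readable, diffable, and not tied to pickle."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(encoder, f, indent=2, sort_keys=True)


def load_encoder(path):
    """Read an encoder previously written by save_encoder."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def encode(values, encoder, unknown=-1):
    """Encode values; categories unseen at fit time map to `unknown`."""
    return [encoder.get(value, unknown) for value in values]


column = clean_column_name("Passenger Class (Cabin)")
encoder = fit_categorical_encoder(["first", "third", "second", "third"])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, f"{column}_encoder.json")
    save_encoder(encoder, path)
    restored = load_encoder(path)

print(column, encode(["second", "unknown"], restored))
```

Because the encoder is an ordinary JSON mapping, it can be reloaded from another Python version or another language entirely, which is the main practical advantage over a pickled object.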
The models generated by automl-gs are intended to give a very strong baseline for solving a given problem; they’re not the end-all-be-all that often accompanies the AutoML hype, but the resulting code is easily tweakable to improve from the baseline.
Download files
Source Distribution
File details
Details for the file automl_gs-0.2.1.tar.gz.
File metadata
- Download URL: automl_gs-0.2.1.tar.gz
- Upload date:
- Size: 27.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: Python-urllib/3.7
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 51c4b68e5bdda99fa42643a382760da765118daf44bbddb6b344317c5fa36c3a |
MD5 | ba9089591bd731e72be16dbfc5382673 |
BLAKE2b-256 | c45127833a08fe4f83711b09836ddd9128e275a6900c47e0e5782112ed611484 |