auto_ml

Automated machine learning for production and analytics

# auto_ml
> Get a trained and optimized machine learning predictor at the push of a button (and, admittedly, an extended coffee break while your computer does the heavy lifting and you get to claim "compiling" https://xkcd.com/303/).

[![Build Status](https://travis-ci.org/ClimbsRocks/auto_ml.svg?branch=master)](https://travis-ci.org/ClimbsRocks/auto_ml)
[![Documentation Status](http://readthedocs.org/projects/auto-ml/badge/?version=latest)](http://auto-ml.readthedocs.io/en/latest/?badge=latest)
[![PyPI version](https://badge.fury.io/py/auto_ml.svg)](https://badge.fury.io/py/auto_ml)
[![Coverage Status](https://coveralls.io/repos/github/ClimbsRocks/auto_ml/badge.svg?branch=master&cacheBuster=1)](https://coveralls.io/github/ClimbsRocks/auto_ml?branch=master)
[![license](https://img.shields.io/github/license/mashape/apistatus.svg)]()


## Installation

- `pip install auto_ml`

OR

- `git clone https://github.com/ClimbsRocks/auto_ml`
- `cd auto_ml`
- `pip install -r requirements.txt`

## Getting Started

```
import dill
import pandas as pd
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split

from auto_ml import Predictor

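# Load data
# (load_boston was removed in scikit-learn 1.2, so this example needs an older scikit-learn release)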
boston = load_boston()
df_boston = pd.DataFrame(boston.data)
df_boston.columns = boston.feature_names
df_boston['MEDV'] = boston['target']
df_boston_train, df_boston_test = train_test_split(df_boston, test_size=0.2, random_state=42)

# Tell auto_ml which column is 'output'
# Also note columns that aren't purely numerical
# Examples include ['nlp', 'date', 'categorical', 'ignore']
column_descriptions = {
    'MEDV': 'output',
    'CHAS': 'categorical'
}

ml_predictor = Predictor(type_of_estimator='regressor', column_descriptions=column_descriptions)

ml_predictor.train(df_boston_train)

# Score the model on test data
test_score = ml_predictor.score(df_boston_test, df_boston_test.MEDV)

# auto_ml is specifically tuned for running in production
# It can get predictions on an individual row (passed in as a dictionary)
# A single prediction like this takes ~1 millisecond
# Here we will demonstrate saving the trained model, and loading it again
file_name = ml_predictor.save()

# dill is a drop-in replacement for pickle that handles functions better
with open(file_name, 'rb') as read_file:
    trained_model = dill.load(read_file)

# .predict and .predict_proba take in either:
# A pandas DataFrame
# A list of dictionaries
# A single dictionary (optimized for speed in production environments)
predictions = trained_model.predict(df_boston_test)
print(predictions)
```
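
The comments at the end of the example say that `.predict` accepts either a single dictionary or a list of dictionaries. Here is a minimal sketch of that calling convention, written without `auto_ml` so it runs on its own. It is not the library's implementation: `as_rows`, `predict_linear`, and the weight values are hypothetical and only there to show the shape of the inputs and outputs.

```python
def as_rows(data):
    """Wrap a single dictionary in a list; pass a list of dictionaries through.

    Returns (rows, is_single) so the caller can unwrap the result again.
    """
    if isinstance(data, dict):
        return [data], True
    return list(data), False


def predict_linear(data, weights, intercept=0.0):
    """Hypothetical predictor: intercept plus a weighted sum of known columns."""
    rows, is_single = as_rows(data)
    predictions = [
        intercept + sum(weight * row.get(column, 0.0) for column, weight in weights.items())
        for row in rows
    ]
    # A single dictionary in means a single number out, not a one-item list.
    return predictions[0] if is_single else predictions


weights = {'RM': 2.0}  # made-up coefficient, for illustration only
single = predict_linear({'RM': 3.0}, weights, intercept=1.0)
batch = predict_linear([{'RM': 1.0}, {'RM': 0.5}], weights, intercept=1.0)
print(single)
print(batch)
```

Normalizing the input once at the top keeps the prediction logic identical for both cases, which is one plausible way to keep the single-row path cheap.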

### Advice

Before you go any further, try running the code. Load up some data (either a DataFrame, or a list of dictionaries, where each dictionary is a row of data). Make a `column_descriptions` dictionary that tells us which attribute name in each row represents the value we're trying to predict. Pass all that into `auto_ml`, and see what happens!

Everything else in these docs assumes you have done at least the above. Start there and everything else will build on top. But this part gets you the output you're probably interested in, without unnecessary complexity.
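
If you are starting from a list of dictionaries, it can help to sanity-check your `column_descriptions` before training. The sketch below is an assumption-laden helper, not part of `auto_ml`: `check_column_descriptions` and `KNOWN_TYPES` are hypothetical names, and the set of types only contains the ones mentioned in the example above, so the real list may well be longer.

```python
# Column types mentioned in the Getting Started example; treated here as an assumption.
KNOWN_TYPES = {'output', 'nlp', 'date', 'categorical', 'ignore'}


def check_column_descriptions(rows, column_descriptions):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    outputs = [col for col, kind in column_descriptions.items() if kind == 'output']
    if len(outputs) != 1:
        problems.append('expected exactly one output column, found %d' % len(outputs))
    for col, kind in column_descriptions.items():
        if kind not in KNOWN_TYPES:
            problems.append('unknown type %r for column %r' % (kind, col))
        if not any(col in row for row in rows):
            problems.append('column %r does not appear in any row' % col)
    return problems


# Each dictionary is one row of data.
rows = [
    {'RM': 6.5, 'CHAS': 0, 'MEDV': 24.0},
    {'RM': 5.9, 'CHAS': 1, 'MEDV': 20.0},
]

good = check_column_descriptions(rows, {'MEDV': 'output', 'CHAS': 'categorical'})
bad = check_column_descriptions(rows, {'PRICE': 'output'})
print(good)
print(bad)
```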

## Docs

The full docs are available at https://auto-ml.readthedocs.io
Again though, I'd strongly recommend running this on an actual dataset before referencing the docs any further.

## What this project does

Automates the whole machine learning process, making it super easy to use both for analytics and for getting real-time predictions in production.

A quick overview, in buzzwords, of what this project automates:

- Analytics (pass in data, and auto_ml will tell you the relationship of each variable to what it is you're trying to predict).
- Feature Engineering (particularly around dates, and NLP).
- Robust Scaling (scaling all values into the range 0 to 1, in a way that is robust to outliers and works with sparse data).
- Feature Selection (picking only the features that actually prove useful).
- Data Formatting (turning a DataFrame or a list of dictionaries into a sparse matrix, one-hot encoding categorical variables, taking the natural log of y for regression problems, etc.).
- Model Selection (which model works best for your problem; we try roughly a dozen apiece for classification and regression problems, including favorites like XGBoost if it's installed on your machine).
- Hyperparameter Optimization (what hyperparameters work best for that model).
- Ensembling (Train up a bunch of different estimators, then train a final estimator to intelligently aggregate them together. Also useful if you're just trying to compare many different models and see what works best.)
- Big Data (feed it lots of data; it's fairly efficient with resources).
- Unicorns (you could conceivably train it to predict what is a unicorn and what is not).
- Ice Cream (mmm, tasty...).
- Hugs (this makes it much easier to do your job, hopefully leaving you more time to hug those you care about).
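
To make a few of the items above concrete, here is a rough, pure-Python sketch of three of the formatting steps: one-hot encoding, outlier-robust scaling into the range 0 to 1, and taking the natural log of y. None of this is `auto_ml`'s actual code. The function names are hypothetical, and clipping to the 10th and 90th percentiles is just one assumed way to make the scaling robust to outliers.

```python
import math


def percentile(sorted_values, fraction):
    """Nearest-rank percentile on an already sorted list (fraction in [0, 1])."""
    index = int(round(fraction * (len(sorted_values) - 1)))
    return sorted_values[index]


def robust_scale(values, lower=0.1, upper=0.9):
    """Clip to the [lower, upper] percentiles, then rescale to [0, 1]."""
    ordered = sorted(values)
    low, high = percentile(ordered, lower), percentile(ordered, upper)
    if high == low:
        return [0.0 for _ in values]
    return [(min(max(v, low), high) - low) / (high - low) for v in values]


def one_hot(rows, column):
    """Replace a categorical column with one 0/1 column per observed category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for category in categories:
            new_row['%s=%s' % (column, category)] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


def log_target(y_values):
    """Natural log of the target; map predictions back with math.exp."""
    return [math.log(y) for y in y_values]


# The outlier (1000) is clipped, so it does not squash the other values toward 0.
scaled = robust_scale([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1000])
encoded = one_hot([{'RM': 6.0, 'CHAS': 'no'}, {'RM': 7.0, 'CHAS': 'yes'}], 'CHAS')
logged = log_target([1.0, math.e])
print(scaled)
print(encoded)
print(logged)
```

With plain min-max scaling, the single value of 1000 would push every other value below 0.01; clipping first keeps the bulk of the data spread across the full range.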



### Running the tests

If you've cloned the source code and are making any changes (highly encouraged!), or just want to make sure everything works in your environment, run
`nosetests -v tests`.

The tests are pretty comprehensive, though as with everything with auto_ml, I happily welcome your contributions here!
