The A2ML ("Automate AutoML") project is a set of scripts to automate Automated Machine Learning tools from multiple vendors.

a2ml - Automation of AutoML

The A2ML ("Automate AutoML") project is a Python API and set of command line tools to automate Automated Machine Learning tools from multiple vendors. The intention is to provide a common API for all Cloud-oriented AutoML vendors. Data scientists can then train their datasets against multiple AutoML models to get the best possible predictive model. May the best "algorithm/hyperparameter search" win.

The PREDIT Pipeline

Every AutoML vendor has its own API for managing datasets and for creating and managing predictive models. These APIs are similar but not identical, and they share a common set of stages:

Import data for training
Train models with multiple algorithms and hyperparameters
Evaluate model performance and choose one or more for deployment
Deploy selected models
Predict results with new data against deployed models
Review performance of deployed models

Since ITEDPR is hard to remember, we refer to this pipeline by its conveniently mnemonic anagram: "PREDIT" (French for "predict"). The A2ML project provides classes that implement this pipeline for various Cloud AutoML providers, and a command line interface that invokes stages of the pipeline.
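As a quick illustration of the mnemonic, the sketch below lists the six stages in pipeline order and checks that their initials rearrange into "PREDIT". The names PIPELINE_STAGES and stage_initials are made up for this example and are not part of the A2ML API.

```python
# Stage names in the order the pipeline runs them (taken from the list above).
PIPELINE_STAGES = ["import", "train", "evaluate", "deploy", "predict", "review"]


def stage_initials(stages):
    """Return the upper-cased first letters of the stage names, in order."""
    return "".join(name[0].upper() for name in stages)


# Initials in execution order, plus a check that PREDIT is an anagram of them.
initials = stage_initials(PIPELINE_STAGES)
is_anagram = sorted(initials) == sorted("PREDIT")
print(initials, is_anagram)
```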

Command Line Interface

The command line is a convenient way to start an A2ML project even if you plan to use the API.

Creating a New A2ML Project

Specifically, you can start a new A2ML project with the new command, supplying a project name. A2ML will create a directory containing a default set of configuration files that you can then customize.

$ a2ml new test_app

Configuring Your A2ML Project

Before you use the Python API or the command line interface for specific PREDIT pipeline steps, you need to configure your project. This includes both general options that apply to all vendors and vendor-specific options in separate YAML files.

After a new A2ML application is created, the application configuration shared by all providers is stored in CONFIG.YAML. The options available include:

name - the name of the project
provider - the AutoML provider: GC (for Google Cloud), AZ (for Microsoft Azure), or Auger
source - the CSV file to train with. This can be a local file path (for Auger or Azure), a hosted file URL, or a Google Cloud Storage URL ("gs://...") for Google Cloud AutoML.
exclude - features from the dataset to exclude from the model
target - the feature which is the target
model_type - the type of model: regression, classification or timeseries
budget - the time budget in milliseconds to train

Examples of options which apply to specific vendors include:

region - the region for the AutoML provider's compute clusters; each vendor has different names for its regions
metric - how to measure the accuracy of the model when searching over algorithms; each vendor has different names for its metrics

Here is an example CONFIG.YAML with options that apply to all AutoML providers:

name: moneyball
providers: google,azure,auger
source: gs://moneyball/baseball.csv
exclude: Team,League,Year
target: 'RS'
model_type: regression
budget: 3600
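To show how these options might be consumed from Python, here is a minimal sketch that reads the example configuration above into a dictionary and splits the comma-separated fields. A real project would load the file with a YAML library; the hand-rolled parse_flat_config helper below is hypothetical, is not an A2ML function, and only handles flat "key: value" lines so that the sketch runs with the standard library alone.

```python
# The example CONFIG.YAML from above, inlined so the sketch is self-contained.
CONFIG_TEXT = """\
name: moneyball
providers: google,azure,auger
source: gs://moneyball/baseball.csv
exclude: Team,League,Year
target: 'RS'
model_type: regression
budget: 3600
"""


def parse_flat_config(text):
    """Parse flat 'key: value' lines into a dict of strings (no nesting)."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so URLs such as gs://... stay intact.
        key, _, value = line.partition(":")
        options[key.strip()] = value.strip().strip("'\"")
    return options


config = parse_flat_config(CONFIG_TEXT)
providers = config["providers"].split(",")  # list of provider names
excluded = config["exclude"].split(",")     # features left out of the model
budget = int(config["budget"])              # training time budget
print(providers, excluded, config["target"], budget)
```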

GOOGLE.YAML Configuration

Here is an example provider-specific configuration file (google.yaml) for Google Cloud AutoML for this project:

region: us-central1
metric: MINIMIZE_MAE
project: automl-test-237311
dataset_id: TBL1889796605356277760
operation_id: TBL2145477039279308800
operation_name: projects/291533092938/locations/us-central1/operations/TBL4473943599746121728
model_name: projects/291533092938/locations/us-central1/models/TBL1517370026795991040

AUGER.YAML

Here is an example configuration file (auger.yaml) for Auger.AI:

project: test_app
dataset: some_test_data

experiment:
  cross_validation_folds: 5
  max_total_time: 60
  max_eval_time: 1
  max_n_trials: 10
  use_ensemble: true
  metric: f1_macro

cluster:
  type: high_memory
  min_nodes: 1
  max_nodes: 4
  stack_type: experimental

Once your project is configured with these YAML files you can skip ahead to the Using the A2ML API section if you want to start using the A2ML Python API.

The A2ML CLI Commands Available

Below is the full set of commands provided by A2ML. Command line options are provided for each stage in the PREDIT Pipeline.

Usage:

$ a2ml [OPTIONS] COMMAND [ARGS]...

Commands:

new         Create new A2ML application.
import      Import data for training.
train       Train the model.
evaluate    Evaluate models after training.
deploy      Deploy trained model.
predict     Predict with deployed model.
review      Review specified model info.
project     Project(s) management
dataset     Dataset(s) management
experiment  Experiment(s) management
model       Model(s) management

To get detailed information on available options for each command, please run:

$ a2ml command --help
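If you want to drive the CLI from a script, one possible approach is to build the argument list in Python and hand it to subprocess. The sketch below only constructs the "--help" call for a command from the list above and runs it if the a2ml executable is found on the PATH. The helper name build_help_call is hypothetical; per-command arguments are deliberately left out because they are not documented here.

```python
import shutil
import subprocess

# Command names copied from the command list above.
A2ML_COMMANDS = {
    "new", "import", "train", "evaluate", "deploy", "predict",
    "review", "project", "dataset", "experiment", "model",
}


def build_help_call(command):
    """Return the argv list for `a2ml COMMAND --help`, rejecting unknown names."""
    if command not in A2ML_COMMANDS:
        raise ValueError(f"unknown a2ml command: {command!r}")
    return ["a2ml", command, "--help"]


if __name__ == "__main__":
    argv = build_help_call("train")
    if shutil.which("a2ml"):
        # Only run the CLI when it is actually installed.
        subprocess.run(argv, check=False)
    else:
        print("a2ml is not on PATH; would run:", " ".join(argv))
```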

Using the A2ML API

After you have configured the YAML files as shown above (whether from scratch or using the templates provided by "a2ml new") you can use the API to import, train, evaluate, deploy, predict and review (the PREDIT pipeline). These configured files should be in the directory you are running from.

In your Python code, you first need to retrieve the configuration by creating a Context() object. Then you can create a client from the A2ML class. From that client object you execute the various PREDIT pipeline methods (starting with "import_data"). Below is example Python code for this.

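from a2ml.api.a2ml import A2ML
from a2ml.api.utils.context import Context

# Load the project configuration from the YAML files in the current directory.
ctx = Context()
# Create the A2ML client from that configuration.
a2ml = A2ML(ctx)
# First PREDIT stage: import the training data.
result = a2ml.import_data()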
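To run several stages in sequence, one option is a small helper that calls each stage method on the client in order. This is only a sketch under stated assumptions: import_data appears in the example above, but the other method names are assumed to match the stage names, and the helper assumes the methods it calls take no arguments (stages that need arguments would have to be called directly). The names run_stages and RecordingClient are hypothetical; the stand-in client lets the sketch run without any provider configured.

```python
def run_stages(client, stages):
    """Call each named stage method on `client` in order and collect results."""
    results = {}
    for name in stages:
        method = getattr(client, name, None)
        if method is None:
            raise AttributeError(f"client has no stage method {name!r}")
        results[name] = method()
    return results


class RecordingClient:
    """Stand-in for an A2ML client; it only records which stages ran."""

    def __init__(self):
        self.calls = []

    def import_data(self):
        self.calls.append("import_data")
        return "imported"

    def train(self):
        self.calls.append("train")
        return "trained"


client = RecordingClient()
results = run_stages(client, ["import_data", "train"])
print(client.calls)
print(results)
```

With the real library, the same helper would be given the A2ML(ctx) client instead of the stand-in, provided the assumed method names hold for your installed version.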

Development Setup

We strongly recommend working in a Python virtual environment:

$ pip install virtualenv virtualenvwrapper

Clone A2ML:

$ git clone https://github.com/augerai/a2ml.git

Set up dependencies and the A2ML command line:

$ pip install -e ".[all]"

Running tests and getting test coverage:

$ tox

Authentication with A2ML

Authentication with A2ML involves two parts. First, there is authentication between your client (whether it's the a2ml CLI or the A2ML Python API) and the service endpoint (either self-hosted or with Auger.AI). Second, there is authentication between the service endpoint and each provider. Note that when you run A2ML locally, endpoint authentication is handled automatically. The table at the end of this section shows this in more detail.

Authenticating with Auger.AI

You can log in to the Auger.AI endpoint and provider with the a2ml auth login command.

$ a2ml auth login

You will be prompted for your Auger service user and password. You can also download your Auger credentials as a credentials.json file and refer to it with an AUGER_CREDENTIALS environment variable.

export AUGER_CREDENTIALS=~/auger_credentials.json

You can also put the path to credentials.json in an environment variable called AUGER_CREDENTIALS_PATH or in a key inside AUGER.YAML.

The Auger service can manage your usage of Google Cloud AutoML or Azure AutoML for you. If you choose to set up your own endpoints, you must configure the underlying AutoML service correctly so that it can be accessed from the server you are running from. Here are abbreviated directions for that step for Google and Azure.

Google Cloud AutoML

If you haven't run Google Cloud AutoML before, set up a service account and save the credentials to a JSON file, which you store in your project directory. Then set the GOOGLE_APPLICATION_CREDENTIALS environment variable to point to the saved file. For example:

export GOOGLE_APPLICATION_CREDENTIALS="/Users/adamblum/a2ml/automl.json"

For ease of use, you can set a default project ID for your project with the PROJECT_ID environment variable. For example:

export PROJECT_ID="automl-test-237311"

Detailed instructions for setting up Google Cloud AutoML are available in the Google Cloud documentation.

Azure AutoML

The Azure AutoML service allows credentials to be downloaded as a JSON file (such as a config.json file). This should then be placed in a .azureml subdirectory of your project directory. Be sure to include this file in your .gitignore:

**/.azureml/config.json

The Azure subscription ID can be set with the AZURE_SUBSCRIPTION_ID environment variable as in the following example.

export AZURE_SUBSCRIPTION_ID="d1b17dd2-ba8a-4492-9b5b-10c6418420ce"
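Before starting a run, it can help to check that the environment variables named in this section are set. The sketch below is one way to do that. The grouping of variables by provider and the helper name missing_env_vars are assumptions made for this example, and the check is stricter than A2ML itself: PROJECT_ID is described above as a convenience, and Auger credentials can also come from a2ml auth login.

```python
import os

# Environment variables named in this section, grouped by provider.
PROVIDER_ENV_VARS = {
    "google": ["GOOGLE_APPLICATION_CREDENTIALS", "PROJECT_ID"],
    "azure": ["AZURE_SUBSCRIPTION_ID"],
    "auger": ["AUGER_CREDENTIALS"],
}


def missing_env_vars(provider, environ=None):
    """Return the variables for `provider` that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in PROVIDER_ENV_VARS[provider] if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
example_env = {"GOOGLE_APPLICATION_CREDENTIALS": "/path/to/automl.json"}
print(missing_env_vars("google", example_env))
```

Calling missing_env_vars("azure") with no second argument checks the real process environment.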

A2ML Authentication Components

The following shows which authentication components are necessary depending on your A2ML use case:

Provider credentials required?

Endpoint               Auger.AI AutoML   Azure AutoML   Google Cloud AutoML
Auger.AI Endpoint      Yes               No             No
Self-Hosted Endpoint   Yes               Yes            Yes

Implementing A2ML for Another AutoML Provider

The A2ML Model class in A2ML.PY abstracts out the PREDIT (ITEDPR) pipeline. Implementations are provided for Google Cloud AutoML Tables (GCModel), Azure AutoML (AZModel) and Auger.AI (Auger). If you want to add support for another AutoML provider of your choice, implement a child class of Model as shown below (replacing each "pass" with your own code).

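# Model is the A2ML base class described above.
class AnotherAutoMLModel(Model):
    def __init__(self):
        pass  # set up provider clients and configuration

    def import_data(self):
        pass  # import data for training

    def train(self):
        pass  # train models

    def evaluate(self):
        pass  # evaluate model performance

    def deploy(self):
        pass  # deploy the selected model

    def predict(self, filepath, score_threshold):
        pass  # predict results for the data in filepath

    def review(self):
        pass  # review performance of the deployed model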
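To make the shape of such a subclass concrete, here is a toy provider that fills in every stage with trivial logic: it "trains" by computing the mean of the target column and predicts that mean for every row. Everything in it is illustrative. The Model base class is a local stand-in defined with abc so the sketch runs on its own (it is not the real A2ML class), and MeanBaselineModel, its constructor arguments and its return values are all hypothetical.

```python
import csv
from abc import ABC, abstractmethod


class Model(ABC):
    """Local stand-in for the A2ML Model base class (not the real one)."""

    @abstractmethod
    def import_data(self): ...

    @abstractmethod
    def train(self): ...

    @abstractmethod
    def evaluate(self): ...

    @abstractmethod
    def deploy(self): ...

    @abstractmethod
    def predict(self, filepath, score_threshold): ...

    @abstractmethod
    def review(self): ...


class MeanBaselineModel(Model):
    """Toy provider that always predicts the mean of the training target."""

    def __init__(self, source, target):
        self.source = source   # path to the training CSV
        self.target = target   # name of the target column
        self.values = []
        self.mean = None
        self.deployed = False

    def import_data(self):
        with open(self.source, newline="") as f:
            self.values = [float(row[self.target]) for row in csv.DictReader(f)]
        return len(self.values)

    def train(self):
        self.mean = sum(self.values) / len(self.values)
        return self.mean

    def evaluate(self):
        # Mean absolute error on the training data.
        return sum(abs(v - self.mean) for v in self.values) / len(self.values)

    def deploy(self):
        self.deployed = True
        return self.deployed

    def predict(self, filepath, score_threshold=None):
        # score_threshold is accepted to match the signature but is unused here.
        if not self.deployed:
            raise RuntimeError("deploy() must be called before predict()")
        with open(filepath, newline="") as f:
            return [self.mean for _ in csv.DictReader(f)]

    def review(self):
        return {"target": self.target, "rows": len(self.values), "mean": self.mean}
```

A caller would construct MeanBaselineModel("train.csv", "RS") and then invoke import_data, train, evaluate, deploy, predict and review in that order, mirroring the pipeline stages.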

a2ml 0.1.2
