
Package to orchestrate architecture in AWS

SageCreator

SageCreator is a package that simplifies cluster setup for machine learning in AWS.
It does all the heavy lifting to get a cluster up and running in a matter of minutes on any AWS instance type.
It uses spot instances by default, which can significantly reduce the total cost of running the cluster.
If spot instances are not available or the specified spot instance price is too low, it falls back to on-demand instances.

You can access a Jupyter notebook that runs your code against the provisioned server(s). See Jupyter access for more info.

Installation

Install and update using pip:

$ pip install sagecreator

Python 3.5.0 or later is required. It is highly recommended to install and run the package in a virtualenv.

Prerequisites

AWS Account

To provision the cluster you need an AWS Account and an IAM user with:

  • Access Key ID

  • Secret Access Key

The user should either belong to the Administrators group, as described in the IAM user tutorial, or be assigned a custom IAM policy (see Custom IAM policy).

Execution

After the installation, configure the tool by specifying configuration parameters:

$ sage configure
Access key id: <AWS Access Key ID>
Secret access key: <AWS Secret Access Key>
Company: <Name of your organization>
Owner: <Name of your team>
Key pair name: <Name of the key pair> (Optional - if NOT provided it will be created with a new private key)
Private key file: <Absolute path to private key file> (required only if Key pair name was provided)
Company and Owner are required, as is Service (requested during the provision step) - all three are used as tags for each instance in the cluster.
Key pair name and Private key file are optional - if provided, the given key pair and private key file are used to provision the cluster; otherwise a new key pair and private key are created.
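
The key pair rule above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not the tool's actual code: the function name resolve_key_settings and its return values are hypothetical, and treating "only one of the two values supplied" as an error is an assumption.

```python
def resolve_key_settings(key_pair_name=None, private_key_file=None):
    """Decide which key material the cluster would be provisioned with.

    Returns "create-new" when nothing was supplied and "use-existing"
    when both values were supplied.
    """
    if not key_pair_name and not private_key_file:
        return "create-new"
    if key_pair_name and private_key_file:
        return "use-existing"
    # Assumption: supplying only one of the two values is treated as an error.
    raise ValueError("Key pair name and Private key file must be provided together")


# Hypothetical values, for illustration only.
print(resolve_key_settings())
print(resolve_key_settings("my-key-pair", "/home/me/.ssh/my-key-pair.pem"))
```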

Provision the cluster.
The provision step can take up to 20 minutes, depending on the network connection, cluster size, and instance type.
$ sage provision
Service: <Name of your service>
Instance type [t3.small]: <Instance type> (Optional, defaults to t3.small)
Spot instance price [0.1]: <Spot instance price> (Optional, defaults to $0.1 per instance)
Cluster size [1]: <Cluster size> (Optional, defaults to 1 node)
https://s3.amazonaws.com/romanjoffee/github/sagecreator/provision1080.gif
Important:
The tool provides NO guarantee that the instance(s) will be provisioned at the specified spot instance price.
If the specified price is lower than the current AWS spot instance price, on-demand instance(s) will be provisioned instead.
It is therefore up to the user to ensure that the specified price is high enough for the spot request to be fulfilled.
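
The fallback rule described above can be written as a small decision function. This is a minimal sketch, not the tool's implementation (which is written in Ansible): the function name choose_purchase_option, its parameters, and the example market prices are hypothetical.

```python
def choose_purchase_option(bid_price, current_spot_price, spot_capacity_available=True):
    """Return "spot" when the bid can be fulfilled, otherwise "on-demand"."""
    # A bid lower than the current spot price, or missing spot capacity,
    # leads to the on-demand fallback.
    if spot_capacity_available and bid_price >= current_spot_price:
        return "spot"
    return "on-demand"


# The default bid of 0.1 against two made-up market prices.
print(choose_purchase_option(0.1, 0.03))
print(choose_purchase_option(0.1, 0.25))
```

Because on-demand prices are usually higher than spot prices, it can be worth comparing your bid against the current spot price for the chosen instance type before provisioning.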

Display the path of the cluster configuration file.
Editing this file is not required, but you can customize it manually before running the provision step.
$ sage pwd

Terminate the cluster. This operation terminates all cluster nodes whose tags match the tuple (Company, Owner, Service).
$ sage terminate
Service: <Name of your service to terminate>
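
To make the tag matching concrete, here is a sketch that builds EC2 tag filters for the (Company, Owner, Service) tuple in the shape accepted by the Filters parameter of boto3's describe_instances. The helper name build_tag_filters is hypothetical, the exact tag keys the tool writes are assumed to be "Company", "Owner", and "Service", and the tool itself may select instances differently.

```python
def build_tag_filters(company, owner, service):
    """Build EC2 filters that match instances carrying all three tags."""
    # Assumption: the tag keys are exactly "Company", "Owner", "Service".
    tags = [("Company", company), ("Owner", owner), ("Service", service)]
    missing = [key for key, value in tags if not value]
    if missing:
        raise ValueError("Missing required tag(s): " + ", ".join(missing))
    # Multiple filters are combined with AND, so an instance must
    # carry all three tag values to match.
    return [{"Name": "tag:{}".format(key), "Values": [value]} for key, value in tags]


# Hypothetical tag values, for illustration only.
for tag_filter in build_tag_filters("acme", "ml-team", "training"):
    print(tag_filter)
```

Listing the matching instances with such filters before running sage terminate is one way to check which nodes would be affected.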

Jupyter access

Once the provisioning step is done and the cluster is up, you can access the Jupyter notebook in your browser at http://localhost:9000.
A sample notebook is provided; it trains a CNN on the Fashion MNIST dataset using Keras.
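
If the page does not load, a quick way to see whether anything is listening on the port is a plain TCP check. This is a small sketch using only the Python standard library; the helper name is_port_open is hypothetical, and the defaults simply mirror the address given above.

```python
import socket


def is_port_open(host="localhost", port=9000, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False


print("Jupyter port reachable:", is_port_open())
```

A True result only shows that the port accepts connections, not that the notebook server behind it is healthy.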

Under the hood

The logic that orchestrates the cluster is written in Ansible.

Custom IAM policy

Instead of assigning the user to the Administrators group, which has access to all AWS services (as described in the IAM user tutorial), you can create a separate group named Provisioners with a more restrictive policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "ec2:*",
      "Effect": "Allow",
      "Resource": "*"
    },
    {
      "Action": "rds:*",
      "Effect": "Allow",
      "Resource": "*"
    },
    {
      "Action": "route53:*",
      "Effect": "Allow",
      "Resource": "*"
    }
  ]
}

Then assign the user to the Provisioners group, which has access to the subset of AWS services sufficient to orchestrate the cluster.
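
If you prefer to generate the policy document rather than paste it, the same JSON can be produced with a few lines of Python. This is a sketch only: the function name build_provisioner_policy is hypothetical, and the service list is taken directly from the policy shown above.

```python
import json


def build_provisioner_policy(services=("ec2", "rds", "route53")):
    """Build a policy document allowing every action of each listed service."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Action": "{}:*".format(service), "Effect": "Allow", "Resource": "*"}
            for service in services
        ],
    }


# Print the document so it can be pasted into the IAM console.
print(json.dumps(build_provisioner_policy(), indent=2))
```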

SSH access

If Key pair name / Private key file were NOT provided when configuring the cluster, a default key pair is created and a new private key is stored locally.
To ssh into the servers, point ssh to the correct private key file:
$ ssh -i <path to private key file> ubuntu@<host>

where the path to the private key file is ../venv/lib/python3.X/site-packages/sagebase/.ssh/pkey.pem
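
The command above can also be assembled from Python, for example when scripting access to several nodes. This is a minimal sketch under stated assumptions: the helper names ssh_command and restrict_key_permissions are hypothetical, and the key path and host in the example are placeholders. The permission helper is included because ssh refuses private keys that are readable by other users.

```python
import os
import shlex
import stat


def ssh_command(private_key_file, host, user="ubuntu"):
    """Return the ssh invocation as an argument list."""
    return ["ssh", "-i", private_key_file, "{}@{}".format(user, host)]


def restrict_key_permissions(private_key_file):
    """Make the key readable and writable by its owner only (mode 600)."""
    os.chmod(private_key_file, stat.S_IRUSR | stat.S_IWUSR)


# Placeholder key path and host, for illustration only.
command = ssh_command("/path/to/pkey.pem", "203.0.113.10")
print(" ".join(shlex.quote(part) for part in command))
```

The argument list can be passed to subprocess.run(command) to open the session.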
