RedBrick AI and AWS Sagemaker integration

redbrick-sagemaker

This package integrates RedBrick AI with AWS SageMaker to enable end-to-end Active Learning on computer vision datasets.

Active Learning labels your data in order of expected information gain for your model. Because you label only the images that help your model improve, this strategy can substantially reduce the amount of data you have to label.
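To make "order of information gain" concrete, here is a minimal sketch of one common Active Learning heuristic, entropy-based uncertainty sampling. This is an illustrative assumption, not necessarily the prioritization that redbrick_sagemaker uses internally; the function names and the sample predictions are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def rank_by_uncertainty(predictions):
    """Return item ids ordered from most to least uncertain prediction."""
    return sorted(
        predictions,
        key=lambda item_id: entropy(predictions[item_id]),
        reverse=True,
    )


# Hypothetical class probabilities for three unlabeled images.
predictions = {
    "img_a": [0.98, 0.01, 0.01],  # confident prediction: low labeling priority
    "img_b": [0.34, 0.33, 0.33],  # near-uniform prediction: high labeling priority
    "img_c": [0.70, 0.20, 0.10],
}

labeling_order = rank_by_uncertainty(predictions)
print(labeling_order)
```

Images the model is least sure about come first in the labeling queue, so each new label is more likely to change the model.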

This package will help you run a full end-to-end process where you will be able to iteratively label your dataset and train your model in true Active Learning fashion.

Setup

Install the redbrick_sagemaker package

pip install redbrick_sagemaker

Create an s3 bucket

You will need an S3 bucket where your training and model files will be stored. Please follow this tutorial to create an S3 bucket through the CLI, SDK, or AWS console.

NOTE: Include "sagemaker" in the bucket name so that you can conditionally grant access to it. For example: redbrick-sagemaker-bucket.
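If you prefer the SDK route, the following is a minimal sketch using boto3. It assumes boto3 is installed and AWS credentials are configured; the helper names are hypothetical, and the name check reflects the note above rather than any rule enforced by S3 itself.

```python
def create_bucket_kwargs(bucket_name, region):
    """Build the arguments for S3 create_bucket.

    us-east-1 is the default region and must not be passed as a
    LocationConstraint; every other region must be.
    """
    kwargs = {"Bucket": bucket_name}
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def create_sagemaker_bucket(bucket_name, region):
    """Create the bucket, insisting that its name contains 'sagemaker'."""
    if "sagemaker" not in bucket_name:
        raise ValueError("bucket name should contain 'sagemaker'")
    # Imported lazily so the helper above can be used without boto3 installed.
    import boto3

    s3 = boto3.client("s3", region_name=region)
    return s3.create_bucket(**create_bucket_kwargs(bucket_name, region))


# Example call (requires AWS credentials; the region is an assumption):
# create_sagemaker_bucket("redbrick-sagemaker-bucket", "us-west-2")
```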

Create sagemaker role

NOTE: We recommend running the redbrick_sagemaker package within a SageMaker Notebook instance. If you do this, you won't have to create a SageMaker Execution Role.

If you're running redbrick_sagemaker outside of a SageMaker Notebook, you need to create a SageMaker Execution Role to allow SageMaker to perform operations on your behalf. Please see this tutorial for creating an AmazonSageMakerFullAccess role. After creating the role, make a note of its ARN.
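As an alternative to the console, the role can be created programmatically. The sketch below uses boto3 and assumes your credentials are allowed to create IAM roles; the function names and the role name in the example call are hypothetical.

```python
import json

# AWS managed policy referenced in the paragraph above.
SAGEMAKER_FULL_ACCESS_ARN = "arn:aws:iam::aws:policy/AmazonSageMakerFullAccess"


def sagemaker_trust_policy():
    """Trust policy that lets the SageMaker service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "sagemaker.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_execution_role(role_name):
    """Create the role, attach the managed policy, and return the role ARN."""
    # Imported lazily so the trust policy helper works without boto3 installed.
    import boto3

    iam = boto3.client("iam")
    response = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(sagemaker_trust_policy()),
    )
    iam.attach_role_policy(RoleName=role_name, PolicyArn=SAGEMAKER_FULL_ACCESS_ARN)
    return response["Role"]["Arn"]


# Example call (requires AWS credentials):
# role = create_execution_role("redbrick-sagemaker-execution-role")
```

The returned ARN is the value to use for the role setting in the next section.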

Use

Standard RedBrick AI set up:

api_key="TODO"
org_id="TODO"
project_id="TODO"

# The bucket where sagemaker will read/write predictions and training input/outputs.
s3_bucket_name="TODO"
s3_bucket_prefix="TODO"

# OPTIONAL: Add the SageMaker execution role ARN you created here.
# Only required if you are running redbrick_sagemaker outside of an AWS SageMaker notebook instance.
# If running inside a SageMaker notebook, set role=None
role="TODO"
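Rather than hard-coding the API key in a notebook, you may prefer to read these settings from environment variables. The sketch below is one way to do that; the environment variable names and the load_settings helper are hypothetical and not part of redbrick_sagemaker.

```python
import os

# Hypothetical environment variable names mapped to the settings above.
REQUIRED_VARS = {
    "api_key": "REDBRICK_API_KEY",
    "org_id": "REDBRICK_ORG_ID",
    "project_id": "REDBRICK_PROJECT_ID",
    "s3_bucket_name": "S3_BUCKET_NAME",
    "s3_bucket_prefix": "S3_BUCKET_PREFIX",
}


def load_settings(env=None):
    """Read the settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    missing = [var for var in REQUIRED_VARS.values() if not env.get(var)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    settings = {name: env[var] for name, var in REQUIRED_VARS.items()}
    # The role is optional: None means "running inside a SageMaker notebook".
    settings["role"] = env.get("SAGEMAKER_ROLE_ARN") or None
    return settings
```

Calling load_settings() at the top of a notebook fails fast with the list of missing variables instead of passing a placeholder value on to AWS.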

Create a RedBrick AI Active Learning object:

import redbrick_sagemaker

active_learner = redbrick_sagemaker.ActiveLearner(
    api_key,
    org_id,
    project_id,
    s3_bucket=s3_bucket_name,
    s3_bucket_prefix=s3_bucket_prefix,
    iam_role=role
)

Begin an Active Learning cycle. Running this for the first time will start a hyperparameter optimization job to train your model.

active_learner.run()

Check on the status of your hyperparameter job.

active_learner.describe()

You can view the detailed progress of your hyperparameter tuning job in the AWS console.

Sagemaker AWS console.

Once your hyperparameter job is complete, you can run again to perform inference and update Active Learning priorities.

active_learner.run()

If your hyperparameter job is still processing, but at least one of its model training jobs has completed, you can force an inference run.

active_learner.run(force_run=True)

To run training and inference synchronously in one go:

active_learner.run(wait=True)
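The choice between these calls can be summarized as a small decision helper. This is only a sketch of the rules described above: choose_run_kwargs and its boolean inputs are hypothetical, and you would determine their values yourself, for example from the AWS console, since the return value of describe() is not documented here.

```python
def choose_run_kwargs(job_started, job_complete, has_finished_model):
    """Return the keyword arguments for run(), or None to keep waiting.

    job_started: a hyperparameter tuning job has been launched.
    job_complete: that tuning job has finished.
    has_finished_model: at least one model training job in it has finished.
    """
    if not job_started or job_complete:
        # First cycle (starts tuning) or tuning finished (runs inference).
        return {}
    if has_finished_model:
        # Tuning still in progress, but a model is available for inference.
        return {"force_run": True}
    # Tuning in progress and no model yet: nothing useful to run.
    return None


# Example use with the learner object created earlier:
# kwargs = choose_run_kwargs(job_started=True, job_complete=False, has_finished_model=True)
# if kwargs is not None:
#     active_learner.run(**kwargs)
```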

Please see the flowchart below for an explanation of the different states and flows.

RedBrick Sagemaker active learning flow.
