Jupyter Notebook Contents Manager for AWS S3

Jupyter S3

Jupyter Notebook Contents Manager for AWS S3.

Installation

pip install jupyters3

Configuration

To configure Jupyter Notebook to use JupyterS3, you can add the following to your notebook config file.

from jupyters3 import JupyterS3, JupyterS3SecretAccessKeyAuthentication
c = get_config()
c.NotebookApp.contents_manager_class = JupyterS3

You must also set the following settings on c.JupyterS3 in your config file.

| Setting | Description | Example |
| --- | --- | --- |
| aws_region | The AWS region in which the bucket is located. | 'eu-west-1' |
| aws_s3_bucket | The name of the S3 bucket. | 'my-example-bucket' |
| aws_s3_host | The hostname of the AWS S3 API. Typically, this is of the form s3-<aws_region>.amazonaws.com. | 's3-eu-west-1.amazonaws.com' |
| prefix | The prefix to all keys used to store notebooks and checkpoints. This can be the empty string ''. If non-empty, typically this would end in a forward slash /. | 'some-prefix/' |
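
As a rough illustration of how the four settings above fit together, the sketch below builds them from a region, bucket and prefix. The helper name jupyters3_settings is hypothetical and not part of JupyterS3; the host is derived using the typical s3-<aws_region>.amazonaws.com form from the table, which may not hold for every region or endpoint.

```python
def jupyters3_settings(region, bucket, prefix=''):
    """Build the four required c.JupyterS3 settings (hypothetical helper)."""
    # A non-empty prefix would typically end in a forward slash
    if prefix and not prefix.endswith('/'):
        prefix += '/'
    return {
        'aws_region': region,
        'aws_s3_bucket': bucket,
        # Assumption: the typical host form given in the table above
        'aws_s3_host': 's3-{}.amazonaws.com'.format(region),
        'prefix': prefix,
    }


settings = jupyters3_settings('eu-west-1', 'my-example-bucket', 'some-prefix')

# In the notebook config file, where c = get_config() is available,
# the values could then be applied with:
#
#     for name, value in settings.items():
#         setattr(c.JupyterS3, name, value)
```

Deriving the host from the region keeps the two values from drifting apart, but setting aws_s3_host explicitly remains the safer choice if your endpoint does not follow the typical form.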

You must also configure authentication. One option is to authenticate using a secret key, in which case you must have the following configuration

from jupyters3 import JupyterS3SecretAccessKeyAuthentication
c.JupyterS3.authentication_class = JupyterS3SecretAccessKeyAuthentication

and the following settings on c.JupyterS3SecretAccessKeyAuthentication

| Setting | Description | Example |
| --- | --- | --- |
| aws_access_key_id | The ID of the AWS access key used to sign the requests to the AWS S3 API. | omitted |
| aws_secret_access_key | The secret part of the AWS access key used to sign the requests to the AWS S3 API. | omitted |
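
To avoid writing the key pair into the config file itself, one possible approach is to read it from the environment. The sketch below assumes the conventional AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY variable names; the helper aws_keys_from_env is hypothetical and not part of JupyterS3.

```python
import os


def aws_keys_from_env(environ=os.environ):
    """Read the access key pair from environment variables (hypothetical helper).

    Raises KeyError if either variable is missing, so a misconfigured
    server fails at startup rather than on the first request to S3.
    """
    return {
        'aws_access_key_id': environ['AWS_ACCESS_KEY_ID'],
        'aws_secret_access_key': environ['AWS_SECRET_ACCESS_KEY'],
    }


# In the notebook config file, where c = get_config() is available:
#
#     keys = aws_keys_from_env()
#     c.JupyterS3SecretAccessKeyAuthentication.aws_access_key_id = keys['aws_access_key_id']
#     c.JupyterS3SecretAccessKeyAuthentication.aws_secret_access_key = keys['aws_secret_access_key']
```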

Alternatively, you can authenticate using a role in an ECS container, in which case you must have the following configuration

from jupyters3 import JupyterS3ECSRoleAuthentication
c.JupyterS3.authentication_class = JupyterS3ECSRoleAuthentication

JupyterS3ECSRoleAuthentication has no configurable options. You can also write your own authentication class, such as the one below for EC2/IAM role-based authentication

import datetime
import json

from jupyters3 import (
    AwsCreds,
    JupyterS3Authentication,
)
from tornado import gen
from tornado.httpclient import (
    AsyncHTTPClient,
    HTTPRequest,
)
from traitlets import (
    Dict,
    Instance,
    Unicode,
)

class JupyterS3EC2RoleAuthentication(JupyterS3Authentication):

    role_name = Unicode(config=True)
    aws_access_key_id = Unicode()
    aws_secret_access_key = Unicode()
    pre_auth_headers = Dict()
    # Assumption: a traitlets Instance stands in for a Datetime trait, which
    # traitlets does not provide. The default lies far in the past, so the
    # first call always fetches credentials.
    expiration = Instance(datetime.datetime, args=(1900, 1, 1))

    @gen.coroutine
    def get_credentials(self):
        # The metadata service reports Expiration in UTC, so compare in UTC
        now = datetime.datetime.utcnow()

        # Only fetch new credentials once the cached ones have expired
        if now > self.expiration:
            # This is the IMDSv1 endpoint; instances that enforce IMDSv2
            # also require a session token on this request
            request = HTTPRequest('http://169.254.169.254/latest/meta-data/iam/security-credentials/' + self.role_name, method='GET')
            creds = json.loads((yield AsyncHTTPClient().fetch(request)).body.decode('utf-8'))
            self.aws_access_key_id = creds['AccessKeyId']
            self.aws_secret_access_key = creds['SecretAccessKey']
            # Temporary credentials must be accompanied by their session token
            self.pre_auth_headers = {
                'x-amz-security-token': creds['Token'],
            }
            self.expiration = datetime.datetime.strptime(creds['Expiration'], '%Y-%m-%dT%H:%M:%SZ')

        return AwsCreds(
            access_key_id=self.aws_access_key_id,
            secret_access_key=self.aws_secret_access_key,
            pre_auth_headers=self.pre_auth_headers,
        )

This class would be configured using

c.JupyterS3.authentication_class = JupyterS3EC2RoleAuthentication
c.JupyterS3EC2RoleAuthentication.role_name = 'my-iam-role-name'
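
The parsing and expiry check inside get_credentials can be tried without an EC2 instance. The following is a minimal standalone sketch of that logic using only the standard library; the function names and the sample response are made up for illustration, and the real metadata response contains additional fields.

```python
import datetime
import json

EXPIRATION_FORMAT = '%Y-%m-%dT%H:%M:%SZ'


def parse_credentials(body):
    """Extract the fields the authentication class uses from a JSON body."""
    creds = json.loads(body)
    return {
        'access_key_id': creds['AccessKeyId'],
        'secret_access_key': creds['SecretAccessKey'],
        'pre_auth_headers': {'x-amz-security-token': creds['Token']},
        # Naive datetime, to be interpreted as UTC (the trailing Z)
        'expiration': datetime.datetime.strptime(creds['Expiration'], EXPIRATION_FORMAT),
    }


def needs_refresh(expiration, now):
    """Credentials are refetched only once the cached ones have expired."""
    return now > expiration


# Made-up response body with placeholder values, for illustration only
sample_body = json.dumps({
    'AccessKeyId': 'EXAMPLE-ID',
    'SecretAccessKey': 'EXAMPLE-SECRET',
    'Token': 'EXAMPLE-TOKEN',
    'Expiration': '2030-01-01T12:00:00Z',
})
parsed = parse_credentials(sample_body)
```

Because the parsed expiration is a naive datetime in UTC, the value passed as now should also be a naive UTC datetime; comparing against local time could refresh too early or too late.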

Differences from S3Contents

  • There are no extra dependencies over those already required for Jupyter Notebook. Specifically, no virtual filesystem library such as S3FS is used, boto3 is not used, and Tornado serves as the HTTP client.

  • Checkpoints are also saved to S3, under the key <file_name>/.checkpoints/.

  • Multiple checkpoints are saved.

  • The event loop is mostly not blocked during requests to S3. There are some exceptions due to Jupyter Notebook expecting certain requests to block.

  • Uploading arbitrary files, such as JPEGs, and viewing them in Jupyter or downloading them, works.

  • Copying and renaming files don't download or re-upload object data from or to S3. "PUT Object - Copy" is used instead.

  • Folders are created using a 0 byte object with key suffix / (forward slash). A single forward slash suffix is consistent with both the AWS Console and AWS AppStream.
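
The key conventions in the list above can be sketched as plain string operations. The helpers below are hypothetical and only illustrate the layout described (a prefix, a trailing slash for folders, checkpoints under <file_name>/.checkpoints/, and the x-amz-copy-source header used by "PUT Object - Copy"); the exact keys JupyterS3 produces may differ in detail.

```python
from urllib.parse import quote


def object_key(prefix, path):
    """Key for a notebook or file stored under the configured prefix."""
    return prefix + path.lstrip('/')


def folder_key(prefix, path):
    """Key of the 0 byte object that marks a folder: a single trailing slash."""
    return object_key(prefix, path).rstrip('/') + '/'


def checkpoints_prefix(prefix, path):
    """Key prefix under which the checkpoints of a file are stored."""
    return object_key(prefix, path) + '/.checkpoints/'


def copy_source_header(bucket, key):
    """Header naming the source of a "PUT Object - Copy" request.

    The value is the URL-encoded source bucket and key, so S3 copies the
    object server-side and no object data passes through the notebook server.
    """
    return {'x-amz-copy-source': '/' + quote(bucket + '/' + key)}
```

For example, with the prefix 'some-prefix/', the file analysis.ipynb would be stored at some-prefix/analysis.ipynb and its checkpoints under some-prefix/analysis.ipynb/.checkpoints/.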
