Jupyter Notebook Manager for Cortx

# Cortx Jupyter Integration

### Jupyter Notebook Integration for Cortx Object Storage.

Built for [Seagate Cortx Hackathon 2021](https://seagate-cortx-hackathon.devpost.com/)

![logo](https://github.com/sumanthreddym/cortx-jupyter/blob/main/media/cortx_jupyter_header.png)

No more losing precious work because you forgot to save changes, no more worrying about local filesystem crashes, and no more paying exorbitant subscription fees for the premium features of hosted Jupyter notebooks. Cortx Jupyter is here to save you from all of these! Cortx Jupyter is an open source Python package that combines Cortx and Jupyter Notebook so that you can store all of your notebooks, checkpoints and data files on Cortx Object Storage instead of Jupyter’s standard filesystem-backed storage.

When you use a plain Jupyter notebook as your development environment, everything is saved on your local machine. If you want your notebooks to be accessible from anywhere and from any device, Cortx Jupyter Integration is the way to go. All of your notebooks, checkpoints and data files are saved in your Cortx Object Storage, so you can access them on the go.

Cortx Jupyter Integration can be used by developers and organizations who want a central repository of notebooks, checkpoints and files. This helps multiple developers across an organization collaborate with each other. Cortx Jupyter Integration periodically saves updates to your notebook as checkpoints on Cortx Object Storage, so that you can revert to a previous checkpoint or a colleague can continue working on the notebook from where you left off.

You don’t have to worry about having notebooks and data saved in different places. With Cortx Jupyter, you can have them together on CORTX: World’s Only 100% Open Source Mass-Capacity Optimized Object Store. Now, you can concentrate on Machine Learning while Cortx Jupyter does the boring work of saving and tracking your work.

## Features

  • Seamlessly save notebooks, checkpoints and data files to Cortx.

  • Save multiple checkpoints for each notebook to Cortx.

  • Checkpoints are saved to Cortx, under the key <file_name>/.checkpoints/.

  • Restore from any of the previous checkpoints.

  • Already have notebooks on S3? No worries, Cortx Jupyter can help you switch easily from S3 to Cortx open source object storage.

  • Read large amounts of data into your notebook directly from Cortx high-performance object storage for machine learning tasks.

  • Delete notebooks and files that you don’t need from Cortx.

  • Renaming a notebook automatically updates the notebook and checkpoint names on Cortx.

  • Jupyter Notebook is not blocked while requests are made to Cortx, because everything is implemented asynchronously.

  • View, upload and download any type of file stored in Cortx using Jupyter.
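The checkpoint layout mentioned in the feature list can be illustrated with a small sketch. Only the `<file_name>/.checkpoints/` pattern comes from the description above; the helper names `checkpoint_prefix` and `checkpoint_key`, the way the configured prefix is joined in front, and the idea of appending a checkpoint ID are assumptions made for illustration and may differ from what the package actually does.

```python
import posixpath


def checkpoint_prefix(prefix: str, notebook_path: str) -> str:
    """Return the key prefix under which a notebook's checkpoints would live.

    Follows the `<file_name>/.checkpoints/` pattern from the feature list.
    Joining the configured `prefix` in front is an assumption.
    """
    base = posixpath.join(prefix.strip("/"), notebook_path.strip("/"))
    return base + "/.checkpoints/"


def checkpoint_key(prefix: str, notebook_path: str, checkpoint_id: str) -> str:
    """Return a hypothetical full object key for a single checkpoint."""
    return checkpoint_prefix(prefix, notebook_path) + checkpoint_id


print(checkpoint_prefix("notebooks/test/", "analysis.ipynb"))
```

Keeping checkpoints under a key derived from the notebook name means that listing objects with that prefix would return every checkpoint of one notebook.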

## Prerequisites

### Setup Cortx

Use the instructions at the following link to set up CORTX:

https://github.com/Seagate/cortx/blob/main/QUICK_START.md

## Setup Instructions

### 1. Installation

Install the Cortx Jupyter Python package using the following command:

    pip install cortx-jupyter

You can find the package on [pypi.org](https://pypi.org/project/cortx-jupyter/).

### 2. Add Jupyter Config

Configure Jupyter to use Cortx Jupyter as its storage backend by modifying your notebook config file. On a Unix-like system, the Jupyter Notebook config is located at `~/.jupyter/jupyter_notebook_config.py`.

NOTE: If you can’t find this config file on your machine, you can create it by running the following command in a terminal:

    jupyter notebook --generate-config

Now, edit the `~/.jupyter/jupyter_notebook_config.py` file.

NOTE: Remember to replace the credentials (`access_key_id`, `secret_access_key`) and `endpoint_url` with the values for your Cortx environment.

    import cortx_jupyter
    from cortx_jupyter import CortxJupyter, CortxAuthenticator

    c = get_config()

    c.NotebookApp.contents_manager_class = CortxJupyter
    c.CortxJupyter.authentication_class = CortxAuthenticator

    c.CortxAuthenticator.access_key_id = "YOUR_ACCESS_KEY_ID"
    c.CortxAuthenticator.secret_access_key = "YOUR_SECRET_ACCESS_KEY"
    c.CortxJupyter.endpoint_url = "http://uvo1ettj69aisne19p9.vm.cld.sr"
    c.CortxJupyter.bucket_name = "testbucket"
    c.CortxJupyter.prefix = "notebooks/test/"
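Because the config file is ordinary Python, the credentials do not have to be hard-coded in it. The following is a minimal sketch that reads them from environment variables instead. The helper `load_cortx_credentials` and the variable names `CORTX_ACCESS_KEY_ID` and `CORTX_SECRET_ACCESS_KEY` are hypothetical; they are not defined by Cortx Jupyter.

```python
import os


def load_cortx_credentials(environ=os.environ):
    """Read Cortx credentials from environment variables.

    The variable names are hypothetical; pick whatever fits your deployment.
    A missing variable raises KeyError naming the variable.
    """
    return {
        "access_key_id": environ["CORTX_ACCESS_KEY_ID"],
        "secret_access_key": environ["CORTX_SECRET_ACCESS_KEY"],
    }


# In ~/.jupyter/jupyter_notebook_config.py one could then write:
#   creds = load_cortx_credentials()
#   c.CortxAuthenticator.access_key_id = creds["access_key_id"]
#   c.CortxAuthenticator.secret_access_key = creds["secret_access_key"]
```

Failing with a `KeyError` at startup makes a missing credential visible immediately rather than on the first save.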

The following configuration options are available on CortxAuthenticator:

  • `access_key_id` *(required)*

  • `secret_access_key` *(required)*

You can get these credentials

The following configuration options are available on CortxJupyter:

`endpoint_url` *(required)* - Endpoint URL of your Cortx instance. Example: `http://uvo1ettj69aisne19p9.vm.cld.sr`

`bucket_name` *(required)* - Cortx bucket name where you want to store your notebooks. Example: `testbucket`

`prefix` *(required)* - Path in the bucket where you want to store your notebooks. Example: `notebooks/test/`
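Before starting Jupyter, it can help to check that the endpoint, bucket and credentials work together. The sketch below uses boto3 against the S3-compatible API and is not part of Cortx Jupyter. The helper names are hypothetical, and the `region_name` value is a placeholder assumption that your Cortx setup may not need.

```python
def s3_client_kwargs(endpoint_url, access_key_id, secret_access_key,
                     region_name="us-east-1"):
    """Build keyword arguments for boto3.client("s3", ...).

    region_name is a placeholder assumption; adjust it for your environment.
    """
    return {
        "endpoint_url": endpoint_url,
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
        "region_name": region_name,
    }


def bucket_is_reachable(bucket_name, **client_kwargs):
    """Return True if a HEAD request on the bucket succeeds."""
    # Imported lazily so that s3_client_kwargs works without boto3 installed.
    import boto3
    from botocore.exceptions import BotoCoreError, ClientError

    client = boto3.client("s3", **client_kwargs)
    try:
        client.head_bucket(Bucket=bucket_name)
    except (BotoCoreError, ClientError):
        # Covers connection failures, missing buckets and rejected credentials.
        return False
    return True
```

A call such as `bucket_is_reachable("testbucket", **s3_client_kwargs(endpoint_url, key_id, secret))` would then report whether the values in your config are usable.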

## Architecture

![architecture](https://github.com/sumanthreddym/cortx-jupyter/blob/main/media/cortx_jupyer_architecture.png)

## How we built it?

  • Cortx

  • S3 API

  • Python

  • Python Package Index

  • Jupyter

  • boto3

  • tornado
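The feature list states that requests to Cortx do not block Jupyter Notebook. One common way to keep an event loop responsive while using a blocking client such as boto3 is to run each call in a thread pool. The sketch below shows that general pattern with the standard library only. It is not taken from the Cortx Jupyter source, and `fetch_object` is a stand-in for a real storage request.

```python
import asyncio
import time


def fetch_object(key):
    """Stand-in for a blocking storage call (for example a boto3 request)."""
    time.sleep(0.1)  # simulate network latency
    return f"contents of {key}"


async def fetch_object_async(key):
    """Run the blocking call in the default thread pool executor."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, fetch_object, key)


async def main():
    # Both requests wait concurrently instead of one after the other.
    return await asyncio.gather(
        fetch_object_async("notebooks/test/a.ipynb"),
        fetch_object_async("notebooks/test/b.ipynb"),
    )


if __name__ == "__main__":
    print(asyncio.run(main()))
```

While the worker threads wait on the network, the event loop stays free to serve other requests.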

## Demo Video

Watch the video to learn more about the project.

## Contributors:

[Sumanth Reddy Muni](https://www.linkedin.com/in/sumanthmuni/) [Priyadarshini Murugan](https://www.linkedin.com/in/priya-murugan/)
