aws-jupyter

Launch Jupyter notebook on AWS

Project description

This script launches a cluster on AWS EC2 instances and starts a Jupyter notebook on them.

Please read the manual for the details of all supported commands. QuickStartGuide.md offers a guided example to show what aws-jupyter can do.

aws-jupyter is intended as a command-line tool, but you can also integrate it into a custom Python script. module-doc.md provides a brief how-to.

Install

Please ensure you have Python 3 installed. aws-jupyter can be installed using pip:

pip install aws-jupyter

After the installation, you can try it out using the example launch script:

launch-aws-jupyter

It will create a cluster with 2 spot instances.

In addition, the EC2 instances are created from an AMI located in the us-west-2 region, so please make sure that your local environment is set up to use that region (you can run aws-jupyter config to verify the setting).

After installation, please run aws-jupyter config to make sure the configuration is properly set.

Upgrade

If the default AWS region changes in a new release, please upgrade aws-jupyter in 3 steps:

1. Upgrade the package: pip install --upgrade aws-jupyter.
2. Create a new key pair in the new AWS region, and modify the "key_name" and "ssh_key" fields of the credential file accordingly (see the AWS Credential section below).
3. Switch to the default AMI and region: aws-jupyter config --default-ami --default-region, and set the credential file to the location of the new one when prompted.

AWS Credential

The scripts in this repository require a credentials.yml file in the following format:

Arbitrary Name:
  access_key_id: your_aws_access_key_id
  secret_access_key: your_aws_secret_access_key
  key_name: your_ec2_key_pair_name
  ssh_key: /path/to/the/ec2/key/pair/file

The credential file in the Spark Notebook project can be directly used here.

The credential file (or a soft link to it) should be located in the folder from which you invoke these scripts (i.e. you should be able to see it by running ls in that folder). The credential file must always stay private and must never be shared. Remember to add credentials.yml to the .gitignore file of your project so that it is not pushed to GitHub.
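The .gitignore reminder above can be automated. The following is a minimal sketch, not part of aws-jupyter: the helper name `ensure_ignored` is hypothetical, and it assumes the credential file is named credentials.yml and sits in the project root.

```python
from pathlib import Path


def ensure_ignored(project_dir, filename="credentials.yml"):
    """Add `filename` to project_dir/.gitignore unless it is already listed.

    Hypothetical helper for illustration. Returns True if a line was added,
    False if the entry was already present.
    """
    gitignore = Path(project_dir) / ".gitignore"
    text = gitignore.read_text() if gitignore.exists() else ""
    if filename in (line.strip() for line in text.splitlines()):
        return False
    # Start on a fresh line if the existing file lacks a trailing newline.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with gitignore.open("a") as fh:
        fh.write(prefix + filename + "\n")
    return True
```

Calling `ensure_ignored(".")` from the project root is safe to repeat, since the entry is only written once.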

Usage

Running any script in this directory with the -h argument prints the help message of that script.

Create a new cluster

aws-jupyter create creates a cluster of m3.xlarge instances using an AMI based on Ubuntu.

If an instance comes with an attached SSD, the SSD will be mounted at /mnt.

Example:

aws-jupyter create -c 2 --name testing

Check if a cluster is ready

aws-jupyter check checks whether a cluster is up and running. It also creates a neighbors.txt file that contains the IP addresses of all the instances in the cluster.

Example

aws-jupyter check --name testing
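If you want to use the addresses in neighbors.txt from your own Python script, something like the sketch below may help. It assumes the file holds one IP address per line, which the description above does not confirm, and `read_neighbors` is a hypothetical helper name.

```python
import ipaddress
from pathlib import Path


def read_neighbors(path="neighbors.txt"):
    """Return the IP addresses listed in `path`, one per non-empty line.

    Assumes one address per line (an assumption about the file layout).
    Raises ValueError if a line is not a valid IPv4/IPv6 address.
    """
    addresses = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        addresses.append(str(ipaddress.ip_address(line)))
    return addresses
```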

Terminate a cluster

aws-jupyter terminate terminates a cluster by stopping and terminating all instances in this cluster.

Example

aws-jupyter terminate --name testing

Run a script on a cluster

aws-jupyter run runs a given script on all instances in the cluster. It starts the script in the background and redirects its stdout/stderr into a file on each instance, which can be checked later. Thus, the fact that aws-jupyter run has returned does not necessarily mean that the script has finished executing on the cluster. In addition, it only launches the script on all instances; it does not check whether the script executes without error.

Example

aws-jupyter run --script ./script-examples/hello-world.sh
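To make the "launched is not the same as finished" point concrete, here is a local analogy in Python. It is only a sketch of the general pattern (start a process without waiting, send stdout/stderr to a log file); it is not how aws-jupyter is implemented, and `launch_in_background` is a hypothetical name.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def launch_in_background(command, log_path):
    """Start `command` without waiting for it; stdout and stderr go to `log_path`.

    Returns the Popen handle immediately. Returning says nothing about
    whether the command has finished or succeeded.
    """
    with open(log_path, "wb") as log:
        # The child process inherits its own copy of the file descriptor,
        # so closing `log` here does not interrupt its output.
        return subprocess.Popen(command, stdout=log, stderr=subprocess.STDOUT)
```

Only an explicit `proc.wait()` (or, on the cluster, inspecting the log file on each instance) tells you whether the work completed and with which exit code.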

Send a local directory to the remote instances

aws-jupyter send-dir sends a local directory to all instances of a cluster. It can be used, for example, to distribute configuration files to all instances.

Example

aws-jupyter send-dir --local ./configs/ --remote ~/remote-configs

Retrieve files from all instances in a cluster

Retrieve files from the same location on all instances of a cluster. It can be used to collect the output of the program from the workers. A local directory for saving the downloaded files should be provided to this script. This script will create a separate sub-directory for each worker and download its files to this sub-directory.

Example

mkdir _result
aws-jupyter retrieve --remote /tmp/std* --local ./_result/
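After retrieval, the per-worker sub-directories can be walked from Python. The sketch below assumes each worker's sub-directory sits directly under the local directory, as described above; it makes no assumption about how the sub-directories are named, and `collect_worker_files` is a hypothetical helper.

```python
from pathlib import Path


def collect_worker_files(result_dir):
    """Map each worker sub-directory name to the sorted files found under it.

    File paths are returned relative to the worker's sub-directory.
    """
    collected = {}
    for worker_dir in sorted(Path(result_dir).iterdir()):
        if not worker_dir.is_dir():
            continue  # ignore stray files at the top level
        collected[worker_dir.name] = sorted(
            str(p.relative_to(worker_dir))
            for p in worker_dir.rglob("*")
            if p.is_file()
        )
    return collected
```

For the example above, `collect_worker_files("./_result")` gives a quick overview of which workers returned which files.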

Install a package on all instances

You can install any missing packages after the instances are ready (i.e. aws-jupyter check shows the Jupyter notebook URL) by following these steps:

Create a script on your local computer that installs the required package. For example, to install pandas:

$ echo "pip install pandas" > install-pandas.sh

Run the script on all instances:

aws-jupyter run -s install-pandas.sh --output

The --output argument ensures that the script runs in the foreground, so that you can check whether the installation succeeded.
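When several packages are needed, the install script can be generated instead of typed by hand. This is a sketch under assumptions: the helper name `write_install_script` and the default file name are hypothetical, and it assumes the script is executed by a shell that honors `set -e` (plausible on the Ubuntu-based AMI, but not confirmed here).

```python
import shlex
from pathlib import Path


def write_install_script(packages, path="install-packages.sh"):
    """Write a shell script that pip-installs each package in `packages`.

    `set -e` makes the script stop at the first failing command, so an
    installation error is not hidden by later successful commands.
    """
    lines = ["set -e"]
    # Quote each name so unusual characters cannot alter the command.
    lines += ["pip install " + shlex.quote(name) for name in packages]
    script = Path(path)
    script.write_text("\n".join(lines) + "\n")
    return script
```

The resulting file can then be passed to aws-jupyter run -s together with --output, as in the pandas example above.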

Release history

Version	Release date
0.1.24	May 12, 2020 (this version)
0.1.23	Apr 15, 2020
0.1.22	Apr 15, 2020
0.1.21	Apr 14, 2020
0.1.20	Apr 7, 2020
0.1.19	Apr 7, 2020
0.1.18	Mar 23, 2020
0.1.17	Mar 23, 2020
0.1.16	Mar 17, 2020
0.1.15	Mar 16, 2020
0.1.12	Mar 12, 2020
0.1.11	Mar 12, 2020
0.1.10	Mar 12, 2020
0.1.9	Mar 11, 2020
0.1.8	Mar 9, 2020
0.1.7	Mar 9, 2020
0.1.6	Mar 9, 2020
0.1.5	Mar 9, 2020
0.1.4	Mar 9, 2020
0.1.3	Mar 9, 2020
0.1.2	Mar 9, 2020
0.1.1	Mar 9, 2020
0.1	Mar 9, 2020

Download files

Source Distribution

aws-jupyter-0.1.24.tar.gz (20.3 kB)

Uploaded May 12, 2020 Source

Built Distribution

aws_jupyter-0.1.24-py3-none-any.whl (24.1 kB)

Uploaded May 12, 2020 Python 3

Hashes for aws-jupyter-0.1.24.tar.gz
Algorithm	Hash digest
SHA256	`9586755edcddaca6ee0eecd92a98dc8af76603a1e4664632550b6bc8ef62759f`
MD5	`e2aab2d9c646b29cd4de0667591ef03f`
BLAKE2b-256	`7c6d26d2be9ad88fd40622942623e3e642246df232e1e1922176c8389b792121`

Hashes for aws_jupyter-0.1.24-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5762ee7bc5240e87b3cfe5c823b3eac26c33924f169cd30fb101ab187e3ce632`
MD5	`cc037eb2de83aedfa0c7b904bf1bd914`
BLAKE2b-256	`68c64b17b72c972583f6cc03e9382be33040eea0f2bb418b12051f49acbfd273`