SwarmSpawner enables JupyterHub to spawn jupyter notebooks across a Docker Swarm cluster

See the Docker Services documentation for more information.

Prerequisites

Python version 3.3 or above is required.

Installation

pip install jhub-swarmspawner

Installation from GitHub

git clone https://github.com/rasmunk/SwarmSpawner
cd SwarmSpawner
python setup.py install

Configuration

You can find an example jupyter_config.py inside examples.

The spawner

Docker Engine in Swarm mode and the related services work in a different way compared to Docker containers.

Tell JupyterHub to use SwarmSpawner by adding the following lines to your jupyterhub_config.py:

c.JupyterHub.spawner_class = 'jhub.SwarmSpawner'
c.JupyterHub.hub_ip = '0.0.0.0'
# This should be the name of the jupyterhub service
c.SwarmSpawner.jupyterhub_service_name = 'NameOfTheService'

What is jupyterhub_service_name?

Inside a Docker engine in Swarm mode, the services use a name instead of an IP to communicate with each other. 'jupyterhub_service_name' is the name of the service for the JupyterHub.

Networks

It's important to put the JupyterHub service (and the proxy) and the services that are running jupyter notebook inside the same network, otherwise they cannot reach each other. SwarmSpawner uses the service's name instead of the service's IP, so JupyterHub and the servers must share the same overlay network (a network that spans nodes).

#list of networks
c.SwarmSpawner.networks = ["mynetwork"]

Define the services inside jupyterhub_config.py

You can define container_spec, resource_spec and networks inside jupyterhub_config.py.

Container_spec

The command and args definitions depend on the image that you are using, i.e. the command must be executable in the selected image. '/usr/local/bin/start-singleuser.sh' is provided by the jupyter base-notebook image. The start-singleuser.sh args assume that the launched image extends a version of it.

c.SwarmSpawner.container_spec = {
    # The command to run inside the service
    'args': ['/usr/local/bin/start-singleuser.sh']
}

Note: in a container spec, args sets the equivalent of CMD in the Dockerfile, command sets the equivalent of ENTRYPOINT. The notebook server command should not be the ENTRYPOINT, so generally use args, not command, to specify how to launch the notebook server.
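The difference between args and command can be illustrated with a small, simplified model of how a container's start command is assembled. This is only a sketch: effective_argv is a hypothetical helper, not part of SwarmSpawner or Docker, and the ENTRYPOINT and CMD values are illustrative assumptions rather than the actual values of any particular image.

```python
def effective_argv(image_entrypoint, image_cmd, command=None, args=None):
    """Return the argv a container would start with (simplified model)."""
    if command:
        # An explicit command replaces ENTRYPOINT; the image CMD is not inherited.
        entrypoint = list(command)
        cmd = list(args or [])
    else:
        entrypoint = list(image_entrypoint)
        # args replaces CMD; without args the image's own CMD is used.
        cmd = list(args) if args else list(image_cmd)
    return entrypoint + cmd


# Illustrative values only; check the ENTRYPOINT and CMD of your own image.
image_entrypoint = ["tini", "-g", "--"]
image_cmd = ["start-notebook.sh"]

with_args = effective_argv(
    image_entrypoint, image_cmd,
    args=["/usr/local/bin/start-singleuser.sh"])
with_command = effective_argv(
    image_entrypoint, image_cmd,
    command=["/usr/local/bin/start-singleuser.sh"])

print(with_args)     # the image ENTRYPOINT is kept in front of the script
print(with_command)  # the image ENTRYPOINT is dropped entirely
```

In this model, using args keeps whatever init wrapper the image defines as its ENTRYPOINT, which is why args is generally the safer choice.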

See this issue for more info.

Placement

The spawner supports Docker Swarm service placement configurations for the spawned services. This includes the option to specify constraints and preferences, which can be imposed as a placement policy on all services being spawned. E.g.

c.SwarmSpawner.placement = {
    'constraints': ['node.hostname==worker1'],
    'preferences': ['spread=node.labels.datacenter']
}
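Constraint expressions such as 'node.hostname==worker1' consist of a node attribute, an operator (== or !=) and a value. A minimal sketch for checking such strings before putting them into the configuration could look as follows; parse_constraint is a hypothetical helper and is not provided by SwarmSpawner.

```python
import re

# attribute, operator (== or !=), value
CONSTRAINT_RE = re.compile(r'^\s*([\w.\-]+)\s*(==|!=)\s*(\S+)\s*$')


def parse_constraint(expr):
    """Split a constraint string into (attribute, operator, value)."""
    match = CONSTRAINT_RE.match(expr)
    if match is None:
        raise ValueError('not a valid constraint: {!r}'.format(expr))
    return match.groups()


placement = {
    'constraints': ['node.hostname==worker1', 'node.role != manager'],
}

parsed = [parse_constraint(c) for c in placement['constraints']]
print(parsed)
```

A malformed expression such as 'node.hostname=worker1' raises a ValueError here instead of failing later when the service is created.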

Dockerimages

To define which images are available to the users, a list of dockerimages must be declared. The individual dictionaries also make it possible to define whether the image should mount any volumes when it is spawned.

# Available docker images the user can spawn
c.SwarmSpawner.dockerimages = [
    {'image': 'jupyter/base-notebook:30f16d52126f',
     'name': 'Minimal python notebook'},
    {'image': 'jupyter/base-notebook:latest',
     'name': 'Image with automatic {replace_me} mount, supports Py2/3 and R,',
     'mounts': mounts}
]

It is also possible to specify individual placement policies for each image. E.g.

# Available docker images the user can spawn
c.SwarmSpawner.dockerimages = [
    {'image': 'jupyter/base-notebook:30f16d52126f',
     'name': 'Minimal python notebook',
     'placement': {'constraints': ['node.hostname==worker1']}},
]

Beyond the placement policy, it is also possible to specify a 'whitelist' of users who have permission to start a specific image via the 'access' key, so that only the listed usernames are able to spawn that particular image.

# Available docker images the user can spawn
c.SwarmSpawner.dockerimages = [
    {'image': 'jupyter/base-notebook:30f16d52126f',
     'name': 'Minimal python notebook',
     'access': ['admin']},
]
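The effect of the 'access' key can be sketched with a small filter function. This is not the spawner's actual implementation: images_for_user is a hypothetical helper, and it assumes that an image without an 'access' key is available to every user.

```python
def images_for_user(dockerimages, username):
    """Return the images a user may spawn (hypothetical helper)."""
    allowed = []
    for image in dockerimages:
        access = image.get('access')
        # Assumption: an image without an 'access' key is open to every user.
        if access is None or username in access:
            allowed.append(image)
    return allowed


dockerimages = [
    {'image': 'jupyter/base-notebook:latest',
     'name': 'Open to everyone'},
    {'image': 'jupyter/base-notebook:30f16d52126f',
     'name': 'Minimal python notebook',
     'access': ['admin']},
]

admin_names = [i['name'] for i in images_for_user(dockerimages, 'admin')]
alice_names = [i['name'] for i in images_for_user(dockerimages, 'alice')]
print(admin_names)
print(alice_names)
```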

To let the user select between multiple available images, the following must be set. Otherwise, the user will simply spawn an instance of the default image, i.e. dockerimages[0].

# Before the user can select which image to spawn,
# user_options has to be enabled
c.SwarmSpawner.use_user_options = True

This enables an image selection form at the user's /hub/home URL path when a notebook hasn't been spawned already.

Bind a Host dir

With 'type':'bind' you mount a local directory of the host inside the container.

Remember that the source must exist on the node where the service is created.

import os
notebook_dir = os.environ.get('NOTEBOOK_DIR') or '/home/jovyan/work'
c.SwarmSpawner.notebook_dir = notebook_dir
mounts = [{'type' : 'bind',
        'source' : 'MountPointOnTheHost',
        'target' : 'MountPointInsideTheContainer',}]

Volumes

With 'type':'volume' you mount a Docker Volume inside the container. If the volume doesn’t exist it will be created.

mounts = [{'type' : 'volume',
        'source' : 'NameOfTheVolume',
        'target' : 'MountPointInsideTheContainer',}]

Named path

For both types, volume and bind, you can specify a {username} inside the source:

mounts = [{'type' : 'volume',
        'source' : 'jupyterhub-user-{username}',
        'target' : 'MountPointInsideTheContainer',}]

{username} will be replaced by the hashed version of the username.
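The substitution can be sketched as follows. The hash function is an assumption made for illustration (MD5 is used here only because it yields a short hex digest); render_mount is a hypothetical helper, and the spawner's own hashing may differ.

```python
import hashlib


def render_mount(mount, username):
    """Return a copy of mount with {username} filled in (hypothetical helper)."""
    # Assumption: the username is hashed with MD5; the real spawner may differ.
    hashed = hashlib.md5(username.encode('utf-8')).hexdigest()
    rendered = dict(mount)
    rendered['source'] = mount['source'].format(username=hashed)
    return rendered


mount = {'type': 'volume',
         'source': 'jupyterhub-user-{username}',
         'target': 'MountPointInsideTheContainer'}

rendered = render_mount(mount, 'alice')
print(rendered['source'])
```

Hashing keeps the volume name valid even when usernames contain characters that Docker does not accept in volume names.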

Mount an anonymous volume

This kind of volume will be removed with the service.

mounts = [{'type' : 'volume',
        'source': '',
        'target' : 'MountPointInsideTheContainer',}]

SSHFS mount

It is also possible to mount a volume that is an sshfs mount to another host. The mounter supports passing either {id_rsa} or {password} for authentication. In addition, the typical sshfs flags are supported, and the port defaults to 22.

from jhub.mount import SSHFSMounter

mounts = [SSHFSMounter({
            'type': 'volume',
            'driver_config': 'rasmunk/sshfs:latest',
            'driver_options': {'sshcmd': '{sshcmd}', 'id_rsa': '{id_rsa}',
                               'one_time': 'True',
                               'big_writes': '', 'allow_other': '',
                               'reconnect': '', 'port': '2222'},
            'source': 'sshvolume-user-{username}',
            'target': '/home/jovyan/work'})]

Automatic removal of Volumes

To have a volume removed when the service is terminated, there are two options: either use an anonymous volume as shown above, which removes the volume when the owning service is removed, or set the default volume label bool flag called keep to false, e.g.

mounts = [{'type' : 'volume',
        'source' : 'jupyterhub-user-{username}',
        'target' : 'MountPointInsideTheContainer',
        'label': {'keep': 'False'}}]

Resource_spec

You can also specify resource limits and reservations for each service:

c.SwarmSpawner.resource_spec = {
                'cpu_limit' : 1000, # (int) – CPU limit in units of 10^9 CPU shares.
                'mem_limit' : int(512 * 1e6), # (int) – Memory limit in Bytes.
                'cpu_reservation' : 1000, # (int) – CPU reservation in units of 10^9 CPU shares.
                'mem_reservation' : int(512 * 1e6), # (int) – Memory reservation in Bytes
                }
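Because the limits are plain integers, small conversion helpers can make the intended amounts easier to read. The sketch below assumes, as in the user_options example in this document, that one full CPU corresponds to int(1 * 1e9) and that memory is given in bytes; the helper names are hypothetical.

```python
def cpus_to_units(cpus):
    """Convert a number of CPUs into units of 10^9 CPU shares."""
    return int(cpus * 1e9)


def megabytes_to_bytes(megabytes):
    """Convert megabytes (10^6 bytes) into bytes."""
    return int(megabytes * 1e6)


resource_spec = {
    'cpu_limit': cpus_to_units(1),            # at most one full CPU
    'mem_limit': megabytes_to_bytes(512),     # at most 512 MB
    'cpu_reservation': cpus_to_units(0.5),    # reserve half a CPU
    'mem_reservation': megabytes_to_bytes(256),
}
print(resource_spec)
```

The resulting dict has the same keys as c.SwarmSpawner.resource_spec and could be assigned to it inside jupyterhub_config.py.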

Using user_options

There is the possibility to set parameters using user_options

# To use user_options in service creation
c.SwarmSpawner.use_user_options = False

There are two ways to control how the services are created: jupyterhub_config.py or user_options.

Remember that at the end you are just using the Docker Engine API.

user_options, if used, will override the settings from jupyterhub_config.py for services.

If you set c.SwarmSpawner.use_user_options = True, the spawner will use the dict passed through the form or as a JSON body when using the Hub API.

The spawner expects a dict with these keys:

user_options = {
        'container_spec' : {
                # (string or list) command to run in the image.
                'args' : ['/usr/local/bin/start-singleuser.sh'],
                # name of the image
                'Image' : '',
                'mounts' : mounts,
                'resource_spec' : {
                        # (int) – CPU limit in units of 10^9 CPU shares.
                        'cpu_limit': int(1 * 1e9),
                        # (int) – Memory limit in Bytes.
                        'mem_limit': int(512 * 1e6),
                        # (int) – CPU reservation in units of 10^9 CPU shares.
                        'cpu_reservation': int(1 * 1e9),
                        # (int) – Memory reservation in bytes
                        'mem_reservation': int(512 * 1e6),
                        },
                # dict of constraints
                'placement' : {'constraints': []},
                # list of networks
                'network' : [],
                # name of service
                'name' : ''
                }
        }
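When the dict is sent through the Hub API, it has to be serialized as the JSON body of the request that starts the user's server (POST /hub/api/users/{name}/server in JupyterHub's REST API). The sketch below only builds the body and sends nothing; the image name and network name are placeholder assumptions.

```python
import json

# Placeholder values; adapt the image and network to your deployment.
user_options = {
    'container_spec': {
        'args': ['/usr/local/bin/start-singleuser.sh'],
        'Image': 'jupyter/base-notebook:latest',
        'resource_spec': {
            'cpu_limit': int(1 * 1e9),
            'mem_limit': int(512 * 1e6),
        },
        'placement': {'constraints': []},
        'network': ['mynetwork'],
        'name': '',
    }
}

# JSON text to send as the request body, authenticated with a Hub API token.
body = json.dumps(user_options)
print(body)
```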

Names of the Jupyter notebook service inside Docker engine in Swarm mode

When JupyterHub spawns a new Jupyter notebook server the name of the service will be {service_prefix}-{service_owner}-{service_suffix}

You can change the service_prefix in this way:

Prefix of the service in Docker

c.SwarmSpawner.service_prefix = "jupyterhub"

service_owner is the hexdigest() of the hashed user.name.

In case of named servers (more than one server per user), service_suffix is the name of the server; otherwise it is always 1.
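The naming scheme can be sketched as follows. The hash function is an assumption for illustration (MD5), and service_name is a hypothetical helper rather than the spawner's own code.

```python
import hashlib


def service_name(username, server_name='', prefix='jupyterhub'):
    """Build {service_prefix}-{service_owner}-{service_suffix} (sketch)."""
    # Assumption: service_owner is the MD5 hexdigest of the username.
    owner = hashlib.md5(username.encode('utf-8')).hexdigest()
    # Named servers use their own name as suffix, the default server uses 1.
    suffix = server_name if server_name else '1'
    return '{}-{}-{}'.format(prefix, owner, suffix)


default_service = service_name('alice')
named_service = service_name('alice', server_name='gpu')
print(default_service)
print(named_service)
```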

Downloading images

Docker Engine in Swarm mode downloads images automatically from the repository. The image must be available either on the remote repository or locally; if it is not, you will get an error.

Because the image download has to complete before the service can start, it is better to set a longer timeout (the default is 30 seconds):

c.SwarmSpawner.start_timeout = 60 * 5

You can use all the docker images inside the Jupyter docker-stacks.

Credit

DockerSpawner, CassinyioSpawner

License

All code is licensed under the terms of the revised BSD license.
