Distributed machine learning made simple.

Project description

lazycluster


lazycluster is a Python library intended to liberate data scientists and machine learning engineers by abstracting away cluster management and configuration so that they can focus on their actual tasks. In particular, it emphasizes easy and convenient cluster setup with Python for various distributed machine learning frameworks.

Highlights

Getting Started

Installation

pip install lazycluster

Usage Example

Prerequisite: Passwordless SSH needs to be set up for the hosts used.
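One way to verify this prerequisite before using the library is to run a no-op command over ssh with prompting disabled. The following is a minimal sketch using only the Python standard library and the standard OpenSSH options `BatchMode` and `ConnectTimeout`; the helper names `build_ssh_check_command` and `has_passwordless_ssh` are hypothetical and not part of lazycluster.

```python
import subprocess
from typing import List


def build_ssh_check_command(host: str, timeout_s: int = 5) -> List[str]:
    """Build an ssh command that fails instead of prompting for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                # never prompt for a password or passphrase
        "-o", f"ConnectTimeout={timeout_s}",  # give up quickly on unreachable hosts
        host,
        "true",                               # no-op command on the remote side
    ]


def has_passwordless_ssh(host: str, timeout_s: int = 5) -> bool:
    """Return True if `ssh <host> true` succeeds without interactive input."""
    try:
        result = subprocess.run(
            build_ssh_check_command(host, timeout_s),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s + 5,
        )
    except (OSError, subprocess.TimeoutExpired):
        # ssh is not installed, or the connection hung past the timeout.
        return False
    return result.returncode == 0
```

Calling `has_passwordless_ssh('host-1')` would then indicate whether key-based login is likely to work. Note that in batch mode the check can also fail for other reasons, for example an unknown host key.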

from lazycluster import RuntimeTask, Runtime

# Define a Python function which will be executed remotely
def hello(name:str):
    return 'Hello ' + name + '!'

# Compose a `RuntimeTask`
task = RuntimeTask('my-first_task').run_command('echo Hello World!') \
                                   .run_function(hello, name='World')

# Actually execute it remotely in a `Runtime`
task = Runtime('host-1').execute_task(task, execute_async=False)

# The stdout from the executing `Runtime` can be accessed via the execution log of the `RuntimeTask`
task.print_log()

# Print the return of the `hello()` call
generator = task.function_returns
print(next(generator))
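The example above chains steps onto a task and then reads return values from a generator. To illustrate that pattern in isolation, here is a small local sketch: `MiniTask` is a hypothetical stand-in written for this illustration, it runs everything in the current process, and it is not how lazycluster implements `RuntimeTask`.

```python
from typing import Any, Callable, Dict, Iterator, List, Tuple


class MiniTask:
    """Hypothetical local stand-in for a chained task; not lazycluster code."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._steps: List[Tuple[Callable[..., Any], Dict[str, Any]]] = []
        self._returns: List[Any] = []

    def run_function(self, function: Callable[..., Any], **kwargs: Any) -> "MiniTask":
        # Only record the step; returning self is what allows chaining.
        self._steps.append((function, kwargs))
        return self

    def execute(self) -> "MiniTask":
        # Run the recorded steps in the order they were added.
        self._returns = [function(**kwargs) for function, kwargs in self._steps]
        return self

    @property
    def function_returns(self) -> Iterator[Any]:
        # Yield one return value per recorded function, in order.
        yield from self._returns


def hello(name: str) -> str:
    return 'Hello ' + name + '!'


task = MiniTask('demo').run_function(hello, name='World') \
                       .run_function(hello, name='Cluster')
task.execute()

# Each recorded function contributes one item to the generator.
greetings = list(task.function_returns)
```

In this sketch, a task with several `run_function` steps produces one return value per step, which is why a generator is a natural fit for reading the results.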

Support

The lazycluster project is maintained by Jan Kalkan. Please understand that we won't be able to provide individual support via email. We also believe that help is much more valuable if it's shared publicly so that more people can benefit from it.

Type Channel
🚨 Bug Reports
🎁 Feature Requests
👩‍💻 Usage Questions
🗯 General Discussion

Features

Create Runtimes & RuntimeGroups

from lazycluster import Runtime, RuntimeGroup

rt_1 = Runtime('host-1')
rt_2 = Runtime('host-2', root_dir='/workspace')

runtime_group = RuntimeGroup([rt_1, rt_2])
runtime_group = RuntimeGroup(hosts=['host-1', 'host-2'])

Use RuntimeManager to create a RuntimeGroup based on the local ssh config

from lazycluster import RuntimeManager, RuntimeGroup

runtime_group = RuntimeManager().create_group()
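To get a feel for what "based on the local ssh config" can mean, the sketch below lists the concrete `Host` aliases found in an ssh config file using only the standard library. It is an illustration under assumptions: the function names are hypothetical, and how `RuntimeManager` actually discovers and validates hosts is not shown in this document.

```python
from pathlib import Path
from typing import List, Optional


def parse_ssh_hosts(config_text: str) -> List[str]:
    """Return concrete Host aliases from ssh config text, skipping patterns."""
    hosts: List[str] = []
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith('#'):
            continue
        # ssh config allows both `Host foo` and `Host=foo`.
        parts = line.replace('=', ' ', 1).split()
        if not parts or parts[0].lower() != 'host':
            continue
        for alias in parts[1:]:
            # Skip wildcard and negated patterns such as `*` or `!bastion`.
            if any(ch in alias for ch in '*?!'):
                continue
            if alias not in hosts:
                hosts.append(alias)
    return hosts


def read_local_ssh_hosts(path: Optional[Path] = None) -> List[str]:
    """Read an ssh config file, returning [] if the file does not exist."""
    config_path = path if path is not None else Path.home() / '.ssh' / 'config'
    if not config_path.is_file():
        return []
    return parse_ssh_hosts(config_path.read_text())
```

Running `read_local_ssh_hosts()` before `RuntimeManager().create_group()` can help confirm which host aliases are defined locally.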

Easily launch a Dask cluster

from lazycluster import RuntimeManager
from lazycluster.cluster.dask_cluster import DaskCluster

cluster = DaskCluster(RuntimeManager().create_group())
cluster.start()

Expose a service from or to a Runtime

from lazycluster import Runtime

# Create a Runtime
runtime = Runtime('host-1')

# Make the port 50000 from the Runtime accessible on localhost
runtime.expose_port_from_runtime(50000)

# Make the local port 40000 accessible on the Runtime
runtime.expose_port_to_runtime(40000)
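The two directions of exposure correspond conceptually to SSH local and remote port forwarding. The sketch below builds the equivalent plain `ssh` commands with the standard `-N`, `-L`, and `-R` options. That lazycluster sets up its tunnels in exactly this way is an assumption, and the helper names are hypothetical.

```python
from typing import List, Optional


def from_runtime_command(host: str, runtime_port: int,
                         local_port: Optional[int] = None) -> List[str]:
    """Local forwarding: a port on the remote host becomes reachable on localhost."""
    local_port = runtime_port if local_port is None else local_port
    # -N: open the tunnel only, do not run a remote command.
    # -L local_port:localhost:runtime_port: listen locally, forward to the remote host.
    return ['ssh', '-N', '-L', f'{local_port}:localhost:{runtime_port}', host]


def to_runtime_command(host: str, local_port: int,
                       runtime_port: Optional[int] = None) -> List[str]:
    """Remote forwarding: a local port becomes reachable on the remote host."""
    runtime_port = local_port if runtime_port is None else runtime_port
    # -R runtime_port:localhost:local_port: listen remotely, forward back to this machine.
    return ['ssh', '-N', '-R', f'{runtime_port}:localhost:{local_port}', host]
```

For instance, `' '.join(from_runtime_command('host-1', 50000))` gives a command that can be run manually to compare its behavior with `expose_port_from_runtime(50000)`.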

Expose a service to a whole RuntimeGroup or from one contained Runtime in the RuntimeGroup

from lazycluster import RuntimeGroup

# Create a RuntimeGroup
runtime_group = RuntimeGroup(hosts=['host-1', 'host-2', 'host-3'])

# Make the local port 50000 accessible on all Runtimes contained in the RuntimeGroup
runtime_group.expose_port_to_runtimes(50000)


# Make the port 40000 which is running on host-1 accessible on all other Runtimes in the RuntimeGroup
runtime_group.expose_port_from_runtime_to_group('host-1', 40000)

Contribution


Licensed Apache 2.0. Created and maintained with ❤️ by developers from SAP in Berlin.

Requirements: ['fabric >= 2.2', 'stormssh', 'cloudpickle', 'distributed', 'psutil']

Download files

Files for lazycluster, version 0.1.1
Filename (size) File type Python version
lazycluster-0.1.1-py3-none-any.whl (36.5 kB) Wheel py3
lazycluster-0.1.1.tar.gz (30.6 kB) Source None