Distributed machine learning made simple.

lazycluster

lazycluster is a Python library intended to liberate data scientists and machine learning engineers by abstracting away cluster management and configuration so that they can focus on their actual tasks. In particular, it emphasizes easy and convenient cluster setup with Python for various distributed machine learning frameworks.

Highlights

Getting Started

Installation

pip install lazycluster

Usage Example

Prerequisite: Passwordless SSH must be set up for the hosts used.

from lazycluster import RuntimeTask, Runtime

# Define a Python function which will be executed remotely
def hello(name: str) -> str:
    return 'Hello ' + name + '!'

# Compose a `RuntimeTask`
task = RuntimeTask('my-first_task').run_command('echo Hello World!') \
                                   .run_function(hello, name='World')

# Actually execute it remotely in a `Runtime`                                   
task = Runtime('host-1').execute_task(task, execute_async=False)

# The stdout from the executing `Runtime` can be accessed via the execution log of the `RuntimeTask`
task.print_log()

# Print the return value of the `hello()` call
generator = task.function_returns
print(next(generator))
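
The example above only works if the passwordless SSH prerequisite is met. The following is a minimal sketch of how that could be checked up front using the plain OpenSSH client and the Python standard library. It is not part of lazycluster; the function names `build_ssh_check_command` and `has_passwordless_ssh` are hypothetical, and `host-1` is the placeholder host name used throughout this page.

```python
import subprocess
from typing import List


def build_ssh_check_command(host: str, timeout_s: int = 5) -> List[str]:
    """Build an OpenSSH command that fails instead of prompting for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                # never prompt for a password or passphrase
        "-o", f"ConnectTimeout={timeout_s}",  # give up quickly on unreachable hosts
        host,
        "true",                               # no-op command on the remote side
    ]


def has_passwordless_ssh(host: str, timeout_s: int = 5) -> bool:
    """Return True if `host` accepts a non-interactive ssh login."""
    try:
        result = subprocess.run(
            build_ssh_check_command(host, timeout_s),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s + 5,
        )
    except (OSError, subprocess.TimeoutExpired):
        # ssh binary not found, or the connection hung past the timeout
        return False
    return result.returncode == 0
```

Calling `has_passwordless_ssh('host-1')` before creating a `Runtime` gives an early, readable failure when key-based login is not configured for that host.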

Support

The lazycluster project is maintained by Jan Kalkan. Please understand that we won't be able to provide individual support via email. We also believe that help is much more valuable if it's shared publicly so that more people can benefit from it.

Type Channel
🚨 Bug Reports
🎁 Feature Requests
👩‍💻 Usage Questions
🗯 General Discussion

Features

Create Runtimes & RuntimeGroups

from lazycluster import Runtime, RuntimeGroup

rt_1 = Runtime('host-1')
rt_2 = Runtime('host-2', root_dir='/workspace')

# Create a RuntimeGroup from existing Runtimes ...
runtime_group = RuntimeGroup([rt_1, rt_2])
# ... or directly from host names
runtime_group = RuntimeGroup(hosts=['host-1', 'host-2'])

Use RuntimeManager to create a RuntimeGroup based on the local ssh config

from lazycluster import RuntimeManager, RuntimeGroup

runtime_group = RuntimeManager().create_group()
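
To make "based on the local ssh config" more concrete, here is a minimal sketch of how host aliases could be read from an OpenSSH client config. It is an illustration only and not how `RuntimeManager` is implemented; the helper name `list_ssh_config_hosts` is hypothetical, and the sketch handles only the whitespace-separated `Host alias` form.

```python
from typing import List


def list_ssh_config_hosts(config_text: str) -> List[str]:
    """Return concrete host aliases from ssh config text, skipping wildcard patterns."""
    hosts: List[str] = []
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        # ssh config keywords are case-insensitive; only `Host` lines declare aliases
        if len(parts) < 2 or parts[0].lower() != "host":
            continue
        for alias in parts[1].split():
            # Patterns containing "*", "?" or "!" do not name a single host
            if not any(ch in alias for ch in "*?!") and alias not in hosts:
                hosts.append(alias)
    return hosts
```

The text would typically come from `~/.ssh/config`, for example via `pathlib.Path.home() / '.ssh' / 'config'`. The resulting list has the same shape as the `hosts` argument of `RuntimeGroup` shown earlier.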

Easily launch a Dask cluster

from lazycluster import RuntimeManager
from lazycluster.cluster.dask_cluster import DaskCluster

cluster = DaskCluster(RuntimeManager().create_group())
cluster.start()

Expose a service from or to a Runtime

from lazycluster import Runtime

# Create a Runtime
runtime = Runtime('host-1')

# Make the port 50000 from the Runtime accessible on localhost
runtime.expose_port_from_runtime(50000)

# Make the local port 40000 accessible on the Runtime
runtime.expose_port_to_runtime(40000)
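
The two directions can be easy to mix up. As a conceptual aid, the sketch below builds the equivalent plain OpenSSH port-forwarding commands. This page does not state how lazycluster implements port exposure, so this is an analogy under that assumption, not the library's mechanism; both function names are hypothetical.

```python
from typing import List, Optional


def forward_from_host_command(
    host: str, remote_port: int, local_port: Optional[int] = None
) -> List[str]:
    """ssh -L: make a port on `host` reachable on localhost."""
    local_port = remote_port if local_port is None else local_port
    # -N: open the tunnel only, do not run a remote command
    return ["ssh", "-N", "-L", f"{local_port}:localhost:{remote_port}", host]


def forward_to_host_command(
    host: str, local_port: int, remote_port: Optional[int] = None
) -> List[str]:
    """ssh -R: make a local port reachable on `host`."""
    remote_port = local_port if remote_port is None else remote_port
    return ["ssh", "-N", "-R", f"{remote_port}:localhost:{local_port}", host]
```

Here `forward_from_host_command('host-1', 50000)` corresponds in direction to `expose_port_from_runtime(50000)`, and `forward_to_host_command('host-1', 40000)` to `expose_port_to_runtime(40000)`.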

Expose a service to a whole RuntimeGroup, or from one Runtime in the RuntimeGroup to all the others

from lazycluster import RuntimeGroup

# Create a RuntimeGroup
runtime_group = RuntimeGroup(hosts=['host-1', 'host-2', 'host-3'])

# Make the local port 50000 accessible on all Runtimes contained in the RuntimeGroup
runtime_group.expose_port_to_runtimes(50000)


# Make the port 40000 which is running on host-1 accessible on all other Runtimes in the RuntimeGroup
runtime_group.expose_port_from_runtime_to_group('host-1', 40000)

Contribution


Licensed Apache 2.0. Created and maintained with ❤️ by developers from SAP in Berlin.

Requirements: fabric >= 2.2, stormssh, cloudpickle, distributed, psutil
