
Run Python jobs in parallel on the cloud using DISCO

Project description


discomp

Dis.co multiprocessing Python package

discomp is a package that distributes computing jobs using the Dis.co service. It exposes an API similar to Python's multiprocessing package.

For more information about Dis.co itself, please check out the Dis.co homepage.

Overview

class discomp.Process(name, target, args=())

Instantiating the Process class creates a new job with a single task in a 'waiting' state and submits it to the Dis.co computing service.

name

The job's name. The name is a string used for identification purposes only. It does not have to be unique.

target

The callable object to be invoked by the running job.

args

The argument tuple for the target invocation. By default, no arguments are passed to target.

start()

Start running the job on one of the machines.

This must be called at most once per process object.

join(timeout=None)

Join blocks the calling thread until the job is done. Upon successful completion of the job, result files are downloaded to a new directory, named after the job, within the working directory.

Currently, timeout must always be None.

A process should be joined at most once. The job may already be done by the time join is called; however, the results are downloaded only upon calling join.

class discomp.Pool(processes=None)

Instantiating the Pool class creates an object that is later used to run a job with one or more tasks executed across many machines, by invoking its map() method. The Pool class does not make use of the processes argument and has no control over the number of machines used to run the job's tasks. The number of machines is determined separately.

map(func, iterable, chunksize=None)

  1. Pool.map applies the same function to many sets of arguments.
  2. It creates a job that runs each set of arguments as a separate task on one of the machines in the "pool".
  3. It blocks until the result is ready (i.e. all of the job's tasks are done).
  4. The results are returned in the original order (corresponding to the order of the arguments).
  5. Job-related files (in addition to the script, input, and config files that were used to run the task) are downloaded automatically when the job is done, into a directory named after the function, within the working directory.
  6. The function's arguments should be provided as an iterable.
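Because discomp's map mirrors the semantics of applying a function over an iterable with results returned in input order, the ordering guarantee can be checked locally without the service (a sketch; no Dis.co job is involved here):

```python
def pow3(x):
    return x ** 3

# Pool.map(pow3, range(5)) is documented to return results in the
# original argument order, i.e. the same values as the sequential:
results = [pow3(x) for x in range(5)]
print(results)  # [0, 1, 8, 27, 64]
```

With discomp, the same call would run each element of the iterable as a separate task on a machine in the pool.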

starmap(func, iterable, chunksize=None)

  1. Pool.starmap is similar to Pool.map, but it applies the same function to many sets of multiple arguments.
  2. The function's arguments should be provided as an iterable whose elements are themselves iterables that are unpacked as arguments. Hence an iterable of [(1, 2), (3, 4)] results in [func(1, 2), func(3, 4)].
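The unpacking behavior described above matches the standard library's itertools.starmap, so it can be illustrated locally (a sketch; no Dis.co job is involved here):

```python
from itertools import starmap

def func(a, b):
    return a + b

# Each inner tuple is unpacked into func's positional arguments,
# just as discomp's Pool.starmap does for its job tasks.
results = list(starmap(func, [(1, 2), (3, 4)]))
print(results)  # [3, 7]
```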

Installation:

  1. Sign up in the Dis.co dashboard:

    https://app.dis.co/signup

  2. Install discomp package:

 pip install discomp

or

 pip3 install discomp

Usage:

  1. Write your Python script as if you were using the multiprocessing package, but instead of importing Process and Pool from multiprocessing, import them from discomp.
  2. Set up the environment variables with your Dis.co account's user name and password (see the examples below).

Examples:

A trivial example using the Process class:

import os
from discomp import Process

os.environ['DISCO_LOGIN_USER'] = 'username@mail.com'
os.environ['DISCO_LOGIN_PASSWORD'] = 'password'

def func(name):
    print('Hello', name)

p = Process(
    name='MyFirstJobExample',
    target=func,
    args=('Bob',))

p.start()
p.join()

Output (produced by the job's task on the remote machine and available in the downloaded result files):

Hello Bob

A basic example using the Pool class:

import os
from discomp import Pool

os.environ['DISCO_LOGIN_USER'] = 'username@mail.com'
os.environ['DISCO_LOGIN_PASSWORD'] = 'password'

def pow3(x):
    print(x ** 3)
    return x ** 3

p = Pool()
results = p.map(pow3, range(10))
print(results)

Output (the per-task prints run on the remote machines; the local print(results) shows):

[0, 1, 8, 27, 64, 125, 216, 343, 512, 729]

Advanced features

You can add additional configuration for your jobs by using the disco CLI, which is installed automatically along with discomp. You can configure the cluster, the machine size, and the Docker image by running

 disco config

from the command line.

Contact us:

Please feel free to contact us at Dis.co for further information.

Download files

Download the file for your platform.

Source Distribution

discomp-1.25.0.tar.gz (11.1 kB)

Built Distribution

discomp-1.25.0-py3-none-any.whl (12.5 kB)

File details

Details for the file discomp-1.25.0.tar.gz.

File metadata

  • Download URL: discomp-1.25.0.tar.gz
  • Upload date:
  • Size: 11.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.7.7

File hashes

Hashes for discomp-1.25.0.tar.gz
Algorithm Hash digest
SHA256 62e85d2d59fcc8ebe035b7a0f06dc2e3d2626ec5d567405cdd4237a19a91987a
MD5 e7d6d1c3aaf409da8adc22b9ca75ecf2
BLAKE2b-256 8b508f87ba5c1523dc8e232089a5f823ea09505fa915deb473629617b88a835f


File details

Details for the file discomp-1.25.0-py3-none-any.whl.

File metadata

  • Download URL: discomp-1.25.0-py3-none-any.whl
  • Upload date:
  • Size: 12.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.7.7

File hashes

Hashes for discomp-1.25.0-py3-none-any.whl
Algorithm Hash digest
SHA256 fb9ba2f0875e89ab79c7fb4092ce479254f42a5889b8ee7039f42b5d1d3ee34f
MD5 915611891c0471a7851f85d2a38d863a
BLAKE2b-256 0902d91b493540764344e41961b593e3eb965a9103b86673dfd9b05343a86865

