
Jug: A Task-Based Parallelization Framework


Jug allows you to write code that is broken up into tasks and run different tasks on different processors.

Join the chat at https://gitter.im/luispedro/jug

It uses the filesystem to communicate between processes and works correctly over NFS, so you can coordinate processes on different machines.
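To make the idea of coordinating through the filesystem concrete, here is a minimal sketch of one common technique: claiming a task by atomically creating a lock file. This is an illustration only, not Jug's actual code; the names try_lock and release_lock and the ".lock" naming scheme are hypothetical.

```python
import os
import tempfile

def try_lock(lock_dir, task_name):
    """Try to claim a task by atomically creating a lock file.

    Returns True if this process created the file (and so owns the task),
    False if the file already existed (someone else claimed the task).
    Hypothetical helper for illustration; not part of Jug's API.
    """
    path = os.path.join(lock_dir, task_name + ".lock")
    try:
        # O_CREAT | O_EXCL fails if the file already exists, so only one
        # caller can succeed for a given path.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))  # record who holds the lock
    return True

def release_lock(lock_dir, task_name):
    """Release a previously claimed task by deleting its lock file."""
    os.remove(os.path.join(lock_dir, task_name + ".lock"))

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(try_lock(d, "is_prime-7"))   # first claim succeeds
        print(try_lock(d, "is_prime-7"))   # second claim is refused
        release_lock(d, "is_prime-7")
        print(try_lock(d, "is_prime-7"))   # free again after release
```

Because every process only needs to see the same directory, the same pattern can in principle be used by processes on different machines that share a filesystem.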

Jug is a pure Python implementation and should work on any platform.

Python 2.6/2.7 and Python 3.5+ are supported.

Website: http://luispedro.org/software/jug

Documentation: https://jug.readthedocs.org/

Video: On vimeo or showmedo

Mailing List: http://groups.google.com/group/jug-users

Testimonials

“I’ve been using jug with great success to distribute the running of a reasonably large set of parameter combinations” - Andreas Longva

Install

You can install Jug with pip:

pip install Jug

Or, if you are using conda, you can install Jug from conda-forge with the following commands:

conda config --add channels conda-forge
conda install jug

Citation

If you use Jug to generate results for a scientific publication, please cite

Coelho, L.P., (2017). Jug: Software for Parallel Reproducible Computation in Python. Journal of Open Research Software. 5(1), p.30.

http://doi.org/10.5334/jors.161

Short Example

Here is a one minute example. Save the following to a file called primes.py (if you have installed jug, you can obtain a slightly longer version of this example by running jug demo on the command line):

from jug import TaskGenerator
from time import sleep

@TaskGenerator
def is_prime(n):
    sleep(1.)
    for j in range(2,n-1):
        if (n % j) == 0:
            return False
    return True

primes100 = [is_prime(n) for n in range(2,101)]

This is a brute-force way to find all the prime numbers up to 100. Of course, this is only for didactic purposes; normally you would use a better method. Similarly, the sleep call is only there so that the example does not run too fast. Still, it illustrates the basic functionality of Jug for embarrassingly parallel problems.
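For comparison, the sketch below runs the same independent checks using only the standard library. It is not how Jug works: everything happens inside a single process, and nothing is coordinated through the filesystem. The point is only that each call depends on nothing but its own argument, which is what makes the problem embarrassingly parallel. The sleep call is omitted, and threads are used purely to keep the sketch short.

```python
from concurrent.futures import ThreadPoolExecutor

def is_prime(n):
    # Same brute-force test as in primes.py, without the sleep.
    for j in range(2, n - 1):
        if n % j == 0:
            return False
    return True

numbers = list(range(2, 101))

# Each call is independent, so the work can be handed out in any order.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(is_prime, numbers))  # order matches `numbers`

primes = [n for n, ok in zip(numbers, results) if ok]
print(len(primes))  # 25 primes up to 100
```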

Type jug status primes.py to get:

Task name                  Waiting       Ready    Finished     Running
----------------------------------------------------------------------
primes.is_prime                  0          99           0           0
......................................................................
Total:                           0          99           0           0

This tells you that you have 99 tasks called primes.is_prime ready to run. So run jug execute primes.py &. You can even run multiple instances in the background (if you have multiple cores, for example). After starting 4 instances and waiting a few seconds, you can check the status again (with jug status primes.py):

Task name                  Waiting       Ready    Finished     Running
----------------------------------------------------------------------
primes.is_prime                  0          63          32           4
......................................................................
Total:                           0          63          32           4

Now you have 32 tasks finished, 4 running, and 63 still ready. Eventually, they will all finish and you can inspect the results with jug shell primes.py. This will give you an IPython shell. The primes100 variable is available, but it is a list of jug.Task objects rather than results. To get the actual values, call the value function:

In [1]: primes100 = value(primes100)

In [2]: primes100[:10]
Out[2]: [True, True, False, True, False, True, False, False, False, True]
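The distinction between a list of task objects and a list of values can be illustrated with a toy model of deferred execution. This is a sketch under stated assumptions, not Jug's implementation: ToyTask, toy_task_generator, and toy_value are hypothetical names, and the toy keeps results in memory only.

```python
class ToyTask:
    """A deferred function call: records what to run, runs it on demand."""

    def __init__(self, func, args):
        self.func = func
        self.args = args
        self._done = False
        self._result = None

    def run(self):
        # Compute the result once and reuse it on later calls.
        if not self._done:
            self._result = self.func(*self.args)
            self._done = True
        return self._result

def toy_task_generator(func):
    """Decorator: calling the function builds a ToyTask instead of running it."""
    def make_task(*args):
        return ToyTask(func, args)
    return make_task

def toy_value(obj):
    """Recursively replace ToyTask objects (also inside lists) with results."""
    if isinstance(obj, ToyTask):
        return obj.run()
    if isinstance(obj, list):
        return [toy_value(x) for x in obj]
    return obj

@toy_task_generator
def is_prime(n):
    for j in range(2, n - 1):
        if n % j == 0:
            return False
    return True

tasks = [is_prime(n) for n in range(2, 12)]  # nothing has been computed yet
print(toy_value(tasks))
# [True, True, False, True, False, True, False, False, False, True]
```

The printed list matches the first ten entries shown in the shell session above, since both cover n = 2 through 11.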

What’s New

Version 2.0.2 (Thu Jun 11 2020)

  • Fix command line argument parsing

Version 2.0.1 (Thu Jun 11 2020)

  • Fix handling of JUG_EXIT_IF_FILE_EXISTS environmental variable
  • Fix passing an argument to jug.main() function
  • Extend --pdb to exceptions raised while importing the jugfile (issue #79)

Version 2.0.0 (Fri Feb 21 2020)

  • jug.backend.base_store has 1 new method ‘listlocks’
  • jug.backend.base_lock has 2 new methods ‘fail’ and ‘is_failed’
  • Add ‘jug execute --keep-failed’ to preserve locks on failing tasks
  • Add ‘jug cleanup --failed-only’ to remove locks from failed tasks
  • ‘jug status’ and ‘jug graph’ now display failed tasks
  • Check environmental exit variables by default (suggested by Renato Alves, issue #66)
  • Fix ‘jug sleep-until’ in the presence of barrier() (issue #71)

Version 1.6.9 (Tue Aug 6 2019)

  • Fix saving on newer version of numpy

Version 1.6.8 (Wed Jul 10 2019)

  • Add cached_glob() function
  • Fix NoLoad (issue #73)
  • Fix jug shell’s invalidate function with Tasklets (issue #77)

For older versions, see the ChangeLog file.

