
Parallel execution of DVC stages

paraffin

Paraffin, derived from the Latin phrase parum affinis ("little related"), is a Python package for running DVC stages in parallel. DVC does not currently support this directly; Paraffin provides a workaround. For more details, refer to the DVC documentation on parallel stage execution.

[!WARNING] paraffin is still very experimental. Do not use it for production workflows.

Installation

Install Paraffin via pip:

pip install paraffin

Usage

To use Paraffin, queue the DVC stages you want to run, then start a Celery worker to execute them:

paraffin <stage name> <stage name> ... <stage name>
# run max 4 jobs in parallel
celery -A paraffin.worker worker --loglevel=WARNING --concurrency=4
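
If you prefer to drive these two steps from a Python script, a minimal sketch could build the same command lines shown above and hand them to `subprocess`. The helper names `queue_command` and `worker_command` and the stage names in the example are hypothetical; only the CLI invocations themselves come from this README.

```python
import shlex


def queue_command(stage_names):
    """Build the argv that queues the given DVC stages with paraffin."""
    return ["paraffin", *stage_names]


def worker_command(concurrency=4, loglevel="WARNING"):
    """Build the argv that starts a Celery worker for the queued stages."""
    return [
        "celery",
        "-A",
        "paraffin.worker",
        "worker",
        f"--loglevel={loglevel}",
        f"--concurrency={concurrency}",
    ]


if __name__ == "__main__":
    # "train" and "evaluate" are placeholder stage names, not part of paraffin.
    print(shlex.join(queue_command(["train", "evaluate"])))
    print(shlex.join(worker_command(concurrency=4)))
```

Each list can be passed to `subprocess.run(cmd, check=True)` from the directory containing your DVC project; the worker command blocks until the worker is stopped.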

If you have installed dash (pip install dash), you can also access the dashboard by running:

paraffin --dashboard <stage names>

For more information, run:

paraffin --help

Labels

You can run paraffin in multiple processes (e.g. on different hardware with a shared file system). To specify where a stage should run, you can assign labels to each worker.

paraffin --labels GPU # on a GPU node
paraffin --labels CPU intel # on a CPU node

To configure the stages you need to create a paraffin.yaml file as follows:

labels:
    GPU_TASK:
        - GPU
    CPU_TASK:
        - CPU
    SPECIAL_CPU_TASK:
        - CPU
        - intel

Stages that are not listed in paraffin.yaml can run on any of the available workers.
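
To make the label rules concrete, here is a small sketch of how a stage could be matched to a worker. It is not paraffin's implementation: it assumes that a worker may run a stage only if it carries every label listed for that stage, and that unlisted stages have no requirements. The function `can_run` is hypothetical, and the dict mirrors the paraffin.yaml example above so that no YAML parser is needed.

```python
# Same content as the paraffin.yaml example, written as a plain dict.
STAGE_LABELS = {
    "GPU_TASK": ["GPU"],
    "CPU_TASK": ["CPU"],
    "SPECIAL_CPU_TASK": ["CPU", "intel"],
}


def can_run(stage, worker_labels, stage_labels=STAGE_LABELS):
    """Return True if a worker with `worker_labels` may run `stage`.

    Assumption: the worker needs all labels listed for the stage;
    a stage missing from the mapping requires no labels at all.
    """
    required = set(stage_labels.get(stage, []))
    return required.issubset(worker_labels)


gpu_worker = {"GPU"}           # started with: paraffin --labels GPU
cpu_worker = {"CPU", "intel"}  # started with the labels CPU and intel

if __name__ == "__main__":
    for stage in ["GPU_TASK", "CPU_TASK", "SPECIAL_CPU_TASK", "UNLISTED_TASK"]:
        print(stage, can_run(stage, gpu_worker), can_run(stage, cpu_worker))
```

Under this assumption, SPECIAL_CPU_TASK would be picked up only by the worker that has both CPU and intel, while a stage absent from the file would be eligible on either worker.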

[!TIP] If you are building Python-based workflows with DVC, consider trying our other project ZnTrack for a more Pythonic way to define workflows.
