Parallel execution of DVC stages

paraffin

Paraffin, named after the Latin phrase parum affinis ("little related"), is a Python package for running DVC stages in parallel. DVC does not currently support this directly, and Paraffin provides a workaround. For more details, refer to the DVC documentation on parallel stage execution.

[!WARNING] paraffin is still very experimental. Do not use it for production workflows.

Installation

Install Paraffin via pip:

pip install paraffin

Usage

To use Paraffin, pass the names of the DVC stages you want to run to the paraffin command. This queues them for execution. Then start a Celery worker to process the queue.

paraffin <stage name> <stage name> ... <stage name>
# run max 4 jobs in parallel
celery -A paraffin.worker worker --loglevel=WARNING --concurrency=4
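The two steps above can also be scripted. The following is a minimal sketch that only assembles and optionally runs the two commands shown above; it does not use any Paraffin Python API. The helper names (`queue_command`, `worker_command`, `run`) and the stage names `train` and `evaluate` are hypothetical and chosen for illustration.

```python
import shlex
import subprocess


def queue_command(stage_names):
    """Build the `paraffin <stage name> ...` command that queues the stages."""
    return ["paraffin", *stage_names]


def worker_command(concurrency=4):
    """Build the Celery worker command from the usage example."""
    return [
        "celery", "-A", "paraffin.worker", "worker",
        "--loglevel=WARNING", f"--concurrency={concurrency}",
    ]


def run(stage_names, concurrency=4, dry_run=True):
    """Print both commands; execute them in order only if dry_run is False."""
    commands = [queue_command(stage_names), worker_command(concurrency)]
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # The worker command keeps running until it is stopped.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # "train" and "evaluate" are placeholder stage names (assumption).
    run(["train", "evaluate"])
```

With `dry_run=True` (the default) the script only prints the commands, which is a cheap way to check them before launching a worker.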

If you have installed dash (pip install dash), you can also open the dashboard by running:

paraffin --dashboard <stage names>

For more information, run:

paraffin --help

Labels

You can run paraffin in multiple processes (e.g. on different hardware with a shared file system). To specify where a stage should run, you can assign labels to each worker.

paraffin --labels GPU # on a GPU node
paraffin --labels CPU intel # on a CPU node

To configure which labels a stage requires, create a paraffin.yaml file that lists the labels under each stage name:

labels:
    GPU_TASK:
        - GPU
    CPU_TASK:
        - CPU
    SPECIAL_CPU_TASK:
        - CPU
        - intel

Stages that are not listed in paraffin.yaml can run on any available worker.
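The exact rule Paraffin uses to match stages to workers is not spelled out here. As an illustration only, the sketch below assumes that a worker is eligible for a stage when the worker's labels include every label listed for that stage, and that unlisted stages match every worker. The function `eligible_workers` and the worker names are hypothetical and not part of Paraffin.

```python
# Hypothetical illustration; this is not Paraffin's implementation.
# The dict mirrors the paraffin.yaml example above.
STAGE_LABELS = {
    "GPU_TASK": ["GPU"],
    "CPU_TASK": ["CPU"],
    "SPECIAL_CPU_TASK": ["CPU", "intel"],
}


def eligible_workers(stage, workers, stage_labels=STAGE_LABELS):
    """Return the sorted names of workers assumed able to run `stage`.

    Assumption: a worker qualifies if it carries all labels required by
    the stage. A stage without an entry requires no labels, so every
    worker qualifies.
    """
    required = set(stage_labels.get(stage, []))
    return sorted(
        name for name, labels in workers.items() if required <= set(labels)
    )


# Two workers, labeled as in the `paraffin --labels ...` examples above.
workers = {
    "gpu-node": ["GPU"],
    "cpu-node": ["CPU", "intel"],
}

for stage in ["GPU_TASK", "CPU_TASK", "SPECIAL_CPU_TASK", "UNLISTED_TASK"]:
    print(stage, "->", eligible_workers(stage, workers))
```

Under this assumption, `SPECIAL_CPU_TASK` would need a worker started with both the `CPU` and `intel` labels, while a stage missing from the file could be picked up by either worker.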

[!TIP] If you are building Python-based workflows with DVC, consider trying our other project ZnTrack for a more Pythonic way to define workflows.
