Parallel execution of DVC stages

paraffin

Paraffin, derived from the Latin phrase parum affinis meaning little related, is a Python package designed to run DVC stages in parallel. While DVC does not currently support this directly, Paraffin provides an effective workaround. For more details, refer to the DVC documentation on parallel stage execution.

[!WARNING] paraffin is still very experimental. Do not use it for production workflows.

Installation

Install Paraffin via pip:

pip install paraffin

Usage

https://github.com/user-attachments/assets/c248e669-7737-450b-9fd7-5b9b8e82605a

paraffin submit

You can submit your current DVC workflow to a database file paraffin.db for later execution.

[!TIP] The paraffin submit command supports globbing patterns.

paraffin submit # submit all stages
paraffin submit C_AddNodeNumbers "A*" # select which stages to submit
paraffin submit --help # more information
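
To illustrate how glob patterns such as "A*" select stages, here is a minimal sketch using Python's standard fnmatch module. It is not Paraffin's actual implementation, and its matching rules may differ. The stage list and the select_stages helper are hypothetical.

```python
from fnmatch import fnmatchcase


def select_stages(stage_names, patterns):
    """Return the stages whose name matches at least one glob pattern.

    With no patterns, every stage is selected (like a bare `paraffin submit`).
    """
    if not patterns:
        return list(stage_names)
    return [
        name
        for name in stage_names
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    ]


# Hypothetical stage names, for illustration only.
stages = [
    "A_X_AddNodeNumbers",
    "A_Y_AddNodeNumbers",
    "B_X_AddNodeNumbers",
    "C_AddNodeNumbers",
]

# Mirrors: paraffin submit C_AddNodeNumbers "A*"
selected = select_stages(stages, ["C_AddNodeNumbers", "A*"])
print(selected)
```

Quoting the pattern on the command line, as in "A*", keeps the shell from expanding it against file names before paraffin receives it.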

paraffin worker

Submitted jobs are executed by paraffin workers. To start a worker, run paraffin worker. The worker picks up all jobs in its queue and exits once they are finished. You can set the number of stages a worker processes in parallel with paraffin worker --jobs <n>. Alternatively, you can start more workers by running the command multiple times.

paraffin worker
paraffin worker --help # more information
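
Conceptually, processing stages in parallel means running every stage whose dependencies have finished, up to a limit of n at a time. The sketch below shows that idea with Python's standard library only. It is not Paraffin's implementation, which uses a database file and separate worker processes. The run_in_parallel function, its jobs parameter, and the stage names are hypothetical.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_in_parallel(dependencies, run_stage, jobs=2):
    """Run every stage once, at most `jobs` at a time, dependencies first.

    `dependencies` maps a stage name to the set of stages it depends on.
    Returns the stage names in the order they finished.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    finished = []
    running = {}  # future -> stage name
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        while sorter.is_active():
            # Start every stage whose dependencies are all done.
            for stage in sorter.get_ready():
                running[pool.submit(run_stage, stage)] = stage
            # Block until at least one running stage completes.
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                future.result()  # re-raise any error from the stage
                stage = running.pop(future)
                finished.append(stage)
                sorter.done(stage)
    return finished


# Hypothetical pipeline: B depends on A_1 and A_2, which are independent.
deps = {"A_1": set(), "A_2": set(), "B": {"A_1", "A_2"}}
order = run_in_parallel(deps, run_stage=lambda stage: None, jobs=2)
print(order[-1])  # B is only started after both A stages are done
```

Here A_1 and A_2 may finish in either order, while B always comes last because it is submitted only after both have completed.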

paraffin ui

Paraffin ships with a web application for visualizing the progress of your workflow. You can start it with:

paraffin ui
paraffin ui --help # more information

The UI allows you to visualize the progress in real-time, restart jobs and manage workers.

https://github.com/user-attachments/assets/034325fd-7035-434f-9eb8-b47ae4ecbb86

Queue Labels

To fine-tune execution, you can assign stages to specific Celery queues, allowing you to manage execution across different environments or hardware setups. Define queues in a paraffin.yaml file:

queue:
    "B_X*": BQueue
    "A_X_AddNodeNumbers": AQueue

Then, start a worker that listens on specific queues, such as default and AQueue:

paraffin worker -q AQueue,default

All stages not assigned to a queue in paraffin.yaml will default to the default queue.
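
The routing described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Paraffin's code: it assumes patterns are matched as shell-style globs and that the first matching entry wins. The resolve_queue and stages_for_worker helpers and the stage names are hypothetical.

```python
from fnmatch import fnmatchcase

# Mirrors the `queue:` mapping from the paraffin.yaml example above.
QUEUE_PATTERNS = {
    "B_X*": "BQueue",
    "A_X_AddNodeNumbers": "AQueue",
}


def resolve_queue(stage_name, patterns=QUEUE_PATTERNS, fallback="default"):
    """Return the queue of the first matching pattern, else the fallback.

    First-match-wins is an assumption made for this sketch.
    """
    for pattern, queue in patterns.items():
        if fnmatchcase(stage_name, pattern):
            return queue
    return fallback


def stages_for_worker(stage_names, worker_queues):
    """Stages that a worker listening on `worker_queues` may pick up."""
    return [name for name in stage_names if resolve_queue(name) in worker_queues]


# Hypothetical stage names, for illustration only.
stages = ["A_X_AddNodeNumbers", "B_X_AddNodeNumbers", "C_AddNodeNumbers"]

# Mirrors: paraffin worker -q AQueue,default
eligible = stages_for_worker(stages, {"AQueue", "default"})
print(eligible)
```

In this sketch, B_X_AddNodeNumbers is left for a worker that listens on BQueue, while C_AddNodeNumbers matches no pattern and falls back to the default queue.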

[!TIP] If you are building Python-based workflows with DVC, consider trying our other project ZnTrack for a more Pythonic way to define workflows.
