A lightweight DAG system for data analysis dev ops

Project description

dagdog

Lightweight DAGs for data analysis dev ops. By the time you finish a project with this lil puppy, you'll never want to use a Jupyter notebook again, because:

  • you will have wasted less time waiting for your notebook to rerun, because you iterate rapidly on modular components instead of rerunning your entire analysis after every code update.
  • your code will already be closer to production-ready, as well as more understandable and reliable.

Note that this is NOT a replacement for tools like dagster or prefect. Those tools emphasize production and monitoring, while dagdog is strictly a dev ops tool that lets a user rapidly sketch out an analysis pipeline without worrying about production considerations. This allows proofs-of-concept to grow faster (and fail faster).

Getting started

See the demo. In brief, here's how to work with dagdog:

  • Structure your data analysis pipeline as a collection of Python modules, with each module defining exactly one task.
  • Within each module, implement the task as a function named __run__ that takes no arguments. (You might choose to call __run__ from under if __name__ == "__main__":, but dagdog does not care about that and will access __run__ directly.)
  • Create a project entrypoint script, imitating demo/project.py, to define the execution order of your tasks.
  • Run your project entrypoint, which drops you into an interactive Python session where you can call any of the various execution and introspection methods on the dog DAG object.

If working directly on this repo, consider using the simplest-possible virtual environment.

Design goals

  • Easiest-possible usage of a DAG to coordinate execution of a collection of tasks during data analysis dev ops.
  • Flexible commands for execution of the DAG, including running a task in isolation, running only upstream tasks, or running only downstream tasks, etc.
  • All configurations managed natively in Python, so users don't need to mess with yaml or json files.
  • Prioritize simplicity above feature-completeness.
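
To make the "upstream only" and "downstream only" execution modes concrete, here is a rough, standalone sketch of how such selections can be computed on a small DAG. It uses only the standard library's graphlib module and is NOT dagdog's API; the task names, the DEPENDENCIES mapping, and the helper functions are all hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
DEPENDENCIES = {
    "load": set(),
    "clean": {"load"},
    "features": {"clean"},
    "model": {"features"},
    "report": {"clean"},
}


def upstream(task, deps=DEPENDENCIES):
    """All tasks that `task` depends on, directly or transitively."""
    found = set()
    stack = list(deps[task])
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(deps[current])
    return found


def downstream(task, deps=DEPENDENCIES):
    """All tasks that depend on `task`, directly or transitively."""
    return {other for other in deps if task in upstream(other, deps)}


def execution_order(selected, deps=DEPENDENCIES):
    """Order the selected tasks so each runs after its selected dependencies."""
    subgraph = {task: deps[task] & selected for task in selected}
    return list(TopologicalSorter(subgraph).static_order())
```

For example, execution_order({"clean"} | upstream("clean")) gives ['load', 'clean'], while downstream("clean") selects features, model, and report. Running a task in isolation is then just the degenerate case of selecting a single task.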

Download files

Source Distribution

dagdog-0.1.6.tar.gz (8.0 kB)

Uploaded Source

Built Distribution

dagdog-0.1.6-py3-none-any.whl (8.4 kB)

Uploaded Python 3

File details

Details for the file dagdog-0.1.6.tar.gz.

File metadata

  • Download URL: dagdog-0.1.6.tar.gz
  • Upload date:
  • Size: 8.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.7.8

File hashes

Hashes for dagdog-0.1.6.tar.gz
Algorithm Hash digest
SHA256 439b0d90943408505b2b52327829079a749d8cb641d5ac2025dbe9856123fcee
MD5 b4c8eb4641e72c29806f0ecd40b54b3d
BLAKE2b-256 f78c6d91667b9512ea48d4d13d5a191252401a9ffa9a26fe87b178f25abdb29c

File details

Details for the file dagdog-0.1.6-py3-none-any.whl.

File metadata

  • Download URL: dagdog-0.1.6-py3-none-any.whl
  • Upload date:
  • Size: 8.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.7.8

File hashes

Hashes for dagdog-0.1.6-py3-none-any.whl
Algorithm Hash digest
SHA256 c3393528bc1c9866584fb6e090094d83c86e7e02c272eb5bd25381bfa187b47b
MD5 1cf72c266d0775a2befc30158ec2fc7d
BLAKE2b-256 9bc8a3fac0e08e5bbd5ad79e8bb919609be674cd62b8878b298a56344c0a5b73
