pipeline library

Project description

Drain is a lightweight framework for writing reproducible data science workflows in Python. The core features are:

  • Turn a Python workflow (DAG) into steps that can be run by a tool like make.

  • Transparently pass the results of one step as the input to another, handling any caching that the user requests using efficient tools like HDF and joblib.

  • Enable easy parallel execution of workflows.

  • Execute only those steps that are determined to be necessary based on timestamps (both source code and data) and dependencies, virtually guaranteeing reproducibility of results and efficient development.
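The features above can be illustrated with a small, generic sketch. This is not Drain's API: `is_stale`, `run_workflow`, the `steps` dictionary layout, and the `.pkl` cache files are all hypothetical names chosen for illustration, and `pickle` stands in for the HDF and joblib storage mentioned above. The sketch only compares data timestamps (Drain is described as also considering source code timestamps), and it assumes Python 3.9 or later for `graphlib`.

```python
import os
import pickle
import tempfile
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def is_stale(target, dependency_paths):
    """Return True if target is missing or older than any dependency file."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(p) > target_mtime for p in dependency_paths)


def run_workflow(steps, cache_dir):
    """Run steps in dependency order, reusing cached results that are up to date.

    `steps` (hypothetical layout) maps a step name to a pair:
    (function, list of names of the steps whose results it takes as input).
    Returns (results by step name, names of the steps that actually ran).
    """
    # Map each step to its predecessors so they are ordered first.
    graph = {name: set(inputs) for name, (_, inputs) in steps.items()}
    results, executed = {}, []
    for name in TopologicalSorter(graph).static_order():
        func, inputs = steps[name]
        target = os.path.join(cache_dir, name + ".pkl")
        input_targets = [os.path.join(cache_dir, i + ".pkl") for i in inputs]
        if is_stale(target, input_targets):
            # Pass the upstream results straight into this step.
            results[name] = func(*(results[i] for i in inputs))
            with open(target, "wb") as f:
                pickle.dump(results[name], f)
            executed.append(name)
        else:
            # Cached result is at least as new as its inputs: load it.
            with open(target, "rb") as f:
                results[name] = pickle.load(f)
    return results, executed


# A three-step linear workflow: load -> double -> total.
steps = {
    "load": (lambda: [1, 2, 3], []),
    "double": (lambda xs: [2 * x for x in xs], ["load"]),
    "total": (lambda xs: sum(xs), ["double"]),
}

with tempfile.TemporaryDirectory() as cache_dir:
    # First run: nothing is cached, so every step executes.
    results, executed = run_workflow(steps, cache_dir)
    print(results["total"], executed)
    # Second run: cached results are up to date, so they are loaded instead.
    results, executed = run_workflow(steps, cache_dir)
    print(results["total"], executed)
```

The design choice mirrors make: a step reruns only when its output is missing or older than one of its inputs, so touching an upstream result forces every downstream step to rerun while untouched branches are loaded from cache.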

Drain is designed around these principles:

  • Simplicity: Drain is lightweight and easy to use. The core is just a few hundred lines of code. The steps you write in Drain are executed with minimal overhead, which makes Drain workflows easy to debug and manage.

  • Reusability: Drain leverages mature tools such as drake to execute workflows. Drain also provides a library of steps for data science workflows, including feature generation and selection, and model fitting and comparison.

  • Generality: Virtually any workflow can be realized in drain. The core was written with extensibility in mind so new storage backends and job schedulers, for example, will be easy to incorporate.

Download files

Source Distribution

drain-0.0.6.tar.gz (117.5 kB)

Uploaded Source

Built Distribution

drain-0.0.6-py2.py3-none-any.whl (49.0 kB)

Uploaded Python 2 Python 3

File details

Details for the file drain-0.0.6.tar.gz.

File metadata

  • Download URL: drain-0.0.6.tar.gz
  • Upload date:
  • Size: 117.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for drain-0.0.6.tar.gz
Algorithm Hash digest
SHA256 7b6dbeaa76d500921f1f8d06f8b80b3bdff765d85e003c59b69d2e86cb0f89c9
MD5 25cda31fb1973552fe584b4572775202
BLAKE2b-256 c2fdb9691408743365989dba62a73386e78e14ea32fecc20e731caa766f28949

File details

Details for the file drain-0.0.6-py2.py3-none-any.whl.

File hashes

Hashes for drain-0.0.6-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 5f3420bf9c6f42e12d293f7ddc3f5610799446c8cb73bdd9fee726bbe9fe44f6
MD5 8cf0c2755a5604a83bd5a21d7322cb94
BLAKE2b-256 8e45f3be7d94e44439917b3138c7605b77301a10f866fec91ad2e3bc34657f11
