A tiny pipeline builder

chkpt

What

chkpt is a zero-dependency, 100-line library that makes it easy to define and execute checkpointed pipelines.

It features:

  • Fluent pipeline construction
  • Transparent caching of expensive operations
  • JSON serialization

How

Defining a Stage

Stages are the atomic units of work in chkpt; each corresponds to a single Python function. An existing function only needs the @chkpt.Stage.wrap() decorator to become a Stage:

import chkpt

@chkpt.Stage.wrap()
def stage1():
  return "123"

# stage1 is now a Stage instance
assert isinstance(stage1, chkpt.Stage)

# but the original function is still accessible
assert stage1.func() == "123"
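
To make the decorator pattern concrete, here is a minimal sketch of how a wrap()-style class method can replace a function with a wrapper object while keeping the original reachable. MiniStage and its name attribute are hypothetical stand-ins, not chkpt's actual implementation; only the .func attribute and the call shape of the decorator are taken from the example above.

```python
class MiniStage:
    """Hypothetical stand-in for a Stage; not chkpt's real class."""

    def __init__(self, func):
        self.func = func            # keep the original function reachable
        self.name = func.__name__   # assumption: name the stage after the function

    @classmethod
    def wrap(cls):
        # Returning a decorator lets the call site read `@MiniStage.wrap()`.
        def decorator(func):
            return cls(func)
        return decorator


@MiniStage.wrap()
def stage1():
    return "123"


# The decorated name now refers to the wrapper object...
assert isinstance(stage1, MiniStage)
# ...and the undecorated function is still available through .func.
assert stage1.func() == "123"
```

Using a decorator factory (wrap() with parentheses) rather than a bare decorator leaves room for options to be passed at decoration time without changing existing call sites.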

Stages can also accept parameters, which other Stages supply in the final Pipeline:

@chkpt.Stage.wrap()
def stage2(stage1_input):
  return [stage1_input, "456"]

Defining a Pipeline

Pipelines define the execution graph of Stages to be run. Stages are combined with the shift operators (<< and >>), which point in the direction of the dataflow:

# Each line defines the same pipeline: compute `stage1`, then pass its output to `stage2`.
pipeline = stage1 >> stage2
pipeline = stage2 << stage1
pipeline = stage2 << (stage1,)
pipeline = (stage1,) >> stage2
pipeline = () >> stage1 >> stage2 
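
The five spellings above work because Python lets a class define both the normal and the reflected shift operators. The following sketch shows one way this could be done. Node is a hypothetical class that plays the role of both stage and pipeline; chkpt's own classes may be organized differently.

```python
class Node:
    """Hypothetical pipeline node: a function plus its upstream nodes."""

    def __init__(self, func, inputs=()):
        self.func = func
        self.inputs = tuple(inputs)

    def _fed_by(self, upstream):
        # Accept a single node or a tuple of nodes as the upstream side.
        if isinstance(upstream, Node):
            upstream = (upstream,)
        return Node(self.func, self.inputs + tuple(upstream))

    def __rshift__(self, other):    # node >> other
        return other._fed_by(self)

    def __lshift__(self, other):    # node << other (other may be a tuple)
        return self._fed_by(other)

    def __rrshift__(self, other):   # (a, b) >> node; tuples define no __rshift__
        return self._fed_by(other)

    def __call__(self):
        # Evaluate upstream nodes first, then pass their results positionally.
        return self.func(*(node() for node in self.inputs))


stage1 = Node(lambda: "123")
stage2 = Node(lambda stage1_input: [stage1_input, "456"])

variants = [
    stage1 >> stage2,
    stage2 << stage1,
    stage2 << (stage1,),
    (stage1,) >> stage2,
    () >> stage1 >> stage2,
]
assert all(pipeline() == ["123", "456"] for pipeline in variants)
```

The reflected method __rrshift__ is what makes a plain tuple usable on the left-hand side: Python falls back to it when the left operand does not implement >> itself.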

Build more complex pipelines from the leaves down, naming intermediate results so they can feed later stages:

result1 = (stage1, stage2) >> stage3
result2 = (result1, stage1) >> stage4
pipeline = result2 >> stage5
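
In the graph above, stage1 feeds both stage3 and stage4. The sketch below shows one common way to evaluate such a graph: resolve inputs recursively and remember each result so a shared stage is computed only once per run. The run function, the dictionary layout, and the stage bodies are all hypothetical; this text does not say whether chkpt uses the same strategy.

```python
def run(graph, target, results=None):
    """Evaluate `target` after its inputs, computing each stage at most once."""
    results = {} if results is None else results
    if target not in results:
        func, inputs = graph[target]
        args = [run(graph, name, results) for name in inputs]
        results[target] = func(*args)
    return results[target]


calls = []  # records how often the shared leaf actually runs


def stage1():
    calls.append("stage1")
    return 2


# Same shape as the pipeline above: name -> (function, input names).
graph = {
    "stage1": (stage1, ()),
    "stage2": (lambda: 3, ()),
    "stage3": (lambda a, b: a + b, ("stage1", "stage2")),
    "stage4": (lambda r, s: r * s, ("stage3", "stage1")),
    "stage5": (lambda x: f"result={x}", ("stage4",)),
}

result = run(graph, "stage5")
assert result == "result=10"
assert calls == ["stage1"]  # the shared leaf ran once despite two consumers
```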

Executing a Pipeline

Calling a pipeline directly executes it with the default Config settings:

result = pipeline()

To override the defaults, pass a Config instance:

# Will store all stage results and attempt to load already-stored results, if present.
result = pipeline(chkpt.Config(store=True, load=True, dir='/tmp'))
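
As a rough illustration of what store and load mean in practice, the following sketch checkpoints a single function to a JSON file. The run_checkpointed helper and the one-file-per-stage layout are assumptions made for this example, not chkpt's documented behavior.

```python
import json
import tempfile
from pathlib import Path


def run_checkpointed(name, func, directory, store=True, load=True):
    """Return func()'s result, reusing <directory>/<name>.json when allowed."""
    path = Path(directory) / f"{name}.json"  # assumed file layout
    if load and path.exists():
        # A stored result exists: skip the computation entirely.
        return json.loads(path.read_text())
    result = func()
    if store:
        path.write_text(json.dumps(result))
    return result


calls = []  # counts real executions of the expensive function


def expensive():
    calls.append(1)
    return {"rows": [1, 2, 3]}


with tempfile.TemporaryDirectory() as tmp:
    first = run_checkpointed("expensive", expensive, tmp)   # computes and stores
    second = run_checkpointed("expensive", expensive, tmp)  # loads from disk

assert first == second == {"rows": [1, 2, 3]}
assert len(calls) == 1
```

One consequence of JSON as the storage format is that a loaded value can differ in type from the stored one: tuples come back as lists and non-string dictionary keys come back as strings.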

Examples

For detailed usage, see the examples/ directory.

The following is a brief example pipeline:

import chkpt


@chkpt.Stage.wrap()
def make_dataset1():
  ...

@chkpt.Stage.wrap()
def big_download2():
  ...

@chkpt.Stage.wrap()
def work_in_progress_analysis(dataset1, dataset2):
  ...

pipeline = (make_dataset1, big_download2) >> work_in_progress_analysis
# The expensive inputs run only once; later runs reuse their cached results.
result = pipeline(chkpt.Config(load=[make_dataset1, big_download2]))
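
The two Config examples pass load either as a boolean or as a list of stages. The sketch below shows one way such an option could be interpreted. The should_load helper and its semantics are assumptions, and it identifies stages by name for simplicity, whereas the example above passes the Stage objects themselves.

```python
def should_load(load_setting, stage_name):
    """Decide whether a stage may reuse a stored result.

    Assumed semantics: a bool applies to every stage, while a
    collection names the only stages that may be loaded.
    """
    if isinstance(load_setting, bool):
        return load_setting
    return stage_name in load_setting


selected = ["make_dataset1", "big_download2"]

assert should_load(True, "work_in_progress_analysis")
assert should_load(selected, "make_dataset1")
# The stage still under development is recomputed on every run.
assert not should_load(selected, "work_in_progress_analysis")
```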
