Checkpoint

ediblepickle is an Apache 2.0 licensed checkpointing (https://en.wikipedia.org/wiki/Application_checkpointing) utility.

The simplest use case is to checkpoint an expensive computation so that it need not be repeated every time the program is executed.

import string
import time

from ediblepickle import checkpoint


# A checkpointed expensive function.
# refresh=False reuses an existing checkpoint; refresh=True flushes it and recomputes.
@checkpoint(key=string.Template('m{0}_n{1}_${iterations}_$stride.csv'),
            work_dir='/tmp/intermediate_results',
            refresh=False)
def expensive_computation(m, n, iterations=4, stride=1):
    for i in range(iterations):
        time.sleep(1)
    return range(m, n, stride)


# First call: evaluates the function and saves the result.
begin = time.time()
expensive_computation(-100, 200, iterations=4, stride=2)
time_taken = time.time() - begin

print(time_taken)

# Second call: the checkpoint exists, so the result is loaded from that file and returned.
begin = time.time()
expensive_computation(-100, 200, iterations=4, stride=2)
time_taken = time.time() - begin

print(time_taken)
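
The key above mixes two placeholder styles: {0} and {1} look like str.format positions, while ${iterations} and $stride are string.Template names. The following is a minimal sketch of how such a key could expand into a file name, assuming positional arguments fill the numbered slots and keyword arguments fill the named ones. expand_key is a hypothetical helper written for illustration; it is not part of ediblepickle.

```python
import string


def expand_key(key, args, kwargs):
    """Hypothetical helper: expand a Template key into a file name.

    Assumption: keyword arguments fill the $name placeholders and
    positional arguments fill the {0}, {1}, ... placeholders.
    """
    partially_filled = key.substitute(kwargs)  # fills ${iterations} and $stride
    return partially_filled.format(*args)      # fills {0} and {1}


key = string.Template('m{0}_n{1}_${iterations}_$stride.csv')
filename = expand_key(key, (-100, 200), {'iterations': 4, 'stride': 2})
print(filename)  # m-100_n200_4_2.csv
```

Under this assumption, each distinct combination of arguments maps to its own readable file name, which is what makes the cached results easy to find by hand.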

An important feature is the ability to define your own serializers and deserializers so that checkpoints are human readable. For instance, you can use numpy/scipy utilities to save matrices, or write CSV files that are easy to inspect while debugging.

import string
import time

from ediblepickle import checkpoint


def my_pickler(integers, f):
    # Write one integer per line so the checkpoint is readable in a text editor.
    print(integers)
    for i in integers:
        f.write(str(i))
        f.write('\n')


def my_unpickler(f):
    # Parse one integer per line; split() skips the trailing blank line.
    return [int(line) for line in f.read().split()]


@checkpoint(key=string.Template('m{0}_n{1}_${iterations}_$stride.csv'),
            pickler=my_pickler,
            unpickler=my_unpickler,
            refresh=False)
def expensive_computation(m, n, iterations=4, stride=1):
    for i in range(iterations):
        time.sleep(1)
    return range(m, n, stride)


# First call: computes the result and writes it with my_pickler.
begin = time.time()
print(expensive_computation(-100, 200, iterations=4, stride=2))
time_taken = time.time() - begin

print(time_taken)

# Second call: reads the result back with my_unpickler.
begin = time.time()
print(expensive_computation(-100, 200, iterations=4, stride=2))
time_taken = time.time() - begin

print(time_taken)
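
For tabular results, the same pickler(obj, f) and unpickler(f) shapes can be filled in with the standard csv module. This is a sketch only: csv_pickler and csv_unpickler are hypothetical names, the data is assumed to be rows of integers, and the file handle is assumed to be opened in text mode. The round trip is shown against an in-memory buffer rather than through the decorator.

```python
import csv
import io


def csv_pickler(rows, f):
    """Write an iterable of rows to the open text file f as CSV."""
    writer = csv.writer(f, lineterminator='\n')
    writer.writerows(rows)


def csv_unpickler(f):
    """Read the rows back from f, converting every field to int."""
    return [[int(field) for field in row] for row in csv.reader(f)]


# Round trip through an in-memory text buffer standing in for the checkpoint file.
buffer = io.StringIO()
csv_pickler([[1, 2, 3], [4, 5, 6]], buffer)
text = buffer.getvalue()

buffer.seek(0)
restored = csv_unpickler(buffer)
print(restored)  # [[1, 2, 3], [4, 5, 6]]
```

Checking the round trip in isolation like this is a cheap way to confirm that the unpickler returns the same values and types the pickler was given before wiring the pair into a checkpoint.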

Features

  • Generic Decorator API

  • Checkpoint expensive functions to avoid re-computing while developing programs with several intermediate steps

  • A computation cache for Humans (possible to use human readable keys and serialized data, as opposed to only machine-readable pickle)

  • Specify refresh to flush the cache, and recompute

  • Specify your own serialize/de-serialize (save/load) functions

  • Logging: define your own logger to activate logging
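
The caching and refresh behaviour listed above can be illustrated with a small standalone decorator. This is a simplified sketch of the general technique, not ediblepickle's implementation: simple_checkpoint is a hypothetical name, it uses a fixed file name instead of a key template, and it always serializes with pickle.

```python
import functools
import os
import pickle
import tempfile


def simple_checkpoint(filename, work_dir, refresh=False):
    """Hypothetical decorator: cache a function's result in work_dir/filename.

    The file name is fixed, so the arguments are ignored when looking up
    the cache; a real key would encode them.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            path = os.path.join(work_dir, filename)
            # Reuse the saved result unless a refresh was requested.
            if os.path.exists(path) and not refresh:
                with open(path, 'rb') as f:
                    return pickle.load(f)
            # Otherwise compute, save, and return.
            result = func(*args, **kwargs)
            with open(path, 'wb') as f:
                pickle.dump(result, f)
            return result
        return wrapper
    return decorator


calls = []  # records every real evaluation of the function
work_dir = tempfile.mkdtemp()


@simple_checkpoint('squares.pkl', work_dir)
def squares(n):
    calls.append(n)
    return [i * i for i in range(n)]


first = squares(5)   # computed and saved
second = squares(5)  # loaded from the saved file
print(first == second, len(calls))  # True 1
```

Passing refresh=True in this sketch skips the lookup, so the function runs again and overwrites the saved file.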

Installation

To install ediblepickle, simply:

$ pip install ediblepickle

Or:

$ easy_install ediblepickle

Examples

Contribute

  1. Check for open issues or open a fresh issue to start a discussion around a feature idea or a bug.

  2. Fork the repository on GitHub to start making your changes to the master branch (or branch off of it).

  3. Write a test which shows that the bug was fixed or that the feature works as expected.

  4. Send a pull request and bug the maintainer until it gets merged and published. :) Make sure to add yourself to AUTHORS.
