Persistent caching for Python functions

marinate

marinate caches function calls on disk, so you can memoize operations across runs of your code. Even if your program terminates, you can run it again without re-invoking slow or expensive functions.

from marinate import marinate

@marinate
def my_slow_fn(input):
    # Function that's slow or expensive
    ...

In-memory (transient) caching is also supported.

Features:

  • Uses pickle to store function outputs locally
  • Supports functions with mutable or un-hashable arguments (e.g. dicts, lists, numpy arrays)
  • Thread-safe
  • Supports asynchronous functions
  • Control the cache lifetime and location
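One common way to support un-hashable arguments is to pickle them and hash the resulting bytes. The sketch below illustrates that idea using only the standard library. It is not marinate's actual implementation, and `make_cache_key` is a hypothetical helper introduced here for illustration.

```python
import hashlib
import pickle


def make_cache_key(func_name, args, kwargs):
    """Hypothetical helper: hash a function name plus its (possibly un-hashable) arguments."""
    # Sort keyword arguments so that f(a=1, b=2) and f(b=2, a=1) share a key.
    payload = (func_name, args, sorted(kwargs.items()))
    # Pickling turns lists and dicts into bytes, which can then be hashed.
    return hashlib.sha256(pickle.dumps(payload)).hexdigest()


# Lists and dicts are un-hashable, but they can still contribute to a key.
key_a = make_cache_key("my_fn", ([1, 2, 3], {"lr": 0.1}), {})
key_b = make_cache_key("my_fn", ([1, 2, 3], {"lr": 0.1}), {})
key_c = make_cache_key("my_fn", ([1, 2, 4], {"lr": 0.1}), {})

print(key_a == key_b)  # same arguments give the same key
print(key_a == key_c)  # different arguments give a different key
```

A caveat with this approach is that equal objects do not always pickle to identical bytes. For example, two dicts with the same items inserted in a different order compare equal but produce different keys here.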

Installation

> pip install marinate

Usage

To marinate a function, add the @marinate decorator to it:

from marinate import marinate

@marinate
def my_fn():
    # Does something
    ...

If you later modify the function, you can invalidate its cache using the overwrite parameter:

@marinate(overwrite=True)
def my_fn():
    # Does something different
    ...

This will force a function call and overwrite whatever's in the cache.
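To make the overwrite behaviour concrete, here is a minimal standard-library sketch of a disk-caching decorator with an `overwrite` flag. It only approximates the idea described above. The `disk_cache` decorator, its file naming scheme, and the `square` example are assumptions made for illustration, not marinate's internals.

```python
import functools
import hashlib
import pickle
import tempfile
from pathlib import Path


def disk_cache(cache_dir, overwrite=False):
    """Hypothetical decorator: cache a function's return values as pickle files."""
    cache_dir = Path(cache_dir)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Derive a file name from the function name and its arguments.
            payload = pickle.dumps((func.__qualname__, args, sorted(kwargs.items())))
            path = cache_dir / (hashlib.sha256(payload).hexdigest() + ".pkl")
            if path.exists() and not overwrite:
                with path.open("rb") as f:
                    return pickle.load(f)
            # Cache miss (or forced overwrite): call the function and store the result.
            result = func(*args, **kwargs)
            cache_dir.mkdir(parents=True, exist_ok=True)
            with path.open("wb") as f:
                pickle.dump(result, f)
            return result

        return wrapper

    return decorator


demo_dir = tempfile.mkdtemp()
calls = []


@disk_cache(demo_dir)
def square(x):
    calls.append(x)
    return x * x


first = square(4)   # computed and written to disk
second = square(4)  # read back from disk; the function body does not run again


@disk_cache(demo_dir, overwrite=True)
def square(x):
    calls.append(x)
    return x * x * 1.0  # modified implementation


third = square(4)   # recomputed; the stale file is overwritten
```

In this sketch the key depends only on the function name and arguments, so editing the function body does not invalidate old entries by itself. That is why an explicit overwrite switch is useful after a change.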

Cache location

By default, cached function calls are stored in the same directory as the files they're defined in. You'll find them in a folder called .marinade.

However, you can change where a function call is stored by setting the cache_dir parameter in the decorator:

@marinate(cache_dir="~/pickle_jar")
def my_fn():
    ...

After running your program, you'll see pickle (.pkl) files appear in the directory you specified.

You can also specify a cache directory for all marinated functions:

from marinate import marinate, set_cache_dir

set_cache_dir("~/pickle_jar")

@marinate
def my_fn():
    # Output will be stored in ~/pickle_jar
    ...
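If you want to check what ended up in a cache directory, a small script can list the .pkl files and load them. The sketch below assumes the files are plain pickles that can be read back with `pickle.load`, which this README does not confirm. `list_pickles` is a hypothetical helper, and a temporary directory stands in for a real location such as ~/pickle_jar.

```python
import pickle
import tempfile
from pathlib import Path


def list_pickles(cache_dir):
    """Return the .pkl files in cache_dir, sorted by name, expanding a leading '~'."""
    directory = Path(cache_dir).expanduser()
    return sorted(directory.glob("*.pkl"))


# Stand-in for a real cache directory such as ~/pickle_jar.
demo_dir = Path(tempfile.mkdtemp())
with (demo_dir / "example.pkl").open("wb") as f:
    pickle.dump({"answer": 42}, f)

found = list_pickles(demo_dir)
for path in found:
    with path.open("rb") as f:
        print(path.name, pickle.load(f))
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code.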

Disk vs. RAM

You can also use marinate to cache functions in-memory rather than on-disk. This is preferred if you only care about memoizing operations within a single run of a program, rather than across runs.

@marinate(store="memory")
def my_fn():
    # Do things
    ...

In other words, @marinate(store="memory") is a drop-in replacement for functools.cache, Python's built-in memoization decorator.
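For comparison, this is how the standard library's in-memory decorator, `functools.cache` (Python 3.9+), behaves on its own. The example uses no marinate code. `slow_square` and `calls` are illustrative names.

```python
from functools import cache

calls = []


@cache
def slow_square(x):
    calls.append(x)  # record each real invocation
    return x * x


slow_square(3)
slow_square(3)  # answered from the in-memory cache

print(len(calls))                      # number of real invocations
print(slow_square.cache_info().hits)   # number of cache hits

# functools.cache needs hashable arguments, so a list is rejected.
try:
    slow_square([1, 2])
    list_accepted = True
except TypeError:
    list_accepted = False
```

The cache lives in the process's memory, so it is empty again on the next run of the program.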

Limitations

Only certain functions can and should be marinated:

  1. Functions that return an unpickleable object, e.g. sockets or database connections, cannot be cached.
  2. Functions must be pure and deterministic: they should produce the same output given the same input and should not have side-effects.
  3. Function arguments must be hashable.
  4. Don't marinate functions that take less than a second. The disk I/O overhead will negate the benefits of caching.
  5. Not all methods in classes should be cached.
  • Be careful with mutable inputs.
  • Be careful with side-effects.

To do:

  • Ignore self when generating the cache key
  • Add global enable_cache and disable_cache functions
  • Add .disable and .enable methods to marinated functions
  • Use .picklejar instead of .marinade as the cache folder name
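Two of the limitations above can be reproduced with the standard library alone. In the sketch below, a `threading.Lock` stands in for an unpickleable return value such as a socket or database connection, and `functools.cache` stands in for any caching decorator. Neither choice reflects how marinate itself is implemented.

```python
import pickle
import threading
from functools import cache

# 1. Some objects cannot be pickled, so they cannot be written to a pickle file.
try:
    pickle.dumps(threading.Lock())
    lock_is_picklable = True
except TypeError:
    lock_is_picklable = False

# 2. A cached function's side effects happen only on the first call.
log = []


@cache
def load_and_log(name):
    log.append(name)  # side effect
    return name.upper()


load_and_log("train")
load_and_log("train")  # served from the cache; nothing is appended to log

print(lock_is_picklable)
print(log)
```

The second call returns the right value, but any code relying on the log entry would silently see it only once.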

Authors

Created by Paul Bogdan and Jonathan Shobrook to make our lives easier when iterating on data/training pipelines.
