A Python library for memoizing function results with support for multiple storage backends, async runtimes, and automatic cache invalidation

checkpointer · Python 3.12

checkpointer is a Python library for memoizing function results. It provides a decorator-based API with support for multiple storage backends. Use it for computationally expensive operations where caching can save time, or during development to avoid waiting for redundant computations.

Adding or removing @checkpoint doesn't change what your code returns. It can be applied to any function, including ones you've already written, and the original function remains unchanged and can still be called directly when needed.

Key Features:

  • 🗂️ Multiple Storage Backends: Built-in support for in-memory and pickle-based storage, or create your own.
  • 🎯 Simple Decorator API: Apply @checkpoint to functions without boilerplate.
  • 🔄 Async and Sync Compatibility: Works with synchronous functions and any Python async runtime (e.g., asyncio, Trio, Curio).
  • ⏲️ Custom Expiration Logic: Automatically invalidate old checkpoints.
  • 📂 Flexible Path Configuration: Control where checkpoints are stored.

Installation

pip install checkpointer

Quick Start 🚀

from checkpointer import checkpoint

@checkpoint
def expensive_function(x: int) -> int:
    print("Computing...")
    return x ** 2

result = expensive_function(4)  # Computes and stores the result
result = expensive_function(4)  # Loads from the cache

How It Works

When you use @checkpoint, the function's arguments (args, kwargs) are hashed to create a unique identifier for each call. This identifier is used to store and retrieve cached results. If the same arguments are passed again, checkpointer loads the cached result instead of recomputing.
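The idea of keying a cache on hashed arguments can be sketched with the standard library alone. This is not checkpointer's implementation: memoize_by_args, its in-memory dict, and the choice of pickle plus SHA-256 for the call identifier are all assumptions made for illustration.

```python
import functools
import hashlib
import pickle


def memoize_by_args(fn):
    """Illustrative stand-in for @checkpoint: cache results keyed by hashed arguments."""
    cache = {}

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        # Sort kwargs so f(a=1, b=2) and f(b=2, a=1) map to the same identifier.
        payload = pickle.dumps((args, sorted(kwargs.items())))
        call_id = hashlib.sha256(payload).hexdigest()
        if call_id not in cache:
            cache[call_id] = fn(*args, **kwargs)
        return cache[call_id]

    wrapper.cache = cache  # exposed only so the sketch can be inspected
    return wrapper


calls = []


@memoize_by_args
def square(x):
    calls.append(x)  # record each real computation
    return x ** 2


square(4)  # computes and stores the result
square(4)  # served from the cache, square's body does not run again
```

After both calls, calls holds a single entry, which shows that the second call never reached the function body.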

Additionally, checkpointer ensures that caches are invalidated when a function's implementation or any of its dependencies change. Each function is assigned a hash based on:

  1. Its source code: Changes to the function's code update its hash.
  2. Dependent functions: If a function calls others, changes in those dependencies will also update the hash.
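The two ingredients above can be illustrated with a small fingerprint function. This sketch is an assumption about how such a hash might be built, not the library's code: it swaps source text for compiled bytecode (so it runs even where no source file is available) and only follows dependencies that are plain functions in the same module's globals. The name function_fingerprint is hypothetical.

```python
import hashlib
import types


def function_fingerprint(fn, _seen=None):
    """Hash a function's bytecode together with the functions it references."""
    seen = set() if _seen is None else _seen
    if fn in seen:  # guard against mutual recursion
        return ""
    seen.add(fn)

    code = fn.__code__
    digest = hashlib.sha256()
    digest.update(code.co_code)
    # Skip nested code objects: their repr contains memory addresses.
    consts = [c for c in code.co_consts if not isinstance(c, types.CodeType)]
    digest.update(repr(consts).encode())
    digest.update(repr(code.co_names).encode())

    # Fold in every global function that this function refers to by name.
    for name in code.co_names:
        dependency = fn.__globals__.get(name)
        if isinstance(dependency, types.FunctionType):
            digest.update(function_fingerprint(dependency, seen).encode())
    return digest.hexdigest()


def multiply(a, b):
    return a * b


def helper(x):
    return multiply(x + 1, 2)


before = function_fingerprint(helper)


def multiply(a, b):  # the dependency is edited; helper itself is untouched
    return a * b + 0


after = function_fingerprint(helper)
```

Here before and after differ even though helper was not edited, which is the property that lets a cache detect a changed dependency.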

Example: Cache Invalidation

from checkpointer import checkpoint

def multiply(a, b):
    return a * b

@checkpoint
def helper(x):
    return multiply(x + 1, 2)

@checkpoint
def compute(a, b):
    return helper(a) + helper(b)

If you modify multiply, the cached results for both helper and compute are invalidated, and both functions recompute on their next call.


Parameterization

Custom Configuration

Create a Checkpointer instance with custom settings, then derive variants by calling it with overrides:

from checkpointer import checkpoint

IS_DEVELOPMENT = True  # Toggle based on your environment

tmp_checkpoint = checkpoint(root_path="/tmp/checkpoints")
dev_checkpoint = tmp_checkpoint(when=IS_DEVELOPMENT)  # Same settings, but caching is enabled only in development

Per-Function Customization & Layered Caching

Layer caches by stacking checkpoints:

@checkpoint(format="memory")  # Always use memory storage
@dev_checkpoint  # Adds caching during development
def some_expensive_function():
    print("Performing a time-consuming operation...")
    return sum(i * i for i in range(10**6))

  • In development: Both dev_checkpoint and memory caches are active.
  • In production: Only the memory cache is active.

Usage

Force Recalculation

Force a recalculation and overwrite the stored checkpoint:

result = expensive_function.rerun(4)

Call the Original Function

Use fn to directly call the original, undecorated function:

result = expensive_function.fn(4)

This is especially useful inside recursive functions to avoid redundant caching of intermediate steps while still caching the final result.
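The recursive pattern can be sketched as follows. To keep the example runnable without the library, memo is a hypothetical stand-in decorator that mimics only one detail described above: it exposes the undecorated function as .fn. With checkpointer installed, the same recursion through .fn would presumably apply to a function decorated with @checkpoint.

```python
import functools


def memo(fn):
    """Stand-in decorator (not checkpointer) that exposes the original as .fn."""
    cache = {}

    @functools.wraps(fn)
    def wrapper(*args):
        if args not in cache:
            cache[args] = fn(*args)
        return cache[args]

    wrapper.fn = fn        # the original, undecorated function
    wrapper.cache = cache  # exposed only so the sketch can be inspected
    return wrapper


@memo
def factorial(n):
    # Recurse through .fn so intermediate values bypass the cache.
    return 1 if n <= 1 else n * factorial.fn(n - 1)


factorial(5)
```

Only the outermost call is stored: factorial.cache contains a single entry for the argument 5, and none for 4, 3, 2, or 1.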

Retrieve Stored Checkpoints

Access cached results without recalculating:

stored_result = expensive_function.get(4)

Storage Backends

checkpointer works with both built-in and custom storage backends, so you can use what's provided or roll your own as needed.

Built-In Backends

  1. PickleStorage: Stores checkpoints on disk using Python's pickle.
  2. MemoryStorage: Keeps checkpoints in memory for non-persistent, fast caching.

You can specify a storage backend using either its name ("pickle" or "memory") or its corresponding class (PickleStorage or MemoryStorage) in the format parameter:

from checkpointer import checkpoint, PickleStorage, MemoryStorage

@checkpoint(format="pickle")  # Equivalent to format=PickleStorage
def disk_cached(x: int) -> int:
    return x ** 2

@checkpoint(format="memory")  # Equivalent to format=MemoryStorage
def memory_cached(x: int) -> int:
    return x * 10

Custom Storage Backends

Create a custom storage backend by inheriting from the Storage class and implementing its methods. Access configuration options through the self.checkpointer attribute, an instance of Checkpointer.

Example: Custom Storage Backend

from checkpointer import checkpoint, Storage
from datetime import datetime

class CustomStorage(Storage):
    def exists(self, path) -> bool: ...  # Check if a checkpoint exists at the given path
    def checkpoint_date(self, path) -> datetime: ...  # Return the date the checkpoint was created
    def store(self, path, data): ...  # Save the checkpoint data
    def load(self, path): ...  # Return the checkpoint data
    def delete(self, path): ...  # Delete the checkpoint

@checkpoint(format=CustomStorage)
def custom_cached(x: int):
    return x ** 2

Using a custom backend lets you tailor storage to your application, whether it involves databases, cloud storage, or custom file formats.
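As one concrete possibility, the five methods could be filled in for JSON files on disk. This is a sketch under assumptions: JsonFileStorage is a hypothetical name, it is written as a plain class so it runs without the library (a real backend would inherit from Storage as shown above), and it assumes that path is a file-system-like path without an extension and that the cached data is JSON-serializable.

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path


class JsonFileStorage:
    """Same five methods as the Storage example; a real backend would subclass Storage."""

    def _file(self, path) -> Path:
        # Assumption: `path` identifies one checkpoint and carries no extension.
        return Path(str(path) + ".json")

    def exists(self, path) -> bool:
        return self._file(path).exists()

    def checkpoint_date(self, path) -> datetime:
        # Use the file's modification time as the creation date of the checkpoint.
        return datetime.fromtimestamp(self._file(path).stat().st_mtime)

    def store(self, path, data):
        target = self._file(path)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(json.dumps(data))

    def load(self, path):
        return json.loads(self._file(path).read_text())

    def delete(self, path):
        self._file(path).unlink(missing_ok=True)


# Exercise the backend directly with a throwaway directory.
storage = JsonFileStorage()
demo_path = Path(tempfile.mkdtemp()) / "custom_cached" / "abc123"
storage.store(demo_path, {"result": 16})
```

JSON keeps the stored files human-readable, at the cost of supporting fewer Python types than pickle.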


Configuration Options ⚙️

| Option | Type | Default | Description |
|---|---|---|---|
| format | "pickle", "memory", Storage | "pickle" | Storage backend format. |
| root_path | Path, str, or None | User Cache | Root directory for storing checkpoints. |
| when | bool | True | Enable or disable checkpointing. |
| verbosity | 0 or 1 | 1 | Logging verbosity. |
| path | Callable[..., str] | None | Custom path for checkpoint storage. |
| should_expire | Callable[[datetime], bool] | None | Custom expiration logic. |
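A should_expire callback matching the Callable[[datetime], bool] signature might look like the sketch below. The one-day window and the name older_than_max_age are hypothetical, and the sketch assumes the callback receives the checkpoint's creation time as a naive local datetime and that returning True marks the checkpoint as expired.

```python
from datetime import datetime, timedelta

MAX_AGE = timedelta(days=1)  # hypothetical retention window


def older_than_max_age(created: datetime) -> bool:
    """Return True when a checkpoint created at `created` is older than MAX_AGE."""
    return datetime.now() - created > MAX_AGE


# A checkpoint from two days ago is stale; one created just now is not.
stale = older_than_max_age(datetime.now() - timedelta(days=2))
fresh = older_than_max_age(datetime.now())
```

Going by the should_expire option listed above, such a function would be passed as checkpoint(should_expire=older_than_max_age).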

Full Example 🛠️

import asyncio
from checkpointer import checkpoint

@checkpoint
def compute_square(n: int) -> int:
    print(f"Computing {n}^2...")
    return n ** 2

@checkpoint(format="memory")
async def async_compute_sum(a: int, b: int) -> int:
    await asyncio.sleep(1)
    return a + b

async def main():
    result1 = compute_square(5)
    print(result1)

    result2 = await async_compute_sum(3, 7)
    print(result2)

    result3 = async_compute_sum.get(3, 7)
    print(result3)

asyncio.run(main())
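One way a single decorator can serve both kinds of function, as the example above relies on, is to branch on whether the wrapped function is a coroutine function. The sketch below is a standard-library stand-in, not checkpointer's implementation, and memo_any is a hypothetical name. The async wrapper only uses await, so nothing in it is tied to a particular event loop; asyncio appears only to drive the demo.

```python
import asyncio
import functools
import inspect


def memo_any(fn):
    """Stand-in decorator that caches results of sync and async functions."""
    cache = {}

    if inspect.iscoroutinefunction(fn):
        @functools.wraps(fn)
        async def async_wrapper(*args):
            if args not in cache:
                cache[args] = await fn(*args)  # store the awaited value, not the coroutine
            return cache[args]
        return async_wrapper

    @functools.wraps(fn)
    def sync_wrapper(*args):
        if args not in cache:
            cache[args] = fn(*args)
        return cache[args]
    return sync_wrapper


async_calls = []


@memo_any
async def async_add(a, b):
    async_calls.append((a, b))  # record each real computation
    await asyncio.sleep(0)
    return a + b


@memo_any
def square(n):
    return n ** 2


async def demo():
    first = await async_add(3, 7)   # computes
    second = await async_add(3, 7)  # served from the cache
    return first, second


results = asyncio.run(demo())
```

Caching the awaited value rather than the coroutine object matters here, because a coroutine can only be awaited once.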
