
A Python library for memoizing function results with support for multiple storage backends, async runtimes, and automatic cache invalidation


checkpointer · License pypi Python 3.12

checkpointer is a Python library for memoizing function results. It simplifies caching by providing a decorator-based API and supports various storage backends. It's designed for computationally expensive operations where caching can save time, or during development to avoid waiting for redundant computations. ⚡️

Adding or removing @checkpoint doesn't change how your code works. It can be applied to any function, including ones you've already written, without altering its behavior or introducing side effects. The original function remains unchanged and can still be called directly when needed.

Key Features:

  • 🗂️ Multiple Storage Backends: Supports in-memory, pickle, or your own custom storage.
  • 🎯 Simple Decorator API: Apply @checkpoint to functions.
  • 🔄 Async and Sync Compatibility: Works with synchronous functions and any Python async runtime (e.g., asyncio, Trio, Curio).
  • ⏲️ Custom Expiration Logic: Automatically invalidate old checkpoints.
  • 📂 Flexible Path Configuration: Control where checkpoints are stored.

Installation

pip install checkpointer

Quick Start 🚀

from checkpointer import checkpoint

@checkpoint
def expensive_function(x: int) -> int:
    print("Computing...")
    return x ** 2

result = expensive_function(4)  # Computes and stores result
result = expensive_function(4)  # Loads from checkpoint

How It Works

When you use @checkpoint, the function's arguments (args, kwargs) are hashed to create a unique identifier for each call. This identifier is used to store and retrieve cached results. If the same arguments are passed again, checkpointer will return the cached result instead of recomputing.
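The idea of hashing call arguments into a cache key can be sketched with the standard library alone. This is a minimal illustration, not checkpointer's actual implementation: the names call_key and simple_checkpoint are hypothetical, and the use of pickle plus SHA-256 for hashing is an assumption.

```python
import hashlib
import pickle
from functools import wraps


def call_key(args: tuple, kwargs: dict) -> str:
    """Derive an identifier from a call's arguments (hypothetical helper)."""
    # Sort keyword arguments so f(a=1, b=2) and f(b=2, a=1) share a key.
    payload = pickle.dumps((args, sorted(kwargs.items())))
    return hashlib.sha256(payload).hexdigest()


def simple_checkpoint(fn):
    """Toy stand-in for @checkpoint that keeps results in a dict."""
    cache = {}

    @wraps(fn)
    def wrapper(*args, **kwargs):
        key = call_key(args, kwargs)
        if key not in cache:
            cache[key] = fn(*args, **kwargs)  # compute once per key
        return cache[key]

    return wrapper


calls = []  # records every real computation


@simple_checkpoint
def square(x: int) -> int:
    calls.append(x)
    return x ** 2


first = square(4)   # computed
second = square(4)  # served from the dict, square's body does not run
```

After both calls, calls holds a single entry, which shows that the second call never reached the function body.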

Additionally, checkpointer ensures that caches are invalidated when a function’s implementation or any of its dependencies change. Each function is assigned a hash based on:

  1. Its source code: Changes to the function’s code update its hash.
  2. Dependent functions: If a function calls others, changes to those will also update the hash.

Example: Cache Invalidation by Function Dependencies

def multiply(a, b):
    return a * b

@checkpoint
def helper(x):
    return multiply(x + 1, 2)

@checkpoint
def compute(a, b):
    return helper(a) + helper(b)

If you change multiply, the checkpoints for both helper and compute will be invalidated and recomputed.
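One way to picture dependency-aware hashing is the sketch below. It is an assumption-laden stand-in, not the library's code: the hypothetical function_hash hashes compiled bytecode instead of source text (so it runs even where source files are unavailable) and only follows module-level functions referenced by name.

```python
import hashlib
import types


def function_hash(fn, _seen=None) -> str:
    """Hash a function together with the global functions it references."""
    seen = _seen if _seen is not None else set()
    seen.add(fn)
    code = fn.__code__
    digest = hashlib.sha256(code.co_code)
    # Include plain constants; skip nested code objects, whose repr is unstable.
    consts = [c for c in code.co_consts if not isinstance(c, types.CodeType)]
    digest.update(repr(consts).encode())
    for name in code.co_names:
        dep = fn.__globals__.get(name)
        if isinstance(dep, types.FunctionType) and dep not in seen:
            digest.update(function_hash(dep, seen).encode())
    return digest.hexdigest()


def multiply(a, b):
    return a * b


def helper(x):
    return multiply(x + 1, 2)


before = function_hash(helper)


def multiply(a, b):  # simulate editing the dependency
    return a * b + 0


after = function_hash(helper)  # differs from `before`, although helper is untouched
```

A cache keyed on such a hash would treat every stored result of helper as stale once multiply changes, which is the behavior described above.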


Parameterization

Global Configuration

You can configure a custom Checkpointer:

from checkpointer import checkpoint

checkpoint = checkpoint(format="memory", root_path="/tmp/checkpoints")

Extend this configuration by calling the configured checkpoint again:

extended_checkpoint = checkpoint(format="pickle", verbosity=0)

Per-Function Customization

@checkpoint(format="pickle", verbosity=0)
def my_function(x, y):
    return x + y

Combining Configurations

checkpoint = checkpoint(format="memory", verbosity=1)
quiet_checkpoint = checkpoint(verbosity=0)
pickle_checkpoint = checkpoint(format="pickle", root_path="/tmp/pickle_checkpoints")

@checkpoint
def compute_square(n: int) -> int:
    return n ** 2

@quiet_checkpoint
def compute_quietly(n: int) -> int:
    return n ** 3

@pickle_checkpoint
def compute_sum(a: int, b: int) -> int:
    return a + b
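The pattern of a decorator that is also a configuration factory can be sketched as follows. SketchCheckpointer is a hypothetical class written for illustration; it performs no caching and only shows how calling a configured object with options could yield a new object that inherits the remaining settings.

```python
from dataclasses import dataclass, replace
from functools import wraps


@dataclass(frozen=True)
class SketchCheckpointer:
    format: str = "pickle"
    verbosity: int = 1

    def __call__(self, fn=None, **overrides):
        if fn is None:
            # Options only: return a new configuration; unset options are inherited.
            return replace(self, **overrides)

        # A function was passed: wrap it and remember which configuration applies.
        @wraps(fn)
        def wrapper(*args, **kwargs):
            return fn(*args, **kwargs)

        wrapper.config = self
        return wrapper


base = SketchCheckpointer()
memory = base(format="memory")      # overrides format, keeps verbosity=1
quiet_memory = memory(verbosity=0)  # keeps format="memory"


@quiet_memory
def add(a: int, b: int) -> int:
    return a + b
```

Because the dataclass is frozen, each call produces a fresh configuration and base itself is never modified.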

Layered Caching

IS_DEVELOPMENT = True  # Toggle based on environment

dev_checkpoint = checkpoint(when=IS_DEVELOPMENT)

@checkpoint(format="memory")
@dev_checkpoint
def some_expensive_function():
    print("Performing a time-consuming operation...")
    return sum(i * i for i in range(10**6))

  • In development: Both dev_checkpoint and memory caches are active.
  • In production: Only the memory cache is active.
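The effect of a when flag can be illustrated with a toy decorator. This sketch assumes that a disabled layer simply hands back the original function; whether checkpointer does exactly that internally is not stated here, and sketch_checkpoint is a hypothetical name.

```python
from functools import wraps


def sketch_checkpoint(when: bool = True):
    """Toy decorator factory: caches only when `when` is true."""
    def decorator(fn):
        if not when:
            return fn  # layer disabled: calls go straight to the function
        cache = {}

        @wraps(fn)
        def wrapper(*args):
            if args not in cache:
                cache[args] = fn(*args)
            return cache[args]

        return wrapper
    return decorator


runs = {"cached": 0, "uncached": 0}


@sketch_checkpoint(when=True)
def cached_job(n: int) -> int:
    runs["cached"] += 1
    return n * n


@sketch_checkpoint(when=False)
def uncached_job(n: int) -> int:
    runs["uncached"] += 1
    return n * n


for _ in range(3):
    cached_job(7)
    uncached_job(7)
```

After the loop, the enabled layer has run its function once and the disabled layer three times.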

Usage

Force Recalculation

Use rerun to force a recalculation and overwrite the stored checkpoint:

result = expensive_function.rerun(4)

Bypass Checkpointer

Use fn to directly call the original, undecorated function:

result = expensive_function.fn(4)

This is especially useful inside recursive functions. By using .fn within the function itself, you avoid redundant caching of intermediate recursive calls while still caching the final result at the top level.
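The recursion pattern can be demonstrated with a stand-in decorator that exposes the undecorated function as .fn, mirroring the attribute described above. The decorator itself (sketch_checkpoint) and its .cache attribute are hypothetical and exist only so the example runs without checkpointer installed.

```python
from functools import wraps


def sketch_checkpoint(fn):
    """Toy in-memory cache that exposes the original function as .fn."""
    cache = {}

    @wraps(fn)
    def wrapper(*args):
        if args not in cache:
            cache[args] = fn(*args)
        return cache[args]

    wrapper.fn = fn        # undecorated function
    wrapper.cache = cache  # exposed for inspection in this sketch only
    return wrapper


@sketch_checkpoint
def factorial(n: int) -> int:
    # Recurse through .fn so intermediate values are not cached individually.
    return 1 if n <= 1 else n * factorial.fn(n - 1)


result = factorial(5)
```

Only the top-level call factorial(5) ends up in the cache; the calls for 4, 3, 2, and 1 bypass it.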

Retrieve Stored Checkpoints

Access stored results without recalculating:

stored_result = expensive_function.get(4)

Storage Backends

checkpointer supports flexible storage backends, including built-in options and custom implementations.

Built-In Backends

  1. PickleStorage: Saves checkpoints to disk using Python's pickle module.
  2. MemoryStorage: Caches checkpoints in memory for fast, non-persistent use.

To use these backends, pass either "pickle" or PickleStorage (and similarly for "memory" or MemoryStorage) to the format parameter:

from checkpointer import checkpoint, PickleStorage, MemoryStorage

@checkpoint(format="pickle")  # Equivalent to format=PickleStorage
def disk_cached(x: int) -> int:
    return x ** 2

@checkpoint(format="memory")  # Equivalent to format=MemoryStorage
def memory_cached(x: int) -> int:
    return x * 10

Custom Storage Backends

Create custom storage backends by implementing methods for storing, loading, and managing checkpoints. For example, a custom storage backend might use a database, cloud storage, or a specialized format.

Example usage:

from checkpointer import checkpoint, Storage
from typing import Any
from pathlib import Path
from datetime import datetime

class CustomStorage(Storage):  # Optional for type hinting
    @staticmethod
    def exists(path: Path) -> bool: ...
    @staticmethod
    def checkpoint_date(path: Path) -> datetime: ...
    @staticmethod
    def store(path: Path, data: Any) -> None: ...
    @staticmethod
    def load(path: Path) -> Any: ...
    @staticmethod
    def delete(path: Path) -> None: ...

@checkpoint(format=CustomStorage)
def custom_cached(x: int):
    return x ** 2

This flexibility allows you to adapt checkpointer to meet any storage requirement, whether persistent or in-memory.
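As an illustration, the five methods could be filled in for a JSON-on-disk backend roughly as follows. This is a sketch under assumptions: JsonStorage is a hypothetical name, the class does not subclass Storage (the example above marks that as optional), and it assumes the path argument is a file path without a suffix. It only handles JSON-serializable results.

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path
from typing import Any


class JsonStorage:
    """Hypothetical backend that stores each checkpoint as a .json file."""

    @staticmethod
    def _file(path: Path) -> Path:
        # Assumption: `path` has no suffix, so one is appended here.
        return path.with_name(path.name + ".json")

    @staticmethod
    def exists(path: Path) -> bool:
        return JsonStorage._file(path).exists()

    @staticmethod
    def checkpoint_date(path: Path) -> datetime:
        return datetime.fromtimestamp(JsonStorage._file(path).stat().st_mtime)

    @staticmethod
    def store(path: Path, data: Any) -> None:
        target = JsonStorage._file(path)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(json.dumps(data))

    @staticmethod
    def load(path: Path) -> Any:
        return json.loads(JsonStorage._file(path).read_text())

    @staticmethod
    def delete(path: Path) -> None:
        JsonStorage._file(path).unlink(missing_ok=True)


# Exercise the backend directly, without checkpointer, in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    key = Path(tmp) / "demo" / "abc123"
    JsonStorage.store(key, {"x": 4, "result": 16})
    existed = JsonStorage.exists(key)
    loaded = JsonStorage.load(key)
    stored_at = JsonStorage.checkpoint_date(key)
    JsonStorage.delete(key)
    gone = not JsonStorage.exists(key)
```

Such a class would then be passed as format=JsonStorage in the same way as CustomStorage above.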


Configuration Options ⚙️

Option        | Type                        | Default    | Description
format        | "pickle", "memory", Storage | "pickle"   | Storage backend format.
root_path     | Path, str, or None          | User Cache | Root directory for storing checkpoints.
when          | bool                        | True       | Enable or disable checkpointing.
verbosity     | 0 or 1                      | 1          | Logging verbosity.
path          | Callable[..., str]          | None       | Custom path for checkpoint storage.
should_expire | Callable[[datetime], bool]  | None       | Custom expiration logic.
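A should_expire callback matching the Callable[[datetime], bool] type could look like the sketch below. The helper older_than is hypothetical, and the sketch assumes the callback receives the checkpoint's creation time as a naive local datetime.

```python
from datetime import datetime, timedelta


def older_than(max_age: timedelta):
    """Build a predicate that flags checkpoints older than `max_age`."""
    def should_expire(checkpoint_date: datetime) -> bool:
        # Assumption: checkpoint_date is naive local time, like datetime.now().
        return datetime.now() - checkpoint_date > max_age
    return should_expire


expire_daily = older_than(timedelta(days=1))

# Intended use, per the options table (not executed here):
#   @checkpoint(should_expire=expire_daily)
#   def fetch_report(day: str): ...

stale = expire_daily(datetime.now() - timedelta(days=2))
fresh = expire_daily(datetime.now())
```

Here stale is True and fresh is False, so only the two-day-old checkpoint would be recomputed.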

Full Example 🛠️

import asyncio
from checkpointer import checkpoint

@checkpoint
def compute_square(n: int) -> int:
    print(f"Computing {n}^2...")
    return n ** 2

@checkpoint(format="memory")
async def async_compute_sum(a: int, b: int) -> int:
    await asyncio.sleep(1)
    return a + b

async def main():
    result1 = compute_square(5)
    print(result1)

    result2 = await async_compute_sum(3, 7)
    print(result2)

    result3 = async_compute_sum.get(3, 7)
    print(result3)

asyncio.run(main())
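For readers curious how one decorator can serve both kinds of function, here is a minimal sketch that is not taken from checkpointer's source. The hypothetical sketch_checkpoint inspects the function and returns an async wrapper for coroutine functions; the wrapper only awaits the wrapped call, and asyncio appears solely to drive the demo.

```python
import asyncio
import inspect
from functools import wraps


def sketch_checkpoint(fn):
    """Toy cache that wraps sync and async functions alike."""
    cache = {}

    if inspect.iscoroutinefunction(fn):
        @wraps(fn)
        async def async_wrapper(*args):
            if args not in cache:
                cache[args] = await fn(*args)  # store the awaited result
            return cache[args]
        return async_wrapper

    @wraps(fn)
    def wrapper(*args):
        if args not in cache:
            cache[args] = fn(*args)
        return cache[args]
    return wrapper


calls = []  # records every real computation


@sketch_checkpoint
async def slow_sum(a: int, b: int) -> int:
    calls.append((a, b))
    await asyncio.sleep(0)
    return a + b


@sketch_checkpoint
def square(n: int) -> int:
    calls.append((n,))
    return n ** 2


async def main():
    return [await slow_sum(3, 7), await slow_sum(3, 7), square(5), square(5)]


results = asyncio.run(main())
```

Each function body runs once, so calls ends up with two entries even though four calls were made.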
