A Python library for memoizing function results with support for multiple storage backends, async runtimes, and automatic cache invalidation
Project description
checkpointer ·
checkpointer
is a Python library for memoizing function results. It provides a decorator-based API with support for multiple storage backends. Use it for computationally expensive operations where caching can save time, or during development to avoid waiting for redundant computations.
Adding or removing @checkpoint
doesn't change how your code works, and it can be applied to any function, including ones you've already written, without altering their behavior or introducing side effects. The original function remains unchanged and can still be called directly when needed.
Key Features:
- 🗂️ Multiple Storage Backends: Built-in support for in-memory and pickle-based storage, or create your own.
- 🎯 Simple Decorator API: Apply
@checkpoint
to functions without boilerplate. - 🔄 Async and Sync Compatibility: Works with synchronous functions and any Python async runtime (e.g.,
asyncio
,Trio
,Curio
). - ⏲️ Custom Expiration Logic: Automatically invalidate old checkpoints.
- 📂 Flexible Path Configuration: Control where checkpoints are stored.
Installation
pip install checkpointer
Quick Start 🚀
from checkpointer import checkpoint
@checkpoint
def expensive_function(x: int) -> int:
print("Computing...")
return x ** 2
result = expensive_function(4) # Computes and stores the result
result = expensive_function(4) # Loads from the cache
How It Works
When you use @checkpoint
, the function's arguments (args
, kwargs
) are hashed to create a unique identifier for each call. This identifier is used to store and retrieve cached results. If the same arguments are passed again, checkpointer
loads the cached result instead of recomputing.
Additionally, checkpointer
ensures that caches are invalidated when a function's implementation or any of its dependencies change. Each function is assigned a hash based on:
- Its source code: Changes to the function's code update its hash.
- Dependent functions: If a function calls others, changes in those dependencies will also update the hash.
Example: Cache Invalidation
def multiply(a, b):
return a * b
@checkpoint
def helper(x):
return multiply(x + 1, 2)
@checkpoint
def compute(a, b):
return helper(a) + helper(b)
If you modify multiply
, caches for both helper
and compute
are invalidated and recomputed.
Parameterization
Custom Configuration
Set up a Checkpointer
instance with custom settings, and extend it by calling itself with overrides:
from checkpointer import checkpoint
IS_DEVELOPMENT = True # Toggle based on your environment
tmp_checkpoint = checkpoint(root_path="/tmp/checkpoints")
dev_checkpoint = tmp_checkpoint(when=IS_DEVELOPMENT) # Adds development-specific behavior
Per-Function Customization & Layered Caching
Layer caches by stacking checkpoints:
@checkpoint(format="memory") # Always use memory storage
@dev_checkpoint # Adds caching during development
def some_expensive_function():
print("Performing a time-consuming operation...")
return sum(i * i for i in range(10**6))
- In development: Both
dev_checkpoint
andmemory
caches are active. - In production: Only the
memory
cache is active.
Usage
Force Recalculation
Force a recalculation and overwrite the stored checkpoint:
result = expensive_function.rerun(4)
Call the Original Function
Use fn
to directly call the original, undecorated function:
result = expensive_function.fn(4)
This is especially useful inside recursive functions to avoid redundant caching of intermediate steps while still caching the final result.
Retrieve Stored Checkpoints
Access cached results without recalculating:
stored_result = expensive_function.get(4)
Storage Backends
checkpointer
works with both built-in and custom storage backends, so you can use what's provided or roll your own as needed.
Built-In Backends
- PickleStorage: Stores checkpoints on disk using Python's
pickle
. - MemoryStorage: Keeps checkpoints in memory for non-persistent, fast caching.
You can specify a storage backend using either its name ("pickle"
or "memory"
) or its corresponding class (PickleStorage
or MemoryStorage
) in the format
parameter:
from checkpointer import checkpoint, PickleStorage, MemoryStorage
@checkpoint(format="pickle") # Equivalent to format=PickleStorage
def disk_cached(x: int) -> int:
return x ** 2
@checkpoint(format="memory") # Equivalent to format=MemoryStorage
def memory_cached(x: int) -> int:
return x * 10
Custom Storage Backends
Create a custom storage backend by inheriting from the Storage
class and implementing its methods. Access configuration options through the self.checkpointer
attribute, an instance of Checkpointer
.
Example: Custom Storage Backend
from checkpointer import checkpoint, Storage
from datetime import datetime
class CustomStorage(Storage):
def exists(self, path) -> bool: ... # Check if a checkpoint exists at the given path
def checkpoint_date(self, path) -> datetime: ... # Return the date the checkpoint was created
def store(self, path, data): ... # Save the checkpoint data
def load(self, path): ... # Return the checkpoint data
def delete(self, path): ... # Delete the checkpoint
@checkpoint(format=CustomStorage)
def custom_cached(x: int):
return x ** 2
Using a custom backend lets you tailor storage to your application, whether it involves databases, cloud storage, or custom file formats.
Configuration Options ⚙️
Option | Type | Default | Description |
---|---|---|---|
format |
"pickle" , "memory" , Storage |
"pickle" |
Storage backend format. |
root_path |
Path , str , or None |
User Cache | Root directory for storing checkpoints. |
when |
bool |
True |
Enable or disable checkpointing. |
verbosity |
0 or 1 |
1 |
Logging verbosity. |
path |
Callable[..., str] |
None |
Custom path for checkpoint storage. |
should_expire |
Callable[[datetime], bool] |
None |
Custom expiration logic. |
Full Example 🛠️
import asyncio
from checkpointer import checkpoint
@checkpoint
def compute_square(n: int) -> int:
print(f"Computing {n}^2...")
return n ** 2
@checkpoint(format="memory")
async def async_compute_sum(a: int, b: int) -> int:
await asyncio.sleep(1)
return a + b
async def main():
result1 = compute_square(5)
print(result1)
result2 = await async_compute_sum(3, 7)
print(result2)
result3 = async_compute_sum.get(3, 7)
print(result3)
asyncio.run(main())
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file checkpointer-2.0.2.tar.gz
.
File metadata
- Download URL: checkpointer-2.0.2.tar.gz
- Upload date:
- Size: 8.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.5
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | fd0d246df02d0217254577d5cd13f08b0e4a721c1b91a1455cd52cc4bebb9773 |
|
MD5 | ff940cf15988377ec35429302bbb0cfd |
|
BLAKE2b-256 | 9ff5b8fda0b8724202982e08da5378d6fcb09a1101fede7b116d8f33d0985888 |
File details
Details for the file checkpointer-2.0.2-py3-none-any.whl
.
File metadata
- Download URL: checkpointer-2.0.2-py3-none-any.whl
- Upload date:
- Size: 11.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.5
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 25769107b8b91c40a83bebedbf341a33de6528bdb68cb09316ce3c032162970a |
|
MD5 | 3d0c357fcb604105ff2d1123d1552345 |
|
BLAKE2b-256 | 8c148fdb40ad7addc59f15a5df20cfa7b078c57bc51cd158f3ed185dd250594f |