# tensordict-cache

An on-disk cache for prompt embeddings and other tensor data: a persistent, memory-mapped store built on TensorDict.
Store and retrieve TensorDict objects on disk using memory-mapped files. Cached entries survive process restarts and are loaded lazily without copying data into RAM.
## Installation

```bash
pip install tensordict-cache
```
## Quick start

```python
import torch
from tensordict import TensorDict
from tensordict_cache import TensorCache

# Create a cache (the directory is created if it doesn't exist)
cache = TensorCache("./my_cache")

# Store embeddings keyed by prompt text
cache["hello world"] = TensorDict(
    {"embedding": torch.randn(768)},
    batch_size=[],
)

# Retrieve them
embedding = cache["hello world"]["embedding"]
```
## Usage

### Creating a cache

```python
from tensordict_cache import TensorCache

# Open or create a cache directory
cache = TensorCache("/path/to/cache")

# Open without loading existing entries
cache = TensorCache("/path/to/cache", load_existing=False)
```
### Storing entries

Keys can be strings or integers. Values must be TensorDict instances.

```python
import torch
from tensordict import TensorDict

td = TensorDict({
    "hidden_state": torch.randn(512),
    "logits": torch.randn(10),
}, batch_size=[])

cache["my_prompt"] = td
cache[42] = td  # integer keys work too
```
### Retrieving entries

```python
# Dict-style access (raises KeyError if missing)
result = cache["my_prompt"]

# Safe access with a default
result = cache.get("my_prompt")           # returns None if missing
result = cache.get("my_prompt", default)  # returns default if missing
```
### Checking membership and length

```python
if "my_prompt" in cache:
    print("Hit!")

print(f"Cache has {len(cache)} entries")
```
### Listing keys and inspecting the cache

```python
# Keys are SHA-256 hashes of the original key
print(cache.keys())

# Human-readable representation
print(cache)
# TensorCache(prefix=/path/to/cache, n_cache=3)
```
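The hashed names can be reproduced with Python's `hashlib`. A minimal sketch of the likely scheme — the exact key encoding TensorCache uses (for example, how integer keys are stringified) is an assumption here:

```python
import hashlib


def key_to_basename(key) -> str:
    """Hash a cache key to a filesystem-safe directory name.

    Assumption: keys are converted to their string form and
    UTF-8 encoded before hashing.
    """
    return hashlib.sha256(str(key).encode("utf-8")).hexdigest()


print(key_to_basename("hello world"))
# b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9
```

Hex digests are safe to use as directory names on every common filesystem, which is presumably why hashed rather than raw keys appear on disk.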
### Deleting entries

```python
# Remove a single entry from memory and disk;
# raises KeyError if the key doesn't exist
del cache["my_prompt"]
```
### Size-limited cache

```python
# Cap total disk usage at 1 GB; oldest entries are evicted first
cache = TensorCache("./my_cache", max_size_bytes=1_000_000_000)

cache["a"] = TensorDict({"x": torch.randn(768)}, batch_size=[])
cache["b"] = TensorDict({"x": torch.randn(768)}, batch_size=[])
# When a new insert pushes the total over the limit, the oldest
# entry is removed automatically
```
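The eviction behavior can be modeled with a plain insertion-ordered dict. This is a simplified sketch of oldest-first eviction under a byte budget, not TensorCache's actual implementation; the per-entry size accounting is a stand-in:

```python
class SizeCappedCache:
    """Toy model of oldest-first eviction under a byte budget."""

    def __init__(self, max_size_bytes: int):
        self.max_size_bytes = max_size_bytes
        # key -> size in bytes; dicts preserve insertion order in Python 3.7+
        self._entries: dict[str, int] = {}

    def put(self, key: str, size_bytes: int) -> None:
        self._entries[key] = size_bytes
        # Evict the oldest entries until the total fits the budget.
        while sum(self._entries.values()) > self.max_size_bytes:
            oldest = next(iter(self._entries))
            del self._entries[oldest]


cache = SizeCappedCache(max_size_bytes=100)
cache.put("a", 60)
cache.put("b", 60)           # total would be 120, so "a" is evicted
print(list(cache._entries))  # ['b']
```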
### Clearing the cache

```python
# Remove all entries from memory and disk
cache.clear()
```
### Persistence across sessions

Data is written to disk as memory-mapped files. Reopening the same directory automatically loads all previously stored entries:

```python
# Session 1
cache = TensorCache("./my_cache")
cache["prompt_a"] = TensorDict({"v": torch.tensor(1.0)}, batch_size=[])

# Session 2 (new process)
cache = TensorCache("./my_cache")
assert "prompt_a" in cache  # still there
```
## How it works

Each entry is stored in a subdirectory under the cache prefix, named after the SHA-256 hash of its key:

```
my_cache/
  a1b2c3d4.../      # SHA-256 hash of the key
    meta.json
    *.memmap
  e5f60718.../
    meta.json
    *.memmap
```

Under the hood, TensorCache calls `TensorDict.memmap()` to write each entry and `TensorDict.load_memmap()` to read it back. Memory-mapped storage means tensor data is not read into RAM until it is accessed, keeping memory usage low even for large caches.
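The lazy-loading behavior comes from memory-mapping itself rather than anything TensorCache-specific. A stdlib-only illustration using Python's `mmap` module (no TensorDict involved):

```python
import mmap
import os
import tempfile

# Write 1 MiB of data to a file.
path = os.path.join(tempfile.mkdtemp(), "blob.bin")
with open(path, "wb") as f:
    f.write(b"\x01" * (1024 * 1024))

# Mapping the file reserves address space but reads no data up front;
# pages are faulted in from disk only when the bytes are touched.
with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    first = mm[0]  # touching a byte pages in only that region
    print(first)   # 1
    mm.close()
```

The same principle applies to the `*.memmap` files above: opening the cache is cheap, and disk reads happen only for the tensors you actually index into.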
## API reference

| Method | Description |
|---|---|
| `TensorCache(prefix, load_existing=True, max_size_bytes=None)` | Create or open a cache at `prefix` |
| `cache[key] = td` | Store a TensorDict under `key` |
| `cache[key]` | Retrieve a TensorDict (raises `KeyError` if missing) |
| `del cache[key]` | Remove entry from memory and disk (raises `KeyError` if missing) |
| `cache.get(key, default=None)` | Retrieve or return `default` |
| `key in cache` | Check whether `key` exists |
| `len(cache)` | Number of cached entries |
| `cache.keys()` | List of hashed key names |
| `cache.clear()` | Remove all entries from memory and disk |
| `cache.get_cache_size()` | Total cache size in bytes |
| `cache.get_cache_size_human()` | Cache size as a human-readable string |
| `cache.key_to_basename(key)` | Get the hashed filename for a key |
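As an illustration of what `get_cache_size_human()` might return, here is a plausible stand-alone formatter; the unit breakpoints (binary, 1024-based) and one-decimal rounding are assumptions, not the library's confirmed behavior:

```python
def human_size(num_bytes: float) -> str:
    """Format a byte count with binary (1024-based) units."""
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if num_bytes < 1024 or unit == "TiB":
            return f"{num_bytes:.1f} {unit}"
        num_bytes /= 1024


print(human_size(1_000_000_000))  # 953.7 MiB
print(human_size(512))            # 512.0 B
```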
## Requirements

- Python >= 3.10
- tensordict >= 0.11.0

## License

MIT