Theine
High performance in-memory cache inspired by Caffeine.
- High performance Rust core
- High hit ratio with adaptive W-TinyLFU eviction policy
- Expired entries are removed automatically using a hierarchical timer wheel
- Fully typed
- Thread-safe with Free-Threading support (tested on Python 3.13.6t)
- Django cache backend
Theine V2 Migration Guide
Theine V2 is a major refactor and rewrite of V1, focused on thread safety and scalability. Below are the key changes:
Cache Class and Memoize Decorator
In V2, the Cache class and the Memoize decorator accept capacity as their first parameter. The design has been simplified by consolidating to a single policy: adaptive W-TinyLFU.
Old:
cache = Cache("tlfu", 10000)
@Memoize(Cache("tlfu", 10000), timedelta(seconds=100))
...
New:
cache = Cache(10000)
@Memoize(10000, timedelta(seconds=100))
...
Making Cache.get return explicit existence status
The Cache.get method now returns a tuple (value, exists) instead of only the value. This makes it explicit whether the requested key was found in the cache. In the old API, a missing key returned None (or a provided default value), which could cause ambiguity if None was also a valid cached value.
Old:
# Without default, returns None if key is missing
v = cache.get("key")
# With default, returns default if key is missing
sentinel = object()
v = cache.get("key", sentinel)
New:
# Returns (value, exists), where exists is True if the key was found
v, ok = cache.get("key")
if ok:
    ...
Renaming timeout to ttl
The timeout parameter of the Cache class's set method has been renamed to ttl (time-to-live), which is the more common and clearer term in caching. In V1, timeout was used for consistency with Django; in V2, ttl is the preferred name. The Django adapter settings still use TIMEOUT for compatibility.
Old:
cache.set("key", {"foo": "bar"}, timeout=timedelta(seconds=100))
@Memoize(Cache("tlfu", 10000), timeout=timedelta(seconds=100))
...
New:
cache.set("key", {"foo": "bar"}, ttl=timedelta(seconds=100))
@Memoize(10000, ttl=timedelta(seconds=100))
Thread Safety by Default
In V2, both the Cache class and the Memoize decorator are thread-safe by default. If you are not using Theine in a multi-threaded environment, you can disable the locking mechanism. On free-threaded Python builds, however, nolock is always treated as False, even if set to True: Theine internally uses an extra thread for proactive expiry, so at least two threads are always active and locking must remain enabled.
cache = Cache(10000, nolock=True)
@Memoize(10000, timedelta(seconds=100), nolock=True)
...
In V1, a lock parameter was used to prevent cache stampede. In V2, this protection is enabled by default, so the lock parameter is no longer needed. Setting nolock to True disables cache stampede protection, since a stampede cannot occur in a single-threaded environment.
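To illustrate the idea behind cache stampede protection, here is a minimal sketch (an invented illustration of the concept, not Theine's actual implementation): when several threads miss on the same key, only one computes the value and the rest wait for its result.

```python
import threading
import time

class StampedeGuard:
    """Toy stampede protection: one computation per missing key."""

    def __init__(self):
        self._lock = threading.Lock()
        self._cache = {}
        self._inflight = {}  # key -> Event set once the value is ready

    def get_or_compute(self, key, fn):
        with self._lock:
            if key in self._cache:
                return self._cache[key]
            event = self._inflight.get(key)
            if event is None:
                # First miss for this key: this thread does the work.
                event = self._inflight[key] = threading.Event()
                is_owner = True
            else:
                is_owner = False
        if is_owner:
            value = fn()
            with self._lock:
                self._cache[key] = value
                del self._inflight[key]
            event.set()
            return value
        event.wait()  # Another thread is computing; wait for the result.
        with self._lock:
            return self._cache[key]

guard = StampedeGuard()
calls = []

def expensive():
    calls.append(1)   # Record each real computation.
    time.sleep(0.05)  # Simulate slow work so threads overlap.
    return 42

threads = [threading.Thread(target=guard.get_or_compute, args=("k", expensive))
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(calls))  # 1: the value was computed only once
```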
Single Expiration Handling Thread for All Cache Instances
In V2, instead of each cache instance using a separate thread for proactive expiration (as in V1), a single thread will be used to handle expirations for all cache instances via asyncio. This improves efficiency and scalability.
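The shared-expiry-thread idea can be sketched like this (the TinyCache class and the scheduling details are invented for this example; Theine's real mechanism differs): one background event loop serves TTL callbacks for any number of cache instances.

```python
import asyncio
import threading
import time

# One event loop, running in a single daemon thread, shared by all caches.
loop = asyncio.new_event_loop()
threading.Thread(target=loop.run_forever, daemon=True).start()

class TinyCache:
    def __init__(self):
        self.data = {}

    def set(self, key, value, ttl):
        self.data[key] = value
        # Schedule removal on the shared loop from any thread.
        loop.call_soon_threadsafe(loop.call_later, ttl, self.data.pop, key, None)

a, b = TinyCache(), TinyCache()
a.set("x", 1, ttl=0.05)
b.set("y", 2, ttl=0.05)
time.sleep(0.2)  # Give the shared loop time to fire both expirations.
print(a.data, b.data)  # both caches are empty after their TTLs elapse
```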
Improved Adaptive Cache Eviction Policy
The improved adaptive cache eviction policy automatically switches between LRU and LFU strategies to achieve a higher hit ratio across diverse workloads. See the hit ratio benchmarks below for results on a variety of widely used cache traces.
Requirements
Python 3.9+
For use with free-threaded Python, the recommended version is Python 3.13.6+. See https://github.com/python/cpython/issues/133136 for more details.
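If you are unsure which build you are running, a quick check is possible (a sketch: `sys._is_gil_enabled` exists only on Python 3.13+, and the `Py_GIL_DISABLED` config variable is unset on older versions, so this falls back gracefully):

```python
import sys
import sysconfig

# Was the interpreter compiled as a free-threaded build?
free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

# Is the GIL actually disabled at runtime? (API added in 3.13)
gil_check = getattr(sys, "_is_gil_enabled", None)
gil_enabled = gil_check() if gil_check is not None else True

print(f"free-threaded build: {free_threaded_build}, GIL enabled: {gil_enabled}")
```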
Installation
pip install theine
Design
Theine consists of two main components: a core cache policy implemented in Rust, and a Python interface that manages the storage of cached key-value pairs. The Python layer uses the Rust policy to ensure that the total size of the cache does not exceed its maximum limit and evicts entries based on the policy's decisions.
The Rust core and the Python layer are decoupled through the use of key hashes (int). The core operates solely on key hashes, without any direct knowledge of Python objects. This separation simplifies the core implementation and avoids the complexity of handling Python-specific behavior in Rust.
All interactions with the Rust core are protected by a mutex, meaning there is no concurrency within the core itself. This is an intentional design decision: since Rust is significantly faster than Python, adding concurrency at the core level would introduce unnecessary complexity with little performance gain.
On the Python side, key-value pairs are stored in a sharded dictionary. Each shard has its own mutex (currently a standard threading.Lock) to improve scalability. A reader-writer lock is not used because Python does not provide one in the standard library.
Each shard actually maintains two dictionaries:
- The primary key-value dictionary, used to retrieve values directly by key.
- A keyhash-to-key dictionary, used to map the hashes (which are what the Rust core operates on) back to the original keys. This is necessary because when the Rust core decides to evict an item based on its hash, the Python side must be able to identify and remove the corresponding key from the primary dictionary.
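The two-dictionary shard layout can be sketched as follows (a toy illustration, not Theine's actual implementation; the Shard class and its method names are invented for this example):

```python
import threading
from typing import Any, Hashable

class Shard:
    """One shard: a key-value dict plus a keyhash-to-key dict, under a mutex."""

    def __init__(self) -> None:
        self.lock = threading.Lock()
        self.kv: dict[Hashable, Any] = {}
        self.hash_to_key: dict[int, Hashable] = {}

    def set(self, key: Hashable, value: Any) -> int:
        kh = hash(key)
        with self.lock:
            self.kv[key] = value
            self.hash_to_key[kh] = key
        return kh  # the hash is all a policy core would ever see

    def evict_by_hash(self, kh: int) -> None:
        # A policy that knows only hashes can still evict the right key:
        # map the hash back to the key, then drop both entries.
        with self.lock:
            key = self.hash_to_key.pop(kh, None)
            if key is not None:
                self.kv.pop(key, None)

shard = Shard()
kh = shard.set("user:1", {"name": "a"})
shard.evict_by_hash(kh)       # simulate the core deciding to evict this hash
print("user:1" in shard.kv)   # False
```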
API
Key should be a Hashable object, and value can be any Python object.
Cache Client
from theine import Cache
from datetime import timedelta
cache = Cache(10000)
# get value by key, returns (value, exists) where exists indicates if the key was found
v, ok = cache.get("key")
# set with ttl
cache.set("key", {"foo": "bar"}, timedelta(seconds=100))
# delete from cache
cache.delete("key")
# close cache, stop timing wheel thread
cache.close()
# clear cache
cache.clear()
# get current cache stats, please call stats() again if you need updated stats
stats = cache.stats()
print(stats.request_count, stats.hit_count, stats.hit_rate)
# get cache max size
cache.max_size
# get cache current size
len(cache)
Cache Decorator
Theine's decorator is designed with the following goals:
- Both sync and async support.
- Explicit control over key generation. Most remote caches (Redis, Memcached, ...) only allow string keys, so returning a string from the key function makes it easier to switch to a remote cache later.
- Thundering herd/Cache stampede protection.
- Type checked. Mypy can check the key function to make sure it has the same input signature as the original function and returns a hashable value.
Theine requires hashable keys, so to use the decorator you need a way to convert the function's input signature into something hashable. The recommended way is to specify the key function explicitly (approach 1); Theine can also generate the key automatically (approach 2).
- explicit key function
from theine import Cache, Memoize
from datetime import timedelta
@Memoize(10000, timedelta(seconds=100))
def foo(a: int) -> int:
    return a

@foo.key
def _(a: int) -> str:
    return f"a:{a}"

foo(1)
# asyncio
@Memoize(10000, timedelta(seconds=100))
async def foo_a(a: int) -> int:
    return a

@foo_a.key
def _(a: int) -> str:
    return f"a:{a}"

await foo_a(1)
# get current cache stats, please call stats() again if you need updated stats
stats = foo.cache_stats()
print(stats.request_count, stats.hit_count, stats.hit_rate)
- auto key function
from theine import Cache, Memoize
from datetime import timedelta
@Memoize(10000, timedelta(seconds=100), typed=True)
def foo(a: int) -> int:
    return a

foo(1)
# asyncio
@Memoize(10000, timedelta(seconds=100), typed=True)
async def foo_a(a: int) -> int:
    return a

await foo_a(1)
Important: the auto key function uses the same method as Python's lru_cache, and that approach has some known memory-usage issues; take a look at this issue or this one.
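Since the text above says the auto key function follows functools.lru_cache, the stdlib can illustrate what the typed flag means: with typed=True, arguments that compare equal but have different types are cached under distinct keys.

```python
from functools import lru_cache

calls = []  # record each real (non-cached) computation

@lru_cache(maxsize=None, typed=True)
def square(a):
    calls.append(type(a))
    return a * a

square(2)
square(2.0)  # equal to 2, but a float: a separate cache entry when typed=True
print(len(calls))  # 2
```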
Django Cache Backend
CACHES = {
    "default": {
        "BACKEND": "theine.adapters.django.Cache",
        "TIMEOUT": 300,
        "OPTIONS": {"MAX_ENTRIES": 10000},
    },
}
Core Metadata Memory Overhead
The Rust core uses only the key hash, so the actual key size does not affect memory usage. Each metadata entry in Rust consumes 64 bytes of memory.
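As a back-of-the-envelope calculation using the 64-bytes-per-entry figure above (the helper name here is invented for illustration; Python-side key/value storage is counted separately):

```python
def core_metadata_bytes(capacity: int, per_entry: int = 64) -> int:
    """Rust-core metadata overhead for a cache of the given capacity."""
    return capacity * per_entry

print(core_metadata_bytes(10_000))     # 640000 bytes, i.e. about 640 KB
print(core_metadata_bytes(1_000_000))  # 64000000 bytes, i.e. about 64 MB
```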
Benchmarks
hit ratios
Source Code: trace_bench.py
Hit ratio charts are provided for the following traces: zipf, s3, ds1, oltp, wiki CDN, Twitter Cache.
throughput
Source Code: throughput_bench.py
CPU: aliyun, 32 vCPU, Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz
Python 3.13.3t
reads=100%, 1-32 threads
Support
Feel free to open an issue or ask a question in the discussions.
File details
Details for the file theine-2.0.0.tar.gz.
File metadata
- Download URL: theine-2.0.0.tar.gz
- Upload date:
- Size: 1.2 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `6d31898b008d268147d7a801fddbfc4a9510a131f65c0b181ec843036c167604` |
| MD5 | `bd65d4330470e362a2d12b3b6fbfa1ba` |
| BLAKE2b-256 | `b504557d1aa6c408f1ad3657c3a8f39aea99fa3a490bf4f814decd24317dea31` |
File details
Details for the file theine-2.0.0-py3-none-any.whl.
File metadata
- Download URL: theine-2.0.0-py3-none-any.whl
- Upload date:
- Size: 14.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `4fe1b16b4d36fed3eb7de682cd752470da7eef094d212c9db3b0fce342ff5dd7` |
| MD5 | `9b73d1a0d823d961efd7962614746095` |
| BLAKE2b-256 | `f827ffeff689c2585a2e9ec9b5d2e4e7092d5fbce06a47a15603981a4978dc77` |