A powerful caching library for Python, with TTL support and multiple algorithm options. (https://github.com/lonelyenvoy/python-memoization)
A powerful caching library for Python, with TTL support and multiple algorithm options.
If you like this work, please star it on GitHub.
Why choose this library?
Perhaps you know about
in Python 3, and you may be wondering why I am reinventing the wheel.
Well, actually not. This lib is based on
functools. Please find below the comparison with
|Configurable max size||✔️||✔️|
|Flexible argument typing (typed & untyped)||✔️||Always typed|
|LRU (Least Recently Used) as caching algorithm||✔️||✔️|
|LFU (Least Frequently Used) as caching algorithm||No support||✔️|
|FIFO (First In First Out) as caching algorithm||No support||✔️|
|Extensibility for new caching algorithms||No support||✔️|
|TTL (Time-To-Live) support||No support||✔️|
|Support for unhashable arguments (dict, list, etc.)||No support||✔️|
|Custom cache keys||No support||✔️|
|Partial cache clearing||No support||Pending implementation in v0.3.x|
memoization solves some drawbacks of
lru_cachedoes not support unhashable types, which means function arguments cannot contain dict or list.
>>> from functools import lru_cache >>> @lru_cache() ... def f(x): return x ... >>> f([1, 2]) # unsupported Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: unhashable type: 'list'
lru_cacheis vulnerable to hash collision attack and can be hacked or compromised. Using this technique, attackers can make your program unexpectedly slow by feeding the cached function with certain cleverly designed inputs. However, in
memoization, caching is always typed, which means
f(3.0)will be treated as different calls and cached separately. Also, you can build your own cache key with a unique hashing strategy. These measures prevents the attack from happening (or at least makes it a lot harder).
>>> hash((1,)) 3430019387558 >>> hash(3430019387558.0) # two different arguments have an identical hash value 3430019387558
memoizationis designed to be highly extensible, which make it easy for developers to add and integrate any caching algorithms (beyond FIFO, LRU and LFU) into this library. See Contributing Guidance for further detail.
pip install -U memoization
from memoization import cached @cached def func(arg): ... # do something slow
Simple enough - the results of
func() are cached.
Repetitive calls to
func() with the same arguments run
func() only once, enhancing performance.
:warning:WARNING: for functions with unhashable arguments, the default setting may not enable
memoizationto work properly. See custom cache keys section below for details.
You will learn about the advanced features in the following tutorial, which enable you to customize
Configurable options include
@cached(ttl=5) # the cache expires after 5 seconds def expensive_db_query(user_id): ...
For impure functions, TTL (in second) will be a solution. This will be useful when the function returns resources that is valid only for a short time, e.g. fetching something from databases.
Limited cache capacity
@cached(max_size=128) # the cache holds no more than 128 items def get_a_very_large_object(filename): ...
By default, if you don't specify
max_size, the cache can hold unlimited number of items.
When the cache is fully occupied, the former data will be overwritten by a certain algorithm described below.
Choosing your caching algorithm
from memoization import cached, CachingAlgorithmFlag @cached(max_size=128, algorithm=CachingAlgorithmFlag.LFU) # the cache overwrites items using the LFU algorithm def func(arg): ...
Possible values for
CachingAlgorithmFlag.LRU: Least Recently Used (default)
CachingAlgorithmFlag.LFU: Least Frequently Used
CachingAlgorithmFlag.FIFO: First In First Out
This option is valid only when a
max_size is explicitly specified.
@cached(thread_safe=False) def func(arg): ...
True by default. Setting it to
False enhances performance.
Order-independent cache key
By default, the following function calls will be treated differently and cached twice, which means the cache misses at the second call.
func(a=1, b=1) func(b=1, a=1)
You can avoid this behavior by passing an
order_independent argument to the decorator, although it will slow down the performance a little bit.
@cached(order_independent=True) def func(**kwargs): ...
Custom cache keys
Prior to memorize your function inputs and outputs (i.e. putting them into a cache),
memoization needs to
build a cache key using the inputs, so that the outputs can be retrieved later.
memoizationtries to combine all your function arguments and calculate its hash value using
hash(). If it turns out that parts of your arguments are unhashable,
memoizationwill fall back to turning them into a string using
str(). This behavior relies on the assumption that the string exactly represents the internal state of the arguments, which is true for built-in types.
However, this is not true for all objects. If you pass objects which are
instances of non-built-in classes, sometimes you will need to override the default key-making procedure,
str() function on these objects may not hold the correct information about their states.
Here are some suggestions. Implementations of a valid key maker:
- MUST be a function with the same signature as the cached function.
- MUST produce unique keys, which means two sets of different arguments always map to two different keys.
- MUST produce hashable keys, and a key is comparable with another key (
memoizationonly needs to check for their equality).
- should compute keys efficiently and produce small objects as keys.
def get_employee_id(employee): return employee.id @cached(custom_key_maker=get_employee_id) def calculate_performance(employee): ...
Note that writing a robust key maker function can be challenging in some situations. If you find it difficult, feel free to ask me for help by submitting an issue.
Knowing how well the cache is behaving
>>> @cached ... def f(x): return x ... >>> f.cache_info() CacheInfo(hits=0, misses=0, current_size=0, max_size=None, algorithm=<CachingAlgorithmFlag.LRU: 2>, ttl=None, thread_safe=True, order_independent=False, use_custom_key=False)
cache_info, you can retrieve the number of
misses of the cache, and other information indicating the caching status.
hits: the number of cache hits
misses: the number of cache misses
current_size: the number of items that were cached
max_size: the maximum number of items that can be cached (user-specified)
algorithm: caching algorithm (user-specified)
ttl: Time-To-Live value (user-specified)
thread_safe: whether the cache is thread safe (user-specified)
order_independent: whether the cache is kwarg-order-independent (user-specified)
use_custom_key: whether a custom key maker is used
- Access the original function
- Clear the cache by
This project welcomes contributions from anyone.
- Read Contributing Guidance first.
- Submit bugs and help me verify fixes.
- Submit pull requests for bug fixes and features and discuss existing proposals. Please make sure that your PR passes the tests in
- See contributors of this project.
Release history Release notifications | RSS feed
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
|Filename, size||File type||Python version||Upload date||Hashes|
|Filename, size memoization-0.3.2-py3-none-any.whl (38.5 kB)||File type Wheel||Python version py3||Upload date||Hashes View|
|Filename, size memoization-0.3.2.tar.gz (31.7 kB)||File type Source||Python version None||Upload date||Hashes View|
Hashes for memoization-0.3.2-py3-none-any.whl