toolcache makes it easy to create and configure caches in memory or on disk

Project description

toolcache

toolcache makes it simple to create and configure caches in python

Features

save caches to memory or to disk
memoize functions, instance methods, @classmethods, and @staticmethods
control cache size with ttl and eviction policies like lru / fifo / lfu
use thread safety, process safety, or no safety (default = thread safety)
use custom hash functions
track cache usage statistics

Install

pip install toolcache

Example Usage
Reference
Frequently Asked Questions

Example Usage

Creating Caches

import toolcache

# memoize function with memory cache
@toolcache.cache('memory')
def f(a, b, c):
    return a * b * c

# memoize function with disk cache, stored in a tempdir
@toolcache.cache('disk')
def f(a, b, c):
    return a * b * c

# memoize function with disk cache, stored in a persistent dir
@toolcache.cache('disk', cache_dir='/path/to/cache/dir')
def f(a, b, c):
    return a * b * c
    
# remove cache entries once they reach a specific age
@toolcache.cache('disk', ttl='24 hours')
def f(a, b, c):
    return a * b * c

# remove cache entries once cache reaches a specific size
@toolcache.cache('disk', max_size=3, max_size_policy='fifo')
def f(a, b, c):
    return a * b * c

# specify which args are used to create unique hash of inputs
@toolcache.cache('disk', hash_include_args=['a', 'b'])
def f(a, b, c):
    return a * b * c

# create standalone cache
standalone_cache = toolcache.MemoryCache()

Using Caches

# get cache size
print(f.cache.get_cache_size())
> 4

# track cache usage statistics
print(f.cache.stats)
> {'n_checks': 6,
>  'n_deletes': 2,
>  'n_hashes': 8,
>  'n_hits': 2,
>  'n_loads': 1,
>  'n_misses': 4,
>  'n_saves': 3,
>  'n_size_evictions': 0,
>  'n_ttl_evictions': 0}

# clear cache
f.cache.delete_all_entries()

More Examples

Cache Reference

Cache Types

toolcache includes 3 cache types, each of which inherits from the abstract cache class BaseCache:

cachetype	description	use case
`MemoryCache`	cache that saves each entry as key-value pair in a `dict`	speed
`DiskCache`	cache that saves each entry as a file to disk	persistence, or large data that does not fit in memory
`NullCache`	cache that does not save any entries	programmatically disabling cache
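The idea behind a cache that saves nothing can be sketched with the standard library alone. The classes below (`SketchCache`, `SketchMemoryCache`, `SketchNullCache`) and the helper `make_cache` are hypothetical stand-ins written for this illustration, not toolcache's actual classes or signatures. They show how a shared interface lets calling code switch caching off without any other change.

```python
from abc import ABC, abstractmethod


class SketchCache(ABC):
    """Hypothetical stand-in for an abstract cache interface."""

    @abstractmethod
    def save(self, key, value): ...

    @abstractmethod
    def load(self, key, default=None): ...


class SketchMemoryCache(SketchCache):
    """Stores each entry as a key-value pair in a dict."""

    def __init__(self):
        self._entries = {}

    def save(self, key, value):
        self._entries[key] = value

    def load(self, key, default=None):
        return self._entries.get(key, default)


class SketchNullCache(SketchCache):
    """Same interface, but nothing is ever stored."""

    def save(self, key, value):
        pass

    def load(self, key, default=None):
        return default


def make_cache(enabled):
    # the caller's code is identical whether caching is on or off
    return SketchMemoryCache() if enabled else SketchNullCache()


for enabled in (True, False):
    cache = make_cache(enabled)
    cache.save("k", 42)
    print(enabled, cache.load("k"))
# True 42
# False None
```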

Cache Creation

Caches can be created in two ways:

decorating a function with @toolcache.cache(cachetype) where cachetype is 'memory', 'disk', 'null', or a class inheriting from BaseCache
creating a standalone cache by instantiating a class that inherits from BaseCache

Cache Configuration

The configuration options listed below can be passed to toolcache.cache() or passed to a standalone cache during initialization.

General Config

These configuration options are available to every cache.

arg	description	example value	default behavior
`safety`	`str` name of concurrency safety level, one of `'thread'`, `'process'`, or `None`	`'thread'`	`'thread'`
`verbose`	`bool` of whether to print info whenever saving to or loading from cache	`False`	`False`
`cache_name`	`str` name of cache	`'important_cache'`	use decorated function name, or uuid for a standalone cache
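As a rough illustration of what thread safety protects against, here is a minimal standard-library sketch of a lock-guarded in-memory cache. It is not toolcache's implementation. The class `LockedDictCache` and its `get_or_compute` method are hypothetical names chosen for this example.

```python
import threading


class LockedDictCache:
    """Hypothetical dict-backed cache guarded by a lock (not toolcache's code)."""

    def __init__(self):
        self._entries = {}
        self._lock = threading.Lock()
        self.n_computes = 0

    def get_or_compute(self, key, compute):
        # holding the lock makes check-then-save atomic, so concurrent
        # callers with the same key trigger only one computation
        with self._lock:
            if key not in self._entries:
                self._entries[key] = compute()
                self.n_computes += 1
            return self._entries[key]


cache = LockedDictCache()
threads = [
    threading.Thread(target=cache.get_or_compute, args=("k", lambda: 6 * 7))
    for _ in range(8)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(cache.n_computes)  # 1
```

Without the lock, two threads could both see the key as missing and both compute the value. A lock like this only coordinates threads inside one process, which is presumably why process safety is offered as a separate level.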

Hash Config

arg	description	example value	default behavior
`f_hash`	custom function for computing hash	`lambda x: hash(x)`	`toolcache.compute_hash_json()`
`normalize_hash_inputs`	`bool` of whether to normalize function calls so that for a function `f` with args `a` and `b`, the calls `f(1, 2)` and `f(a=1, b=2)` are equivalent	`False`	`False`
`hash_include_args`	`list` of `str` names of arguments used to compute hash	`['arg1', 'arg2']`	include all args
`hash_exclude_args`	`list` of `str` names of arguments excluded from hash	`['arg3', 'arg4']`	exclude no args
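The kind of normalization that `normalize_hash_inputs` describes can be sketched with `inspect.signature` from the standard library. This is an illustration of the concept only; the helper `normalize_call` is a hypothetical name, and toolcache's own normalization may work differently.

```python
import inspect


def normalize_call(f, args, kwargs):
    """Map positional and keyword arguments onto parameter names."""
    bound = inspect.signature(f).bind(*args, **kwargs)
    bound.apply_defaults()  # fill in parameters the caller left out
    return dict(bound.arguments)


def f(a, b, c=3):
    return a * b * c


positional = normalize_call(f, (1, 2), {})
keyword = normalize_call(f, (), {"b": 2, "a": 1})
print(positional == keyword)  # True
```

Once both calls reduce to the same name-to-value mapping, they produce the same hash and therefore share one cache entry. The extra signature binding on every call is a cost, which fits with normalization being off by default.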

Eviction Config

arg	description	example value	default behavior
`ttl`	`Timelength` of time-to-live maximum age for entries in cache	`'1000s'`	no max age
`max_size`	`int` of maximum cache size, in number of entries	`1000`	no max size
`max_size_policy`	`str` name of eviction policy to use when `max_size` is exceeded, one of `'lru'`, `'fifo'`, or `'lfu'`	`'fifo'`	`'lru'`
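The three policies differ only in which piece of per-entry metadata decides the next entry to evict. The sketch below makes that concrete using plain dicts. The function `choose_victim` and the metadata field names are hypothetical, chosen for this example rather than taken from toolcache.

```python
def choose_victim(entries, policy):
    """Return the key to evict; `entries` maps key -> metadata dict."""
    if policy == "fifo":
        # first in, first out: oldest creation time goes first
        return min(entries, key=lambda k: entries[k]["created"])
    if policy == "lru":
        # least recently used: oldest access time goes first
        return min(entries, key=lambda k: entries[k]["last_access"])
    if policy == "lfu":
        # least frequently used: lowest access count goes first
        return min(entries, key=lambda k: entries[k]["n_accesses"])
    raise ValueError(f"unknown policy: {policy!r}")


entries = {
    "x": {"created": 1, "last_access": 9, "n_accesses": 5},
    "y": {"created": 2, "last_access": 4, "n_accesses": 7},
    "z": {"created": 3, "last_access": 8, "n_accesses": 1},
}
for policy in ("fifo", "lru", "lfu"):
    print(policy, choose_victim(entries, policy))
# fifo x
# lru y
# lfu z
```

Each policy reads exactly one field, so a cache only needs to record creation times for `'fifo'`, access times for `'lru'`, and access counts for `'lfu'`.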

Statistic Tracking Config

arg	description	example value	default behavior
`track_basic_stats`	`bool` of whether to track basic usage stats	`False`	`False`
`track_detailed_stats`	`bool` of whether to track creations and accesses	`False`	`False`
`track_creation_times`	`bool` of whether to track creation times	`False`	track only if `ttl` is not `None` or `max_size_policy == 'fifo'`
`track_access_times`	`bool` of whether to track access times	`False`	track only if `max_size_policy == 'lru'`
`track_access_counts`	`bool` of whether to track access counts	`False`	track only if `max_size_policy == 'lfu'`

`DiskCache`-specific Config

arg	description	example value	default behavior
`cache_dir`	`str` of directory path to store cache data	`'/path/to/cache_dir'`	create a `tmpdir`
`file_format`	`str` of file format to use for cache data, either `'pickle'` or `'json'`	`'json'`	`'pickle'`
`f_disk_save`	custom function for saving data to disk, function should take `entry_path` and `entry_data` as arguments	`f_save`	save as pickle
`f_disk_load`	custom function for loading data from disk, function should take `entry_path` as an argument	`f_load`	load as pickle

Cache Decorators

When using toolcache.cache() to decorate a function, one should consider 1) how function inputs will be hashed, 2) what attributes will be added to the function, and 3) what arguments might be added to the function.

Hashing Function Inputs

To save a function input-output pair within a cache, a unique hash of the inputs must be computed.

Under the default hash configuration, each input arg should either be json-serializable or be a hashable object (i.e. it implements a __hash__() method). By default toolcache uses orjson to create these hashes quickly.

If function inputs do not satisfy these criteria, one or more of the cache config parameters should be used:

parameter	description	example
`f_hash`	provide a custom hash function that takes the same args and kwargs as the decorated function	`@toolcache.cache(..., f_hash=f_custom_hash)`
`hash_include_args`	specify `list` of arg names that should be used to compute hash	`@toolcache.cache(..., hash_include_args=['arg1', 'arg2'])`
`hash_exclude_args`	specify `list` of arg names that should not be used to compute hash	`@toolcache.cache(..., hash_exclude_args=['arg3', 'arg4'])`

toolcache.cache() also works on functions that take *args or **kwargs.
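To make the effect of including or excluding arguments concrete, here is a standard-library sketch of JSON-based hashing. It uses `json` and `hashlib` rather than orjson so that it is self-contained. The helper `hash_call` is a hypothetical name and is not how toolcache computes hashes internally.

```python
import hashlib
import json


def hash_call(kwargs, include=None, exclude=None):
    """Hash a dict of named arguments, optionally filtering by name."""
    names = set(kwargs) if include is None else set(include)
    names -= set(exclude or ())
    selected = {name: kwargs[name] for name in sorted(names)}
    # sort_keys gives the same serialization regardless of argument order
    payload = json.dumps(selected, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


full = hash_call({"a": 1, "b": 2, "c": 3})
same_ab = hash_call({"a": 1, "b": 2, "c": 3}, include=["a", "b"])
other_c = hash_call({"a": 1, "b": 2, "c": 99}, include=["a", "b"])
print(same_ab == other_c)  # True: c is ignored
print(full == same_ab)     # False: c contributes to the full hash
```

Leaving an argument out of the hash means calls that differ only in that argument share a cache entry. That is useful for arguments such as loggers or connection handles that cannot be serialized, but it is only safe when the argument does not affect the output.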

Decorated Function Args

Every time the decorated function is called, it can use the following keyword args to control cache behavior.

kwarg	description	default	example
`cache_save`	`bool` of whether to save output to cache	`True`	`f(..., cache_save=False)` will not save output to cache
`cache_load`	`bool` of whether to attempt to load entry from cache	`True`	`f(..., cache_load=False)` will not attempt to load entry from cache
`cache_verbose`	`bool` of whether to print info about loading from or saving to cache	`True`	`f(..., cache_verbose=False)` will not print info about loading from or saving to cache

You can avoid adding these args to the decorated function by using @toolcache.cache(..., add_cache_args=False).
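The following sketch shows one way a decorator can accept per-call control keywords like `cache_save` and `cache_load`. The decorator `memoize` is a hypothetical, simplified stand-in built on a plain dict. It is meant to clarify the behavior described above, not to reproduce toolcache's code.

```python
import functools


def memoize(f):
    """Hypothetical memoizer with per-call cache controls."""
    entries = {}

    @functools.wraps(f)
    def wrapper(*args, cache_save=True, cache_load=True, **kwargs):
        key = (args, tuple(sorted(kwargs.items())))
        if cache_load and key in entries:
            return entries[key]
        result = f(*args, **kwargs)
        if cache_save:
            entries[key] = result
        return result

    return wrapper


calls = []


@memoize
def square(x):
    calls.append(x)  # record every real computation
    return x * x


square(3, cache_save=False)  # computed, not stored
square(3)                    # computed again, stored
square(3)                    # served from the cache
square(3, cache_load=False)  # cache bypassed, recomputed
print(len(calls))  # 3
```

In this sketch the wrapper reserves the keyword names `cache_save` and `cache_load`, so a wrapped function with its own parameter of the same name would clash.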

Decorated Function Attributes

The original decorated function can be accessed as f.__wrapped__.

The cache instance associated with a decorated function f() can be accessed using f.cache.
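In the standard library, `functools.wraps` is what sets a `__wrapped__` attribute on a wrapper. Whether toolcache uses it internally is an assumption. The sketch below uses a hypothetical decorator `with_cache` and a plain dict in place of a real cache instance to show both attributes on a decorated function.

```python
import functools


def with_cache(f):
    """Hypothetical decorator that exposes its cache as an attribute."""
    entries = {}

    @functools.wraps(f)  # sets wrapper.__wrapped__ = f
    def wrapper(x):
        if x not in entries:
            entries[x] = f(x)
        return entries[x]

    wrapper.cache = entries  # a plain dict here, a cache object in practice
    return wrapper


def double(x):
    return 2 * x


cached_double = with_cache(double)
cached_double(5)
print(cached_double.__wrapped__ is double)  # True
print(cached_double.cache)  # {5: 10}
```

Calling `__wrapped__` directly runs the original function and skips the cache entirely, which is handy in tests.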

Cache Methods

These methods are available on every cache instance:

method	description
`compute_entry_hash()`	compute hash of entry
`save_entry()`	save entry data to cache
`exists_in_cache()`	return `bool` of whether entry exists in cache
`load_entry()`	load entry data from cache
`get_cache_size()`	return `int` number of items in cache
`delete_entry()`	remove entry from cache
`delete_all_entries()`	delete all entries from cache

Frequently Asked Questions

How is the performance? What is the overhead for using a cache decorator?

To maximize cache performance, one can disable input name normalization (normalize_hash_inputs=False), statistic tracking (track_basic_stats=False and track_detailed_stats=False), and thread safety (safety=None).

On a somewhat modern machine with the above settings, the toolcache.cache() decorator adds about 3 μs to each function call, whereas running a simple function with no cache decorator takes about 50 ns per function call. Using a disk cache instead of a memory cache adds about 25 μs per function call. To truly know whether toolcache is fast enough for your application you may need to run your own benchmarks.
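A benchmark of this kind can be sketched with the standard `timeit` module. In the example below, `functools.lru_cache` stands in for the cached function so that the block runs without extra dependencies. Replace it with your own decorated function to measure your actual setup. The helper `seconds_per_call` is a hypothetical name.

```python
import functools
import timeit


def plain(a, b, c):
    return a * b * c


# functools.lru_cache is a stand-in here; swap in your own cached function
cached = functools.lru_cache(maxsize=None)(plain)


def seconds_per_call(func, n_calls=100_000):
    """Return the best-of-5 average time per call, in seconds."""
    # the lambda adds a small, equal overhead to every measurement
    timer = timeit.Timer(lambda: func(2, 3, 4))
    return min(timer.repeat(repeat=5, number=n_calls)) / n_calls


t_plain = seconds_per_call(plain)
t_cached = seconds_per_call(cached)
print(f"plain:  {t_plain * 1e9:.0f} ns per call")
print(f"cached: {t_cached * 1e9:.0f} ns per call")
```

Taking the minimum over several repeats reduces noise from other processes. For a function this cheap the cached version may well be slower, since caching pays off only when the function costs more than the lookup.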

How does `toolcache` relate to other similar projects?

A large motivation for developing toolcache was being able to manage memory-based and disk-based caches with a unified interface and feature set. toolcache is currently the only python package to offer this functionality.

There exist many other python packages for caching and memoization. cacheout and python-memoization both provide in-memory caches with many features. Compared to toolcache these libraries provide a wider variety of cache eviction policies and other interesting features. python-diskcache provides a feature-rich disk-based cache with Django integration and extensive benchmark comparisons to other solutions.

Project details

Release history

version	release date
0.5.0	Jun 29, 2022 (this version)
0.4.1	Jun 12, 2022
0.4.0	May 8, 2022
0.3.0	Feb 11, 2022
0.2.0	Feb 6, 2022
0.1.0	Feb 13, 2021
0.0.3	Feb 13, 2021
0.0.2	Feb 13, 2021
