File system based caching
fscacher
Caching solution for functions that operate on the file system.
Installation
pip install <path to repository>
Usage
Simple usage:
from fscacher import Cache
cache = Cache(cache_path)
fn = cache.memoize(expensive_fn)
result = fn("arg1", "arg2")
When fn is called the first time, the function expensive_fn is evaluated and its return value is serialized and stored at cache_path. Subsequent calls to fn with the same arguments deserialize the stored result instead of re-evaluating expensive_fn.
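The behaviour described above can be illustrated with a minimal, self-contained sketch. This is not fscacher's implementation: memoize_to_disk is a hypothetical stand-in, and the choice of a sha256 file name and pickle serialization is an assumption made only to show the hit/miss logic.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def memoize_to_disk(func, cache_dir):
    """Hypothetical stand-in for illustration; not fscacher's implementation."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    def wrapper(*args, **kwargs):
        # Derive a file name from the function name and its arguments.
        signature = repr((func.__name__, args, sorted(kwargs.items())))
        path = cache_dir / hashlib.sha256(signature.encode("utf-8")).hexdigest()
        if path.exists():
            # Cache hit: deserialize instead of re-evaluating func.
            with open(path, "rb") as fh:
                return pickle.load(fh)
        # Cache miss: evaluate, then serialize the result to disk.
        result = func(*args, **kwargs)
        with open(path, "wb") as fh:
            pickle.dump(result, fh)
        return result

    return wrapper


calls = []  # records every real evaluation of expensive_fn


def expensive_fn(a, b):
    calls.append((a, b))
    return a + b


with tempfile.TemporaryDirectory() as cache_path:
    fn = memoize_to_disk(expensive_fn, cache_path)
    first = fn("arg1", "arg2")   # evaluated and stored
    second = fn("arg1", "arg2")  # read back from disk
```

After both calls, calls contains a single entry, since the second call is served from the stored file.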
Optional arguments to memoize include:
- key: Function with arguments (func, args, kwargs) and return value of type str, used to create the function call signature.
- dump: Function with arguments (return_value, filename) and no return value, used to serialize the result of expensive_fn.
- load: Function with arguments (filename,), used to deserialize the binary data on disk into a return value.
- digest: Function with arguments (stream,) and return value of type str, used to digest the function call signature as well as the contents of serialized files.
- protocol: Use predefined functions for key, dump, load and digest. The known protocol schemes are listed below. If both protocol and explicit functions are set, the explicit functions take precedence.
If any of the arguments key, dump, load or digest are set to "default", the corresponding functions defined by the cache.defaults dict are used instead.
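As an illustration of the dump and load signatures listed above, here is a hedged sketch of a JSON-based pair. The names json_dump and json_load are hypothetical, and the commented-out memoize call assumes the options are passed as keyword arguments under the names given in the list.

```python
import json
import os
import tempfile


def json_dump(return_value, filename):
    """Serialize return_value to filename as JSON; returns nothing."""
    with open(filename, "w", encoding="utf-8") as fh:
        json.dump(return_value, fh)


def json_load(filename):
    """Deserialize the JSON file at filename into a return value."""
    with open(filename, "r", encoding="utf-8") as fh:
        return json.load(fh)


# Round trip outside of any cache, to check that the pair is consistent.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "result.json")
    json_dump({"answer": 42}, target)
    restored = json_load(target)

# Assumed wiring, based on the argument names listed above:
# fn = cache.memoize(expensive_fn, dump=json_dump, load=json_load)
```

A pair like this only suits return values that survive a JSON round trip; tuples, for example, come back as lists.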
The default key function is constructed as follows:
- Arguments and keyword arguments (both keys and values) are converted to strings by the str function. Numpy arrays are converted to lists before conversion.
- Any arguments which are too long (> 22 chars) or contain invalid characters (= \/:*?"<>|) are utf-8 encoded and converted to a 64-bit truncated sha256 hash.
- Keyword arguments are joined as key-value pairs of the form k=v.
- If short enough, the key is the function name followed by the space-separated argument list. If too long, the key is the function name followed by the full sha256 hash of the space-separated argument list.
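The steps above can be sketched roughly as follows. This is a reading of the description, not the library's code: sketch_key is a hypothetical name, MAX_KEY_LEN is an invented threshold (the text does not say what "short enough" means), the truncated hash is assumed to be rendered as 16 hex characters, and numpy arrays are detected by duck typing on tolist so that the example needs no third-party import.

```python
import hashlib

MAX_ARG_LEN = 22                       # from the description: "> 22 chars"
INVALID_CHARS = set('= \\/:*?"<>|')    # characters listed in the description
MAX_KEY_LEN = 100                      # assumption: limit not stated in the source


def _arg_to_str(value):
    # Objects with a tolist() method (e.g. numpy arrays) become lists first.
    if hasattr(value, "tolist"):
        value = value.tolist()
    text = str(value)
    if len(text) > MAX_ARG_LEN or INVALID_CHARS & set(text):
        # 64-bit truncated sha256: first 8 bytes, shown as 16 hex characters
        # (the hex rendering is an assumption).
        return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]
    return text


def sketch_key(func, args, kwargs):
    parts = [_arg_to_str(a) for a in args]
    # Keyword arguments are joined as k=v pairs.
    parts += [f"{_arg_to_str(k)}={_arg_to_str(v)}" for k, v in kwargs.items()]
    arg_list = " ".join(parts)
    key = " ".join([func.__name__, *parts])
    if len(key) > MAX_KEY_LEN:
        # Too long: function name followed by the full sha256 of the argument list.
        full = hashlib.sha256(arg_list.encode("utf-8")).hexdigest()
        key = f"{func.__name__} {full}"
    return key


def demo(*args, **kwargs):
    """Placeholder function whose name appears in the generated keys."""


short_key = sketch_key(demo, ("a", 1), {"n": 2})
hashed_arg_key = sketch_key(demo, ("x" * 30,), {})
long_key = sketch_key(demo, tuple(str(i) for i in range(100)), {})
```

Here short_key stays human-readable, hashed_arg_key replaces the 30-character argument with a truncated hash, and long_key falls back to the full hash of the argument list.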
Default dump and load functions are from the python pickle module.
Default digest function is sha256.
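For reference, a digest function with the (stream,) signature described earlier might look like the sketch below. It assumes that stream is a binary file-like object and that the result is the lowercase hex digest; sha256_digest is a hypothetical name, not necessarily what the package uses internally.

```python
import hashlib
import io


def sha256_digest(stream):
    """Return the lowercase hex sha256 of a binary stream, read in chunks."""
    hasher = hashlib.sha256()
    # Read fixed-size chunks until the stream returns an empty bytes object.
    for chunk in iter(lambda: stream.read(65536), b""):
        hasher.update(chunk)
    return hasher.hexdigest()


digest = sha256_digest(io.BytesIO(b"hello"))
```

Reading in chunks keeps memory use bounded when the serialized file is large.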
Implemented protocols include:
- filename/<suffix>: Return value is interpreted as the name of a temporary file, which should have the suffix <suffix>, including any leading dot. key is the default key, except that <suffix> is appended. dump moves the file to the index location. load returns the file name as a string.
- filehash/<suffix>: Return value is interpreted as the name of a temporary file, which should have the suffix <suffix>, including any leading dot. key is the default key with the suffix .hash appended. dump computes the hash of the temporary file and renames it to the hash value (unless already present). Thereafter, it copies the file (as a hardlink if possible) to the location specified by key, except that .hash is replaced by <suffix>. Finally, the hash is stored as a string (lowercase hex) at the location specified by key (the index location). load returns the contents of the file at the index location and interprets it as a pathlib path. In the end, this protocol works like filename/<suffix>, except that multiple function calls can be mapped to the same file if their return values have equal contents.
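The filehash/<suffix> steps can be sketched as below. This is an interpretation of the description rather than the package's code: filehash_dump and filehash_load are hypothetical names, the hash-named file is assumed to live next to the index file, and load is assumed to resolve the stored hash relative to that same directory.

```python
import hashlib
import os
import shutil
import tempfile
from pathlib import Path


def file_sha256(path):
    """Lowercase hex sha256 of the file at path."""
    hasher = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def filehash_dump(return_value, filename, suffix):
    """Store the temporary file by content hash and record the hash at filename."""
    index_path = Path(filename)             # index location, assumed to end in ".hash"
    digest = file_sha256(return_value)
    blob = index_path.parent / digest       # file named after its hash (assumed layout)
    if blob.exists():
        os.remove(return_value)             # identical content is already stored
    else:
        shutil.move(str(return_value), str(blob))
    named = index_path.with_suffix(suffix)  # ".hash" replaced by <suffix>
    if not named.exists():
        try:
            os.link(blob, named)            # hardlink if possible
        except OSError:
            shutil.copy2(blob, named)       # otherwise a plain copy
    index_path.write_text(digest)           # hash stored as lowercase hex


def filehash_load(filename):
    """Read the stored hash and return it as a pathlib path (location assumed)."""
    index_path = Path(filename)
    return index_path.parent / index_path.read_text()


with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    cache_dir = tmp / "cache"
    cache_dir.mkdir()
    # Two calls whose temporary result files have identical contents.
    for name in ("t1.txt", "t2.txt"):
        (tmp / name).write_text("same")
    filehash_dump(tmp / "t1.txt", cache_dir / "f a.hash", ".txt")
    filehash_dump(tmp / "t2.txt", cache_dir / "f b.hash", ".txt")
    path_a = filehash_load(cache_dir / "f a.hash")
    path_b = filehash_load(cache_dir / "f b.hash")
    same_target = path_a == path_b
    content = path_a.read_text()
    named_copy = (cache_dir / "f a.txt").read_text()
```

Both index files end up pointing at one hash-named file, which is the deduplication the protocol description refers to.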