
Project description


Provides a decorator for caching a function between subsequent processes.

Whenever the function is called with the same arguments, the result is loaded from the cache instead of computed. This cache is persistent across runs. If the arguments, source code, or enclosing environment have changed, the cache recomputes the data transparently (no need for manual invalidation).

The intended use case is iterative development, especially on scientific experiments. A developer will often tweak part of the code while the rest stays the same; reusing intermediate results can save a significant amount of time on every run.

See full documentation here.

Quickstart

If you don’t have pip installed, see the pip install guide.

$ pip install charmonium.cache
>>> from charmonium.cache import memoize
>>> i = 0
>>> @memoize()
... def square(x):
...     print("recomputing")
...     # Imagine a more expensive computation here.
...     return x**2 + i
...
>>> square(4)
recomputing
16
>>> square(4) # no need to recompute
16
>>> i = 1
>>> square(4) # global i changed; must recompute
recomputing
17

Advantages

While there are other libraries and techniques for memoization, I believe this one is unique because it is:

  1. Correct with respect to source-code changes: The cache detects if you edit the source code or change a file which the program reads (provided the program reads that file through this library’s file abstraction). Users never need to manually invalidate the cache, so long as the functions are pure (unlike joblib.Memory and Klepto).

    It is precise enough that it will ignore changes in unrelated functions in the file, but it will detect changes in relevant functions in other files. It even detects changes in global variables (as in the example above). See Detecting Changes in Functions for details.

  2. Useful between runs and across machines: The cache can persist on disk (unlike functools.lru_cache). Moreover, a cache can be shared over the network, so that once any machine has computed the function for a given version of the source code and arguments, the value can be reused by any other machine, provided your datatype is de/serializable on those platforms.

  3. Easy to adopt: Only requires adding one line (decorator) to the function definition.

  4. Bounded in size: The cache won’t take up too much space. This space is partitioned across all memoized functions according to a heuristic.

  5. Supports smart heuristics: By default, the library uses state-of-the-art cache policies that can take into account time-to-recompute and storage-size in addition to recency, more advanced than simple LRU.

  6. Overhead aware: The library measures the time saved versus overhead. It warns the user if the overhead of caching is not worth it.
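The combination of points 1 and 2 can be sketched with the standard library alone: key the cache on a fingerprint of the function's body plus its arguments, and persist entries on disk. This is a toy sketch of the idea, not the library's implementation; `toy_memoize` and `CACHE_DIR` are invented names, and the real library analyzes source code, globals, closures, and transitively called functions rather than just the compiled bytecode, and adds cache policies on top.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path

# Illustration only: the real library manages its own on-disk storage.
CACHE_DIR = Path(tempfile.mkdtemp())

def toy_memoize(func):
    """Disk-persistent memoizer keyed on the function body + arguments (sketch)."""
    # Crude fingerprint of the compiled body; editing the function changes the
    # key, so stale entries are simply never looked up again.
    fingerprint = hashlib.sha256(func.__code__.co_code).hexdigest()

    def wrapper(*args, **kwargs):
        key_material = pickle.dumps((fingerprint, args, sorted(kwargs.items())))
        key = hashlib.sha256(key_material).hexdigest()
        entry = CACHE_DIR / key
        if entry.exists():                        # hit: load the stored result
            return pickle.loads(entry.read_bytes())
        result = func(*args, **kwargs)            # miss: compute and persist
        entry.write_bytes(pickle.dumps(result))
        return result

    return wrapper

@toy_memoize
def square(x):
    return x ** 2

assert square(4) == 16   # computed on the first call
assert square(4) == 16   # served from disk afterwards
```

Because the fingerprint is part of the key, invalidation is automatic: a changed function simply maps to fresh cache entries, which is the same high-level design the library describes.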

Memoize CLI

Make is good for compiling code, but it falls short for data science. To get correct results, you have to incorporate every variable your result depends on into a file or into the filename. If you put it in a file, only one version is cached at a time; if you put it in the filename, you have to squeeze the variable into a short string. In either case, stale results accumulate unboundedly until you run make clean, which also purges the fresh results. Finally, it is a significant effort to rewrite shell scripts as Makefiles.

memoize makes it easy to memoize steps in shell scripts, correctly. Just add memoize to the start of the line. If the command, its arguments, or its input files change, then command arg1 arg2 ... will be rerun. Otherwise, the output files (including stderr and stdout) will be produced from a prior run. memoize uses ptrace to automatically determine what inputs you depend on and what outputs you produce.

memoize command arg1 arg2
# or
memoize --key=$(date +%Y-%m-%d) -- command arg1 arg2

See CLI for more details.
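The CLI's bookkeeping can be imitated in a few lines of Python: fingerprint the command, its arguments, and its input files, and replay the recorded stdout when nothing has changed. This is only a sketch of the idea; `toy_memoize_cmd` and `STORE` are invented names, and unlike the real memoize CLI it does not use ptrace, so input files must be listed explicitly and output files are not captured.

```python
import hashlib
import pickle
import subprocess
import tempfile
from pathlib import Path

STORE = Path(tempfile.mkdtemp())  # illustration only

def toy_memoize_cmd(command, input_files=()):
    """Rerun `command` only if it or its (explicitly listed) inputs changed."""
    # The real CLI discovers inputs automatically via ptrace; this sketch
    # requires the caller to name them.
    inputs = [Path(f).read_bytes() for f in input_files]
    key = hashlib.sha256(pickle.dumps((command, inputs))).hexdigest()
    entry = STORE / key
    if entry.exists():
        return entry.read_bytes()            # replay recorded stdout
    out = subprocess.run(command, capture_output=True, check=True).stdout
    entry.write_bytes(out)                   # record for next time
    return out

print(toy_memoize_cmd(["echo", "hello"]))    # runs the command
print(toy_memoize_cmd(["echo", "hello"]))    # replayed from the store
```

Changing the command, an argument, or the contents of any listed input file changes the key, so the command reruns; otherwise the recorded output is returned, mirroring the behavior described above.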

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

charmonium.cache-1.2.11.tar.gz (22.4 kB)

Uploaded Source

Built Distribution


charmonium.cache-1.2.11-py3-none-any.whl (22.4 kB)

Uploaded Python 3

File details

Details for the file charmonium.cache-1.2.11.tar.gz.

File metadata

  • Download URL: charmonium.cache-1.2.11.tar.gz
  • Upload date:
  • Size: 22.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.12 CPython/3.9.10 Linux/5.13.0-40-generic

File hashes

Hashes for charmonium.cache-1.2.11.tar.gz
  • SHA256: c2b6e9c7e81dc4aed886b1ed65e0a348b7a2d08374f33376898aa213f8a19513
  • MD5: 636e419c58f6ed6b6815c9da6d660912
  • BLAKE2b-256: 43e8b3431f6f0bff2e9dcf609a2427610046f95a66f9962777b66f57e981bb2c

See more details on using hashes here.

File details

Details for the file charmonium.cache-1.2.11-py3-none-any.whl.

File metadata

  • Download URL: charmonium.cache-1.2.11-py3-none-any.whl
  • Upload date:
  • Size: 22.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.12 CPython/3.9.10 Linux/5.13.0-40-generic

File hashes

Hashes for charmonium.cache-1.2.11-py3-none-any.whl
  • SHA256: a169902d32a443fde9ddb25444be8f923868ce9f4496f9042432f9ba1a798162
  • MD5: 763102528bbdc498bb99c14f79a25fa8
  • BLAKE2b-256: f7c1087263ec71f7ef8103d6f116d1bce3d595e0fcfdb7ec7feb7ab6230d0bcc

See more details on using hashes here.
