Provides a decorator for caching a function and an equivalent command-line util.

Project description

Provides a decorator for caching a function. Whenever the function is called with the same arguments, the result is loaded from the cache instead of computed. If the arguments, source code, or enclosing environment have changed, the cache recomputes the data transparently (no need for manual invalidation).
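
The idea can be illustrated with a toy in-memory memoizer. This is only a sketch of the concept, not how charmonium.cache is implemented: the names toy_memoize, shifted_square, and offset are hypothetical, and the fingerprint is assumed to cover the function's bytecode, the module-level names it mentions, and its arguments.

```python
import functools
import hashlib


def toy_memoize(func):
    """Toy in-memory memoizer; a sketch of the idea, not charmonium.cache itself."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        code = func.__code__
        # Current values of the module-level names the function mentions.
        env = sorted(
            (name, func.__globals__[name])
            for name in code.co_names
            if name in func.__globals__
        )
        # Fingerprint the bytecode, constants, environment, and arguments.
        payload = repr((code.co_code, code.co_consts, env, args, sorted(kwargs.items())))
        key = hashlib.sha256(payload.encode()).hexdigest()
        if key not in cache:
            wrapper.misses += 1
            cache[key] = func(*args, **kwargs)
        return cache[key]

    wrapper.misses = 0  # number of times the wrapped function actually ran
    return wrapper


offset = 0


@toy_memoize
def shifted_square(x):
    return x**2 + offset


first = shifted_square(4)   # computed
second = shifted_square(4)  # same fingerprint, served from the cache
offset = 1
third = shifted_square(4)   # environment changed, so recomputed
```

Because the fingerprint includes the environment, changing offset invalidates the entry without any manual step. A persistent cache would additionally need a fingerprint that is stable across processes, which repr does not guarantee for arbitrary objects.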

This library is meant for iterative development, especially of scientific experiments. A developer often tweaks some of the code but not all of it, so reusing prior intermediate computations can save a significant amount of time on every run.

Quickstart

If you don’t have pip installed, see the pip install guide. Then run:

$ pip install charmonium.cache

>>> from charmonium.cache import memoize
>>> import shutil; shutil.rmtree(".cache", ignore_errors=True)  # no error if .cache does not exist yet
>>> i = 0
>>> @memoize()
... def square(x):
...     print("recomputing")
...     return x**2 + i
...
>>> square(4)
recomputing
16
>>> square(4)
16
>>> i = 1
>>> square(4)
recomputing
17

The function must be pure with respect to its arguments and its closure (i is part of the closure in the previous example). This library will not detect:

  • Reading directly from the filesystem (this library offers a wrapper over files that permits it to detect changes; use that instead).

  • Non-static references (the caching library can’t detect a dependency if the function references globals()["i"]).
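
The second limitation can be seen with the standard library alone. The sketch below assumes a detector that inspects the names recorded in a function's code object; whether charmonium.cache works this way is an assumption, and static_reference and dynamic_reference are hypothetical names used only for illustration.

```python
def static_reference():
    # The name i appears in the compiled code, so static analysis can see it.
    return i + 1


def dynamic_reference():
    # Here "i" is just a string constant; the only name referenced is globals.
    return globals()["i"] + 1


static_names = static_reference.__code__.co_names
dynamic_names = dynamic_reference.__code__.co_names
```

Inspecting the two tuples shows that i is listed for static_reference but not for dynamic_reference, so a change to i could go unnoticed in the second case.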

Advantages

While there are other libraries and techniques for memoization, I believe this one is unique because it is:

  1. Correct with respect to source-code changes: The cache detects if you edit the source code or change a file which the program reads (provided the program reads it through this library’s file abstraction). Users never need to manually invalidate the cache, so long as the functions are pure.

  2. Useful between runs and across machines: A cache can be shared on the network, so that if any machine has computed the function for the same source code and arguments, this value can be reused by any other machine.

  3. Easy to adopt: Only requires adding one line (decorator) to the function definition.

  4. Bounded in size: The cache stays within a size limit. This space is partitioned across all memoized functions according to the heuristic.

  5. Supports smart heuristics: They can take into account time-to-recompute and storage-size in addition to recency, unlike naive LRU.

  6. Overhead aware: The library measures the time saved versus overhead. It warns the user if the overhead of caching is not worth it.
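
As an illustration of points 4 and 5, here is a minimal sketch of a size-bounded eviction policy that ranks entries by recompute time saved per byte stored. It is not the heuristic this library ships; Entry, entries_to_keep, and the scoring rule are assumptions made for the example, and a fuller heuristic could also weigh recency.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    key: str
    size_bytes: int
    recompute_seconds: float


def entries_to_keep(entries, limit_bytes):
    """Greedily keep the entries that save the most recompute time per byte."""
    ranked = sorted(
        entries,
        key=lambda e: e.recompute_seconds / max(e.size_bytes, 1),
        reverse=True,
    )
    kept, used = [], 0
    for entry in ranked:
        if used + entry.size_bytes <= limit_bytes:
            kept.append(entry)
            used += entry.size_bytes
    return kept


entries = [
    Entry("a", size_bytes=100, recompute_seconds=10.0),
    Entry("b", size_bytes=10, recompute_seconds=5.0),
    Entry("c", size_bytes=100, recompute_seconds=0.1),
]
kept_keys = [e.key for e in entries_to_keep(entries, limit_bytes=120)]
```

With a 120-byte limit, the cheap-to-recompute entry c is the one dropped, even though a plain LRU policy would not take its recompute time into account.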

Memoize CLI

memoize -- command arg1 arg2 ...
memoize memoizes command arg1 arg2 .... If the command, its arguments, or its input files change, then command arg1 arg2 ... will be rerun. Otherwise, the output files (including stderr and stdout) will be produced from a prior run.

Make is good, but it has a hard time with dependencies that are not files. Many dependencies are not well-contained in files. For example, you may want to recompute some command every time some status command returns a different value.

To get correct results you would have to incorporate every key you depend on into the filename, which can be messy, so most people don’t do that. memoize is easier to use correctly, for example:

# `make status=$(status)` will not do the right thing.
make var=1
make var=2 # usually, nothing recompiles here, contrary to user's intent

# `memoize --key=$(status) -- command args` will do the right thing
memoize --key=1 -- command args
memoize --key=2 -- command args # key changed, command is recomputed.

memoize also makes it easy to memoize commands within existing shell scripts.

Code quality

  • The code base is strictly and statically typed with pyright. I export type annotations in accordance with PEP 561; clients will benefit from the type annotations in this library.

  • I have unit tests with >95% coverage.

  • I use pylint with few disabled warnings.

  • All of the above methods are incorporated into per-commit continuous testing and are required for merging with the main branch; this way they won’t be easily forgotten.
