filememo

File-based memoization decorator. Caches the results of expensive function calls. Retains the cached results between program restarts.

CI tests run on Python 3.8, 3.9 and 3.10 on macOS, Ubuntu and Windows.


A function can be expensive because it is slow, uses a lot of system resources, or literally makes a request to a paid API.

The memoize decorator returns the cached result when the same function is called with the same arguments. Thus, the function is expensive only once and inexpensive thereafter.

For example, the simplest cache for downloaded data can be set like this:

import requests
from filememo import memoize

@memoize
def downloaded(url):
    return requests.get(url)

downloaded("http://example.net/aaa")  # downloads data
downloaded("http://example.net/bbb")  # downloads data
downloaded("http://example.net/aaa")  # gets data from cache

Data is saved to the file system using pickledir. Even after the program restarts, the cached results remain available.

# gets data from cache after restart
downloaded("http://example.net/aaa")     
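To make the idea concrete, here is a minimal sketch of how a file-based memoization decorator could work in principle. It is not filememo's implementation: the name toy_memoize, the SHA-256 file naming and the one-pickle-file-per-call layout are all assumptions made for illustration, and only the standard library is used.

```python
import functools
import hashlib
import pickle
import tempfile
from pathlib import Path


def toy_memoize(dir_path):
    """Hypothetical stand-in for a file-based memoize decorator (not filememo's code)."""
    cache_dir = Path(dir_path)
    cache_dir.mkdir(parents=True, exist_ok=True)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Derive a file name from the function name and the pickled arguments.
            raw_key = pickle.dumps((func.__qualname__, args, tuple(kwargs.items())))
            path = cache_dir / (hashlib.sha256(raw_key).hexdigest() + ".pickle")
            if path.exists():
                # Cache hit: read the stored result instead of calling func.
                return pickle.loads(path.read_bytes())
            result = func(*args, **kwargs)
            path.write_bytes(pickle.dumps(result))
            return result

        return wrapper

    return decorator


calls = []


@toy_memoize(tempfile.mkdtemp())
def slow_square(x):
    calls.append(x)  # record every real execution of the body
    return x * x


first = slow_square(4)   # executes the function body
second = slow_square(4)  # served from the pickle file
```

Because the result lives in a file rather than in process memory, a second process pointed at the same directory would find it as well, which is the property the decorator above relies on.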

Install

$ pip3 install filememo

Use

from filememo import memoize

@memoize
def long_running_function(a, b, c):
    return compute()

# the following line actually computes the value only
# when the program runs for the first time. On subsequent 
# runs, the value is read from the file
x = long_running_function(1, 2, 3)

Function arguments

The results depend on both the function and its arguments. Each combination of function and arguments is cached separately.

@memoize
def that_function(a, b, c):
    return compute(a, b, c)

@memoize
def other_function(a, b):
    return compute(a, b)

# the following calls will cache three different values 
y1 = that_function(1, 2, 3)  
y2 = that_function(30, 20, 40)
y3 = other_function(1, 2)

# how the arguments are passed (positionally or by keyword) matters,
# as does their order. Therefore, the following calls are cached
# as three different ones
y4 = other_function(1, b=2)
y5 = other_function(a=1, b=2)
y6 = other_function(b=2, a=1)
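The sketch below shows why three call styles can end up as three cache entries, and one way to avoid that. The key layout in naive_key is an assumption for illustration (filememo's actual key derivation is not described here), and normalize_call is a hypothetical helper built on inspect.signature from the standard library.

```python
import inspect


def naive_key(args, kwargs):
    # Keeps positional and keyword arguments apart and preserves keyword order.
    return (args, tuple(kwargs.items()))


def other_function(a, b):
    return a + b


k4 = naive_key((1,), {"b": 2})
k5 = naive_key((), {"a": 1, "b": 2})
k6 = naive_key((), {"b": 2, "a": 1})
distinct_keys = len({k4, k5, k6})  # three call styles, three different keys


def normalize_call(func, *args, **kwargs):
    """Hypothetical helper: map any call style onto the function's parameter order."""
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()
    return bound.args, bound.kwargs


n4 = naive_key(*normalize_call(other_function, 1, b=2))
n5 = naive_key(*normalize_call(other_function, a=1, b=2))
n6 = naive_key(*normalize_call(other_function, b=2, a=1))
```

After normalization all three calls produce the same key. In practice, the simplest way to get the same effect is to call a memoized function in one consistent style.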

Cache directory

If dir_path is not specified, the cached data is stored in the directory returned by tempfile.gettempdir(). However, a cache stored there is likely not to survive a reboot. It is also possible that the system has no usable temporary directory, in which case the current working directory is used instead.

To better control the situation, you can set a specific directory for storing caches.

@memoize(dir_path='/var/tmp/myfuncs')
def function(a, b):
    return a+b
    
# it's ok if different functions share the same directory    
@memoize(dir_path='/var/tmp/myfuncs')
def other_func():
    return compute()
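One way to choose such a directory is sketched below. The helper persistent_cache_dir and the ~/.cache default are assumptions for illustration and are not part of filememo; only tempfile.gettempdir() and pathlib from the standard library are used.

```python
import tempfile
from pathlib import Path


def persistent_cache_dir(name, base=None):
    """Hypothetical helper: return (and create) a cache directory under `base`."""
    # Assumption: ~/.cache is an acceptable default location on this system.
    root = Path(base) if base is not None else Path.home() / ".cache"
    path = root / name
    path.mkdir(parents=True, exist_ok=True)
    return path


# The directory used when dir_path is omitted, according to the text above.
default_location = tempfile.gettempdir()

# For the demo, a throwaway base directory is used instead of the home directory.
demo_dir = persistent_cache_dir("myfuncs", base=tempfile.mkdtemp())
```

The resulting path could then be passed as dir_path, for example dir_path=str(persistent_cache_dir("myfuncs")).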

Expiration date

The max_age argument sets two conditions at once:

  • If the result is not yet in the cache (and is being added now), it will live in the cache no longer than max_age. After that it is deleted automatically.
  • If the result is already in the cache, it is used only if its age is less than max_age. Otherwise, the function is run again and the stored result is replaced with the new one.

import datetime
from filememo import memoize

@memoize(max_age=datetime.timedelta(minutes=5))
def function(a, b):
    return compute()
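The age check described above can be sketched with plain datetime arithmetic. The function is_fresh is hypothetical and only mirrors the rule "use the entry if its age is less than max_age"; treating max_age=None as "never expires" is an assumption.

```python
import datetime


def is_fresh(created_at, max_age, now):
    """Return True while a cached entry is younger than max_age."""
    if max_age is None:
        return True  # assumption: no limit means the entry never expires
    return now - created_at < max_age


max_age = datetime.timedelta(minutes=5)
stored = datetime.datetime(2024, 1, 1, 12, 0, 0)

# Four minutes after storing, the entry is still usable.
fresh = is_fresh(stored, max_age, stored + datetime.timedelta(minutes=4))
# Six minutes after storing, the function would be run again.
stale = is_fresh(stored, max_age, stored + datetime.timedelta(minutes=6))
```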

Data version

When you specify version, cached results stored under any other version are considered outdated.

Say you have the following function:

@memoize(version=1)
def function(a, b):
    return a + b

You changed your mind, and now the function should return the product of numbers instead of the sum. But the cache already contains the previous results with the sums. In this case, you can just change version. Previous results will not be returned.

@memoize(version=2)
def function(a, b):
    return a * b

Note that all versions other than the current one are considered obsolete, regardless of whether their value is greater or smaller. If you used version=10 and then started using version=9, then 9 is considered current and 10 is obsolete.
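The rule amounts to an equality check rather than an ordering. A minimal sketch, assuming a hypothetical in-memory cache whose entries carry the version they were stored under (this is not filememo's storage format):

```python
def lookup(cache, key, current_version):
    """Return the cached value only if it was stored under the current version."""
    entry = cache.get(key)
    if entry is None or entry["version"] != current_version:
        return None  # missing or stored under another version: treat as a miss
    return entry["value"]


# One entry stored while the decorator was configured with version=10.
cache = {("function", 2, 3): {"version": 10, "value": 5}}

hit = lookup(cache, ("function", 2, 3), current_version=10)
miss_lower = lookup(cache, ("function", 2, 3), current_version=9)
miss_higher = lookup(cache, ("function", 2, 3), current_version=11)
```

Both a lower and a higher current version miss, so the version value works as a label, not as a counter.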

Exceptions

If the decorated function raises an exception, the error is considered permanent. The exception is stored in the cache and is raised again on every subsequent call with the same arguments.

from filememo import memoize, FunctionException

@memoize
def divide(a, b):
    return a / b

try:
    # trying to run the function for the first time
    divide(1, 0)
except FunctionException as e:
    print(f"Error: {e.inner}")      

try:
    # not actually running again, getting error from cache
    divide(1, 0)
except FunctionException as e:
    print(f"Cached error: {e.inner}")      
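The behaviour can be imitated with a small in-memory sketch. ToyFunctionException and toy_memoize_errors are hypothetical names; they only mimic the pattern shown above (a wrapper exception exposing the original error as inner) and say nothing about how filememo stores exceptions on disk.

```python
import functools


class ToyFunctionException(Exception):
    """Hypothetical wrapper; the original error is kept in `inner`."""

    def __init__(self, inner):
        super().__init__(repr(inner))
        self.inner = inner


def toy_memoize_errors(func):
    outcomes = {}  # args -> ("ok", value) or ("error", exception)

    @functools.wraps(func)
    def wrapper(*args):
        if args not in outcomes:
            try:
                outcomes[args] = ("ok", func(*args))
            except Exception as error:
                # Remember the failure so the function is not run again.
                outcomes[args] = ("error", error)
        kind, payload = outcomes[args]
        if kind == "error":
            raise ToyFunctionException(payload)
        return payload

    return wrapper


executions = []


@toy_memoize_errors
def divide(a, b):
    executions.append((a, b))  # record every real execution
    return a / b


errors = []
for _ in range(2):
    try:
        divide(1, 0)
    except ToyFunctionException as e:
        errors.append(type(e.inner).__name__)
```

Both attempts report the same error, but the function body runs only once.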

Setting exceptions_max_age=None prevents exceptions from being cached. Each error is then treated as a one-time error.

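import time
from filememo import memoize, FunctionException

@memoize(exceptions_max_age=None)
def download(url):
    return http_get(url)

while True:
    try:
        download('http://sample.net/path')
        break
    except FunctionException:
        # the error was not cached, so the next call retries the download
        time.sleep(1)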

You can also set the expiration time for cached exceptions. It may differ from the caching time of the data itself.

# keep downloaded data for a day, remember connection errors for 5 minutes

import datetime

@memoize(max_age=datetime.timedelta(days=1),
         exceptions_max_age=datetime.timedelta(minutes=5))
def download(url):
    return http_get(url)

In-memory caching

Each call to a function decorated with @memoize results in I/O operations. If performance is your absolute priority, even reading from the disk cache can be considered expensive. filememo does not cache the data it reads in memory, but this is easy to achieve by combining decorators.

from functools import lru_cache
from filememo import memoize

@lru_cache
@memoize
def too_expensive():
    return compute()

In this example, the filememo disk cache will be used to store the results between program runs, while the functools RAM cache will store the results between function calls.

If the data is already in the disk cache and the program has just started, then calling too_expensive() for the first time will read the result from disk. Further calls to too_expensive() will return the result from memory.
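The layering can be checked without touching the disk. In the sketch below, fake_disk_cache is a hypothetical stand-in for a disk-backed decorator that merely counts how often it is consulted; lru_cache is the real functools decorator.

```python
from functools import lru_cache

disk_reads = []


def fake_disk_cache(func):
    """Hypothetical stand-in for a disk-backed decorator; counts every lookup."""

    def wrapper(*args):
        disk_reads.append(args)  # pretend this is a file read
        return func(*args)

    return wrapper


@lru_cache(maxsize=None)
@fake_disk_cache
def too_expensive(n):
    return n * 2


# Only the first call reaches the inner layer; the rest are answered from memory.
results = [too_expensive(21) for _ in range(3)]
```

The order of the decorators matters: lru_cache must be the outer one so that repeated calls never reach the disk layer. Note that lru_cache requires hashable arguments and does not cache raised exceptions.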
