
Easily stash the results of expensive functions to disk


Scrat

Persistent Caching of Expensive Function Results

🐿️

Get Started

  1. Install with pip install scrat
  2. Initialize the stash with scrat init
  3. Start saving time:
import scrat
import time

@scrat.stash()
def expensive_function(param_1):
    time.sleep(3)
    return param_1

expensive_function(1)  # <- function called
expensive_function(1)  # <- function not called; the result is recovered from the stash
expensive_function(2)  # <- function called again because the parameter changed

Features

  • Seamlessly stores the results of expensive functions to disk for future reuse.
  • Automatically re-evaluates the function if the parameters or function code have changed, ensuring up-to-date results.
  • Saves any result using pickle.
  • Improved storage of pandas DataFrames, Series, and NumPy arrays.
  • Customizable support for alternative serializers.
  • Flexible parameter hashing mechanism to efficiently handle any parameter type.
  • Command-line interface (CLI) for convenient control and management of the caching functionality.
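
To make the first three features concrete, here is a minimal sketch of the general idea, not Scrat's actual implementation. The decorator name `disk_stash` and the helper `slow_square` are hypothetical. The sketch assumes that hashing the function's bytecode is a good enough stand-in for "the function code has changed"; that is a simplification, since it would miss, for example, a change to a constant.

```python
import functools
import hashlib
import pickle
import tempfile
from pathlib import Path


def disk_stash(folder):
    """Hypothetical decorator: cache results as pickle files inside `folder`."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # The key covers the function's bytecode and its arguments, so a
            # change in either one leads to a different file name.
            payload = pickle.dumps(
                (func.__qualname__, func.__code__.co_code, args, sorted(kwargs.items()))
            )
            path = folder / (hashlib.sha256(payload).hexdigest() + ".pkl")
            if path.exists():
                # Cache hit: load the stored result instead of calling func.
                return pickle.loads(path.read_bytes())
            result = func(*args, **kwargs)
            path.write_bytes(pickle.dumps(result))
            return result

        return wrapper

    return decorator


calls = []  # records every real execution of the function


@disk_stash(tempfile.mkdtemp())
def slow_square(x):
    calls.append(x)
    return x * x


first = slow_square(4)   # computed and written to disk
second = slow_square(4)  # read back from disk
third = slow_square(5)   # new argument, computed again
```

Because the results live in files rather than in process memory, a second run of the same script pointed at the same folder would find them again.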

Similar Projects

lru_cache

A great, fast memoization decorator provided by the standard library's functools. Unfortunately, results are stored in memory, so they can't be reused across different runs.
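
A small illustration of that limitation, using only the standard library. The function `slow_double` is a made-up example, and `cache_clear()` stands in for the process ending, which is an approximation of what a fresh run would see.

```python
from functools import lru_cache

calls = []  # records every real execution of the function


@lru_cache(maxsize=None)
def slow_double(x):
    calls.append(x)
    return 2 * x


slow_double(3)                   # computed
slow_double(3)                   # served from the in-memory cache
info = slow_double.cache_info()  # snapshot of hits and misses so far

slow_double.cache_clear()        # stand-in for the process ending
slow_double(3)                   # computed again, nothing was persisted
```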

cachetools

Provides alternatives to lru_cache, but it also works only in memory.

Joblib

Joblib is an established library that provides great functionality for parallelization and caching. Its Memory module provides an excellent alternative to Scrat, but it does have some limitations:

  • Hard to avoid using pickle
  • Lack of options to control the cache size and policies
  • Lack of tools to inspect and cleanup the cache

These are the problems that Scrat aims to improve. However, I'd recommend using Joblib in production, since it's much more mature than Scrat at the moment.

Concepts

  • Scrat is a famous prehistoric squirrel with some bad luck
  • The Stash is composed of a folder where results are saved and a database to index them
  • A Nut is one of the entries in the database
  • The Squirrel is in charge of fetching and stashing the Nuts
  • The Serializer dumps results to files and loads them back into memory
  • The Hasher creates a unique hash for a parameter value
  • The HashManager coordinates the hashes of all arguments and the function code
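
The split between hashing one value and combining the hashes of a whole call can be sketched as follows. This is an illustration of the concepts only. The classes `ReprHasher`, `SortedDictHasher` and `SimpleHashManager` and the function `train` are hypothetical and are not Scrat's classes or API.

```python
import hashlib
import inspect


def _digest(text):
    return hashlib.sha256(text.encode()).hexdigest()


class ReprHasher:
    """Hypothetical hasher: hashes the repr() of a value."""

    def hash(self, value):
        return _digest(repr(value))


class SortedDictHasher:
    """Hypothetical hasher for dicts that ignores key insertion order."""

    def hash(self, value):
        return _digest(repr(sorted(value.items(), key=repr)))


class SimpleHashManager:
    """Combines one hash per argument with a hash of the function's bytecode."""

    def __init__(self, hashers=None):
        self.hashers = hashers or {}  # per-parameter overrides
        self.default = ReprHasher()

    def hash(self, func, args, kwargs):
        # Bind to the signature so positional and keyword calls look alike.
        bound = inspect.signature(func).bind(*args, **kwargs)
        bound.apply_defaults()
        parts = [hashlib.sha256(func.__code__.co_code).hexdigest()]
        for name, value in bound.arguments.items():
            hasher = self.hashers.get(name, self.default)
            parts.append(name + "=" + hasher.hash(value))
        return _digest("|".join(parts))


def train(data, options=None):
    return len(data)


manager = SimpleHashManager(hashers={"options": SortedDictHasher()})
h1 = manager.hash(train, ([1, 2],), {"options": {"a": 1, "b": 2}})
h2 = manager.hash(train, (), {"data": [1, 2], "options": {"b": 2, "a": 1}})
h3 = manager.hash(train, ([1, 2, 3],), {"options": {"a": 1, "b": 2}})
```

Here `h1` and `h2` describe the same call written two ways and come out equal, while `h3` differs because `data` changed. Choosing a hasher per parameter is one way to handle types whose default representation is unstable or expensive to compute.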

Development Setup

  1. Clone this repo
  2. Install pyenv.
  3. Install the Python version used for development by running pyenv install in the root of this repository.
  4. Install poetry. Version 1.5.1 is recommended.
  5. Make sure poetry uses the right Python version by running poetry env use $(which python)
  6. Install the project and its dependencies with poetry install
  7. Run the tests with poetry run pytest, or activate the virtualenv with poetry shell and then run pytest

