A simple file-based caching system using hash-based file names

Project description

FileHashCache

FileHashCache is a dead simple, file-based caching library for Python that uses a hashed directory structure and compressed JSON contents for efficient storage and retrieval of cached data.

Features

  • Simple dictionary-like interface
  • File-based storage for persistence
  • Hashed directory structure for efficient file organization
  • Compressed JSON storage for space efficiency
  • Supports any JSON-serializable Python object

Installation

You can install FileHashCache using pip:

pip install filehashcache

Usage

Here's a quick example of how to use FileHashCache:

from filehashcache import FileHashCache

# Create a cache instance
cache = FileHashCache(".cache")

# Store a value
cache["my_key"] = {"name": "John", "age": 30}

# Retrieve a value
data = cache["my_key"]
print(data)  # Output: {'name': 'John', 'age': 30}

# Check if a key exists
if "my_key" in cache:
    print("Key exists!")

# Get a value with a default
value = cache.get("non_existent_key", "default_value")
print(value)  # Output: default_value

# Clear the cache
cache.clear()

# Get the number of items in the cache
print(len(cache))  # Output: 0

API

FileHashCache(root_dir: str = ".cache")

Create a new FileHashCache instance.

  • root_dir: The root directory for storing cached files (default: ".cache")

Methods

  • __setitem__(key: str, value: Any): Set an item in the cache
  • __getitem__(key: str) -> Any: Get an item from the cache
  • __contains__(key: str) -> bool: Check if a key exists in the cache
  • get(key: str, default: Any = None) -> Any: Get an item with a default value
  • clear() -> None: Clear all items from the cache
  • __len__() -> int: Return the number of items in the cache
  • __iter__(): Iterate over all keys in the cache
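The methods above are enough to build a small "compute on miss" helper. The sketch below is illustrative only: `get_or_compute` and `expensive` are hypothetical names, not part of FileHashCache, and a plain `dict` stands in for the cache so the example runs without the package installed. Any object offering `__contains__`, `__getitem__`, and `__setitem__` should work the same way.

```python
from typing import Any, Callable


def get_or_compute(cache: Any, key: str, compute: Callable[[], Any]) -> Any:
    """Return the cached value for key, computing and storing it on a miss."""
    if key in cache:        # uses __contains__
        return cache[key]   # uses __getitem__
    value = compute()
    cache[key] = value      # uses __setitem__
    return value


calls = []  # records how many times the expensive function actually runs


def expensive() -> dict:
    calls.append(1)
    return {"name": "John", "age": 30}


cache = {}  # stand-in; with the library this would be FileHashCache(".cache")
first = get_or_compute(cache, "my_key", expensive)
second = get_or_compute(cache, "my_key", expensive)
print(len(calls))  # 1: the second call is served from the cache
```

Because values are stored as JSON, whatever `compute` returns needs to be JSON-serializable when a real FileHashCache is used.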

How it works

FileHashCache uses a two-level directory structure based on the MD5 hash of the cache key. This helps distribute files evenly across directories, improving performance for large numbers of cached items.
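A two-level hashed layout can be sketched as follows. This is not the library's actual code: the function name `cache_path` is hypothetical, and the choice of two hex characters per directory level and of the full digest as the file name are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def cache_path(root_dir: str, key: str) -> Path:
    """Map a cache key to a file path under a two-level hashed directory tree."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()  # 32 hex characters
    # Assumption: two hex characters per level; the real split may differ.
    return Path(root_dir) / digest[:2] / digest[2:4] / digest


path = cache_path(".cache", "my_key")
print(path)
```

With two hex characters per level, each level has at most 256 subdirectories, so files spread across up to 65,536 leaf directories instead of piling up in one.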

Cached values are serialized to JSON, compressed using zlib, and encoded with base64 before being stored on disk. This process is reversed when retrieving cached items.
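The pipeline described above (JSON, then zlib, then base64, and the reverse on read) can be sketched with the standard library alone. The function names `encode_value` and `decode_value` are hypothetical, and details such as the text encoding and compression level are assumptions rather than the library's confirmed behaviour.

```python
import base64
import json
import zlib
from typing import Any


def encode_value(value: Any) -> bytes:
    """Serialize to JSON, compress with zlib, then base64-encode."""
    raw = json.dumps(value).encode("utf-8")
    return base64.b64encode(zlib.compress(raw))


def decode_value(blob: bytes) -> Any:
    """Reverse the steps: base64-decode, decompress, parse JSON."""
    raw = zlib.decompress(base64.b64decode(blob))
    return json.loads(raw.decode("utf-8"))


original = {"name": "John", "age": 30}
blob = encode_value(original)
restored = decode_value(blob)
print(restored == original)  # True
```

Note that a JSON round trip only preserves JSON types: a tuple, for example, comes back as a list.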

Performance

In a sample test, FileHashCache reduced the stored size by about one third:

  • Raw size: 258.55 MB
  • Cached size: 172.37 MB
  • Compression ratio (cached size / raw size): 66.67%
  • Space saved: 86.18 MB

This shows that FileHashCache can reduce storage requirements for cached data. The figures above measure size only, not access speed.
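The reported percentages follow directly from the two sizes. The short check below uses only the numbers quoted above; the variable names are illustrative.

```python
raw_mb = 258.55
cached_mb = 172.37

saved_mb = raw_mb - cached_mb              # absolute space saved
cached_share = cached_mb / raw_mb * 100    # cached size as a share of raw size
reduction = saved_mb / raw_mb * 100        # relative reduction

print(f"Space saved: {saved_mb:.2f} MB")     # Space saved: 86.18 MB
print(f"Cached / raw: {cached_share:.2f}%")  # Cached / raw: 66.67%
print(f"Reduction: {reduction:.2f}%")        # Reduction: 33.33%
```

So the 66.67% figure is the cached size relative to the raw size, which corresponds to a reduction of 33.33%.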

Running tests

To run the tests, use the following command:

pytest

License

This project is licensed under the GNU License.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
