
File Cache

Directory-level cache with pluggable eviction policies and storage backends.

Why

Large-scale engineering simulations produce files that are expensive to parse and process. When a server extracts visualization data or structural metadata from these files, caching the results on disk avoids redundant computation — especially when multiple users access the same data. The extracted data consists of large numerical arrays that are not practical to store in a database, so a directory-level file cache is the natural fit. A pluggable storage backend allows the same cache to work on local disk (on-premise) or S3 (cloud).

Overview

vcti-file-cache manages a collection of cached model directories. Each directory is tracked via lightweight metadata (access count and size). A background thread periodically computes directory sizes and evicts directories when the total exceeds a configured threshold, using a pluggable eviction policy (LRU, LFU, or size-weighted LRU).
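The background cycle described above can be sketched roughly as follows. This is a standalone illustration, not the library's actual code: `CacheEntry`, `evict_cycle`, and the field names are assumptions made for the sketch.

```python
from dataclasses import dataclass

@dataclass
class CacheEntry:
    path: str
    size: int          # bytes, recomputed on each scan
    last_access: float
    access_count: int

def evict_cycle(entries, evict_above, key):
    """One eviction pass: remove directories in policy order until
    the total size is back under the threshold."""
    total = sum(e.size for e in entries)
    evicted = []
    for entry in sorted(entries, key=key):
        if total <= evict_above:
            break
        entries.remove(entry)
        total -= entry.size
        evicted.append(entry.path)
    return evicted

# LRU ordering: oldest access time first
entries = [
    CacheEntry("a", size=30, last_access=3.0, access_count=5),
    CacheEntry("b", size=40, last_access=1.0, access_count=1),
    CacheEntry("c", size=50, last_access=2.0, access_count=9),
]
evicted = evict_cycle(entries, evict_above=60, key=lambda e: e.last_access)
```

With a 60-byte budget and 120 bytes cached, the two least recently used directories are dropped and the newest survives.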

The cache and model directories are accessed through context managers that handle directory creation, access tracking, and cleanup automatically. Storage is abstracted behind a CacheStorage protocol — the built-in LocalStorage backend uses the local filesystem; other backends (e.g. S3) can be added by implementing the same protocol. No external dependencies.

Installation

pip install "vcti-file-cache>=1.0.0"

Quick Start

from pathlib import Path
from vcti.filecache import FileCache, CacheManagerConfig

config = CacheManagerConfig(evict_above=50 * 1024**3)  # 50 GB

# root= is shorthand for LocalStorage
with FileCache(root=Path("/data/cache"), config=config) as cache:

    # model() returns a context manager — creates the directory,
    # tracks access count and timestamps automatically
    with cache.model("ab/cd/model_abc123") as model:
        save_mesh(Path(model.path) / "mesh.vtu")
        save_result(Path(model.path) / "result.h5")

    # Same call for read access — access time updates automatically
    with cache.model("ab/cd/model_abc123") as model:
        data = load(Path(model.path) / "result.h5")

    # Manual eviction (background thread also runs automatically)
    cache.evict()

Explicit storage backend

from pathlib import Path
from vcti.filecache import FileCache, CacheManagerConfig, LocalStorage

storage = LocalStorage(root=Path("/data/cache"))
config = CacheManagerConfig(evict_above=50 * 1024**3)

with FileCache(storage=storage, config=config) as cache:
    ...

Custom eviction policy

from pathlib import Path
from vcti.filecache import FileCache, CacheManagerConfig, LfuPolicy

config = CacheManagerConfig(
    evict_above=50 * 1024**3,
    policy=LfuPolicy(),   # evict least frequently accessed first
    interval=120.0,        # background cycle every 2 minutes
)

with FileCache(root=Path("/data/cache"), config=config) as cache:
    ...

Observer for lifecycle events

from pathlib import Path
from vcti.filecache import LoggingObserver, CacheManagerConfig, FileCache

# Built-in LoggingObserver uses stdlib logging
config = CacheManagerConfig(
    evict_above=50 * 1024**3,
    observer=LoggingObserver(),
)

with FileCache(root=Path("/data/cache"), config=config) as cache:
    ...

Or subclass CacheObserver for custom behavior:

from vcti.filecache import CacheObserver

class MyObserver(CacheObserver):
    def on_evict(self, path: str) -> None:
        print(f"Evicted: {path}")

    def on_error(self, error: Exception, context: str) -> None:
        print(f"Error in {context}: {error}")

Core API

FileCache

Parameters:

  config    CacheManagerConfig (required)
  storage   CacheStorage backend (or use the root shorthand)
  root      Path for local storage (creates a LocalStorage internally)

Methods and properties:

  model(path)     Context manager for a model directory
  entries()       Metadata for all cached model directories
  snapshot()      Scan once and return a CacheSnapshot with all properties
  evict(policy)   Manually evict directories until the cache is within budget
  storage         The CacheStorage backend in use
  total_size      Sum of all entry sizes (triggers a scan)
  entry_count     Number of cached model directories (triggers a scan)
  is_over_limit   Whether the total size exceeds the eviction threshold (triggers a scan)
  overage         Bytes over the threshold (triggers a scan)

CacheManagerConfig

  Attribute     Default       Description
  evict_above   (required)    Size threshold in bytes for eviction
  policy        LruPolicy()   Eviction policy
  interval      60.0          Seconds between background cycles
  observer      None          CacheObserver for lifecycle events

CacheStorage

Protocol for storage backends. Implement it to add a new backend; LocalStorage is the built-in implementation for the local filesystem.
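This README does not spell out the protocol's method surface, so the sketch below assumes a minimal one (`list_dirs`, `dir_size`, `remove_dir`) purely for illustration; the real CacheStorage protocol may differ. It shows the shape of a custom backend, here a toy in-memory one:

```python
class InMemoryStorage:
    """Toy backend keeping 'directories' as dicts of filename -> bytes.
    The three methods mirror an assumed CacheStorage surface."""

    def __init__(self):
        self._dirs: dict[str, dict[str, bytes]] = {}

    def list_dirs(self) -> list[str]:
        # Every cached model directory currently present
        return list(self._dirs)

    def dir_size(self, path: str) -> int:
        # Total bytes under one model directory
        return sum(len(data) for data in self._dirs.get(path, {}).values())

    def remove_dir(self, path: str) -> None:
        # Drop a directory and its contents (no-op if absent)
        self._dirs.pop(path, None)

    # Helper for the demo, not part of the assumed protocol
    def put(self, path: str, name: str, data: bytes) -> None:
        self._dirs.setdefault(path, {})[name] = data

storage = InMemoryStorage()
storage.put("ab/cd/model_abc123", "mesh.vtu", b"\x00" * 100)
```

An S3 backend would implement the same surface with bucket listings and object deletions instead of dict operations.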

ModelDir

  Property   Description
  path       Full path or URI to the model directory
  exists     Whether the directory exists

Eviction Policies

  Policy           Strategy
  LruPolicy        Oldest access time first
  LfuPolicy        Least frequently accessed first
  SizedLruPolicy   Oldest and largest first, for faster space recovery
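The size-weighted variant can be illustrated with a sort key. This is a standalone sketch, not the library's implementation: `Entry` and `sized_lru_order` are hypothetical names, and the exact weighting SizedLruPolicy uses may differ.

```python
from dataclasses import dataclass

@dataclass
class Entry:  # hypothetical stand-in for the cache's entry metadata
    path: str
    size: int
    last_access: float

def sized_lru_order(entries):
    """Eviction order resembling a size-weighted LRU: oldest access
    first, and among equally old entries the largest first, so the
    space budget is recovered in fewer evictions."""
    return sorted(entries, key=lambda e: (e.last_access, -e.size))

entries = [
    Entry("small_old", size=10, last_access=1.0),
    Entry("big_old",   size=90, last_access=1.0),
    Entry("new",       size=50, last_access=9.0),
]
order = [e.path for e in sized_lru_order(entries)]
```

Of the two equally old entries, the 90-byte one is evicted before the 10-byte one; the recently accessed entry goes last.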

CacheManager

Starts automatically when a FileCache enters its context. It runs as a daemon thread that periodically scans model directories, updates their sizes, and evicts entries when the total size exceeds the threshold. One manager exists per storage root, so multiple FileCache instances sharing the same storage share a single manager.
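The per-root singleton pattern might look roughly like this. A hedged sketch only: the class name matches the section, but `for_root`, the registry, and the refcount are assumptions about how such sharing could be implemented.

```python
import threading

class CacheManager:
    """One manager per storage root: every caller asking for the same
    root gets the same instance (illustrative sketch)."""

    _instances: dict[str, "CacheManager"] = {}
    _lock = threading.Lock()

    def __init__(self, root: str):
        self.root = root
        self.refcount = 0  # how many FileCache contexts share this manager

    @classmethod
    def for_root(cls, root: str) -> "CacheManager":
        # Lock so two threads entering contexts concurrently
        # cannot create two managers for the same root.
        with cls._lock:
            mgr = cls._instances.get(root)
            if mgr is None:
                mgr = cls._instances[root] = cls(root)
            mgr.refcount += 1
            return mgr

a = CacheManager.for_root("/data/cache")
b = CacheManager.for_root("/data/cache")
```

Both lookups return the identical object, so only one background thread ever scans a given root.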


Dependencies

None. Standard library only.
