Durable file-backed caching for JSON-like data with pluggable storage backends

PyperCache

A Python library providing durable file-backed caching for JSON-like data with pluggable storage backends (pickle, JSON, chunked manifest, SQLite), optional TTL and staleness semantics, read-only query navigation, and append-only request logging.

Installation

pip install pypercache

Or install from source:

git clone https://github.com/BrandonBahret/PyperCache.git
cd PyperCache
pip install .

Quick Start

See the full documentation, examples, and API reference on GitHub:

https://github.com/BrandonBahret/PyperCache/tree/master/docs

Features

  • Pluggable Backends: Choose storage by file extension (.pkl, .json, .manifest, .db)
  • TTL & Staleness: Optional expiry and acceptable staleness windows
  • Typed Objects: Decorate classes for automatic serialization/deserialization, including API key aliases
  • Query Navigation: Safe, read-only JSON path queries with filters
  • Request Logging: Thread-safe JSONL audit trails
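Choosing a backend by file extension can be pictured as a small lookup on the path suffix. The sketch below is only an illustration of that idea: the `BACKENDS` mapping and the `pick_backend` helper are hypothetical names, not part of PyperCache.

```python
from pathlib import Path

# Hypothetical mapping for illustration; the library's internal names may differ.
BACKENDS = {
    ".pkl": "pickle",
    ".json": "json",
    ".manifest": "chunked manifest",
    ".db": "sqlite",
}


def pick_backend(filepath: str) -> str:
    """Return a backend label for a cache file, based on its extension."""
    suffix = Path(filepath).suffix.lower()
    if suffix not in BACKENDS:
        raise ValueError(f"unsupported cache extension: {suffix!r}")
    return BACKENDS[suffix]


print(pick_backend("api-cache.db"))    # sqlite
print(pick_backend("api-cache.json"))  # json
```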

API Wrapper

pypercache.api_wrapper.ApiWrapper provides a sync-first base class for building small API clients on top of requests, Cache, and RequestLogger.

from pypercache.api_wrapper import ApiWrapper
from pypercache.models.apimodel import apimodel


@apimodel
class Widget:
    id: int
    name: str


class WidgetClient(ApiWrapper):
    def list_widgets(self) -> list[Widget]:
        return self.request("GET", "/widgets", expected="json", cast=list[Widget])

Testing

pytest

Example

The snippet below demonstrates every major feature in one pass: choosing a backend, TTL, typed objects, query navigation, and request logging.

from datetime import datetime
from typing import Annotated

from pypercache import Cache, RequestLogger
from pypercache.models.apimodel import Alias, Timestamp, apimodel
 
# ── 1. Backend is chosen by file extension ──────────────────────────────────
cache = Cache(filepath="api-cache.db")   # .pkl / .json / .manifest / .db
log   = RequestLogger("api_requests.log")
 
# ── 2. Define a typed model ──────────────────────────────────────────────────
@apimodel
class SearchResult:
    total: int
    next_page: Annotated[str | None, Alias("nextPage")]
    fetched_at: Annotated[datetime, Alias("fetchedAt"), Timestamp()]
    hits:  list
 
# ── 3. Fetch-or-cache pattern ────────────────────────────────────────────────
KEY = "search:v1:python"
 
if not cache.is_data_fresh(KEY):
    payload = {
        "total": 3,
        "nextPage": None,
        "fetchedAt": "2026-04-19T12:34:56Z",
        "hits": [
            {"name": "Alice", "role": "staff",  "score": 92},
            {"name": "Bob",   "role": "guest",  "score": 74},
            {"name": "Carol", "role": "staff",  "score": 88},
        ],
    }
    cache.store(KEY, payload, expiry=3600, cast=SearchResult)
    log.log(uri="/api/search?q=python", status=200)
 
# ── 4. Retrieve a typed object ───────────────────────────────────────────────
result: SearchResult = cache.get_object(KEY)  # SearchResult instance
print(result.total)                           # 3
print(result.next_page)                       # None
print(result.fetched_at.isoformat())          # 2026-04-19T12:34:56+00:00
 
# ── 5. Query without mutating the payload ───────────────────────────────────
q = cache.get(KEY).query
 
print(q.get("total"))                           # 3
print(q.get("hits?role=staff.name"))            # ['Alice', 'Carol']
print(q.get("hits?name*"))                      # ['Alice', 'Bob', 'Carol']
print(q.get("hits?role=staff", select_first=True)["name"])  # 'Alice'
 
# ── 6. Inspect the request log ───────────────────────────────────────────────
for entry in log.get_logs_from_last_seconds(60):
    print(entry.data["uri"], entry.data["status"])
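To make the idea of a thread-safe JSONL audit trail with time-window reads concrete, here is a minimal standard-library sketch. `JsonlLog` and its `since` method are hypothetical stand-ins written for this illustration; they are not the library's RequestLogger and may differ from it in record layout and behavior.

```python
import json
import os
import tempfile
import threading
import time


class JsonlLog:
    """Hypothetical stand-in for a request logger (not the library's class)."""

    def __init__(self, path):
        self.path = path
        self._lock = threading.Lock()

    def log(self, **data):
        # One JSON object per line, stamped with the time it was written.
        line = json.dumps({"ts": time.time(), "data": data})
        with self._lock:  # one writer at a time keeps lines intact
            with open(self.path, "a", encoding="utf-8") as fh:
                fh.write(line + "\n")

    def since(self, seconds):
        """Return entries written within the last `seconds` seconds."""
        cutoff = time.time() - seconds
        with self._lock, open(self.path, encoding="utf-8") as fh:
            entries = [json.loads(line) for line in fh if line.strip()]
        return [entry for entry in entries if entry["ts"] >= cutoff]


path = os.path.join(tempfile.mkdtemp(), "requests.log")
log = JsonlLog(path)
log.log(uri="/api/search?q=python", status=200)
for entry in log.since(60):
    print(entry["data"]["uri"], entry["data"]["status"])
```

Appending whole lines under a lock means a reader never sees a half-written record, which is the property the JSONL format relies on.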

Features

  • Four backends: .pkl, .json, .manifest, .db (SQLite with write-behind batching)
  • TTL & staleness: per-record expiry; is_data_fresh tells you whether to re-fetch
  • Typed round-trips: @Cache.cached / @apimodel + cast= on store; get_object() on retrieval
  • Query navigation: dotted paths, ?key=value filters, ?key* plucks, ?key existence, select_first, defaults; all in memory over the loaded record
  • Request logging: thread-safe JSONL audit trail with time-window reads
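The per-record expiry check behind a fetch-or-cache pattern can be sketched as a timestamp comparison. The `Record` class and its `is_fresh` method below are hypothetical and exist only to illustrate the TTL idea; the library's own record type and staleness rules may differ.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Record:
    """Hypothetical record: a payload plus the metadata a TTL check needs."""

    data: dict
    expiry: Optional[float] = None  # seconds; None means "never expires"
    stored_at: float = field(default_factory=time.time)

    def is_fresh(self, now: Optional[float] = None) -> bool:
        if self.expiry is None:
            return True
        now = time.time() if now is None else now
        return (now - self.stored_at) < self.expiry


record = Record({"total": 3}, expiry=3600)
print(record.is_fresh())  # True right after storing
```

Passing `now` explicitly keeps the check easy to test without waiting for real time to pass.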

Query navigation

record.query returns a JsonInjester — a lightweight, read-only selector language that runs in memory over the loaded payload. It never touches the storage backend.

q = cache.get("search:v1:python").query

You can also instantiate it directly over any dict:

from pypercache.query import JsonInjester
q = JsonInjester({"meta": {"total": 5, "page": 1}, "hits": [...]})

Path navigation

Dot-separated keys walk the dict. Returns UNSET if any key along the path is absent.

q.get("meta.total")          # 5
q.get("meta.page")           # 1
q.get("meta.missing")        # UNSET
q.has("meta.total")          # True  (shorthand for `get(...) is not UNSET`)
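The walk described above can be sketched in a few lines. This is an illustration of the technique, not the library's code: `walk` is a hypothetical helper, and the `UNSET` sentinel here is a local stand-in for the one the library exports.

```python
class _Unset:
    """Sentinel distinct from None, so a stored None still counts as a value."""

    def __repr__(self):
        return "UNSET"


UNSET = _Unset()  # local stand-in; the library exports its own UNSET


def walk(data, path):
    """Follow dot-separated keys through nested dicts, or return UNSET."""
    cursor = data
    for key in path.split("."):
        if not isinstance(cursor, dict) or key not in cursor:
            return UNSET
        cursor = cursor[key]
    return cursor


doc = {"meta": {"total": 5, "page": 1}}
print(walk(doc, "meta.total"))            # 5
print(walk(doc, "meta.missing"))          # UNSET
print(walk(doc, "meta.total") is not UNSET)  # True
```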

Keys containing hyphens or other non-identifier characters must be wrapped in double quotes inside the selector string:

q.get('"content-type".value')
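One way to see why quoting is needed is to look at how a selector might be split into keys. The sketch below is an assumption about the general approach, not the library's parser: `split_path` is a hypothetical helper that treats a double-quoted run as a single key, even if it contains dots or hyphens.

```python
import re

# A segment is either a double-quoted string or a run without dots or quotes.
_SEGMENT = re.compile(r'"([^"]*)"|([^."]+)')


def split_path(selector):
    """Split a dotted selector into keys, honoring double-quoted segments."""
    return [quoted or bare for quoted, bare in _SEGMENT.findall(selector)]


print(split_path('"content-type".value'))  # ['content-type', 'value']
print(split_path("meta.total"))            # ['meta', 'total']
print(split_path('"a.b".c'))               # ['a.b', 'c']
```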

?key=value — match filter

Returns every element in a list where the key equals the value. A tail path after the operator plucks a field from each matched element.

q.get("hits?role=staff")
# [{"name": "Alice", ...}, {"name": "Carol", ...}]
 
q.get("hits?role=staff.name")
# ["Alice", "Carol"]
 
q.get("hits?team.name=Engineering")
# all dicts where hits[i].team.name == "Engineering"

Prefix the value with # to match numbers instead of strings:

q.get("hits?score=#92")    # integer match
q.get("hits?ratio=#0.75")  # float match

No matches returns an empty list, not UNSET.
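The behavior of the match filter can be approximated in plain Python. The helpers below (`get_path`, `parse_value`, `match`) are hypothetical names for this sketch and take already-parsed arguments rather than a selector string; the `#` handling mirrors the description above but is not taken from the library's source.

```python
_MISSING = object()  # marks "key not present", distinct from a stored None


def get_path(item, path):
    """Follow a dotted path through nested dicts."""
    for key in path.split("."):
        if not isinstance(item, dict) or key not in item:
            return _MISSING
        item = item[key]
    return item


def parse_value(raw):
    """Treat '#92' or '#0.75' as a number, anything else as a string."""
    if raw.startswith("#"):
        number = raw[1:]
        return float(number) if "." in number else int(number)
    return raw


def match(items, key_path, raw_value, tail=None):
    """Keep items whose key_path equals the value; optionally pluck tail."""
    wanted = parse_value(raw_value)
    matched = [it for it in items if get_path(it, key_path) == wanted]
    if tail is None:
        return matched
    plucked = (get_path(it, tail) for it in matched)
    return [v for v in plucked if v is not _MISSING]


hits = [
    {"name": "Alice", "role": "staff", "score": 92},
    {"name": "Bob", "role": "guest", "score": 74},
    {"name": "Carol", "role": "staff", "score": 88},
]
print(match(hits, "role", "staff", tail="name"))  # ['Alice', 'Carol']
print(match(hits, "score", "#92", tail="name"))   # ['Alice']
print(match(hits, "role", "contractor"))          # []
```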

?key* — pluck

Extracts a field from every element in the list. Non-missing results are collected; missing ones are silently skipped. Plucks can be chained.

q.get("hits?name*")
# ["Alice", "Bob", "Carol"]
 
q.get("hits?team.name*")
# ["Engineering", "Marketing", "Engineering"]
 
q.get("hits?role*?label*")
# chained: pluck role objects, then pluck label from each

On a dict cursor (rather than a list), pluck navigates to the key and returns its value or UNSET.

?key — exists filter

Does not extract values. On a list cursor, returns only elements that contain the key. On a dict cursor, returns the cursor unchanged if the key is present, or UNSET if absent.

# list cursor — filter to elements that have a "team" key
q.get("hits?team")
 
# dict cursor — gate on key presence
q.get("meta?total")          # returns the meta dict (key exists)
q.get("meta?ghost")          # UNSET
q.get("meta?ghost", default_value=0)  # 0
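The two cursor cases of the exists filter can be sketched as follows. `exists` is a hypothetical helper and `UNSET` is a local stand-in for the library's sentinel; the sketch only mirrors the behavior described above.

```python
class _Unset:
    def __repr__(self):
        return "UNSET"


UNSET = _Unset()  # local stand-in; the library exports its own UNSET


def exists(cursor, key):
    """Dict cursor: gate on key presence. List cursor: keep items with key."""
    if isinstance(cursor, dict):
        return cursor if key in cursor else UNSET
    return [item for item in cursor if isinstance(item, dict) and key in item]


meta = {"total": 5}
hits = [{"name": "Alice", "team": "core"}, {"name": "Bob"}]

print(exists(hits, "team"))   # [{'name': 'Alice', 'team': 'core'}]
print(exists(meta, "total"))  # {'total': 5}
print(exists(meta, "ghost"))  # UNSET
```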

select_first and default_value

select_first=True unwraps the first element of a list result. Returns UNSET if the list is empty.

from pypercache.query.json_injester import UNSET
 
first = q.get("hits?role=staff", select_first=True)
print(first["name"])   # "Alice"
 
empty = q.get("hits?role=contractor", select_first=True)
print(empty is UNSET)  # True

default_value is returned when the path is missing or resolves to None. Falsy non-None values (False, 0, "") pass through unchanged.

q.get("meta.missing", default_value=0)   # 0
q.get("flags.debug", default_value=False) # False (returned as-is, not default)
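The rules for select_first and default_value can be summarized as a small post-processing step. `finalize` is a hypothetical helper, and the order in which the two options are applied here (unwrap first, then default) is an assumption made for this sketch.

```python
class _Unset:
    def __repr__(self):
        return "UNSET"


UNSET = _Unset()  # local stand-in; the library exports its own UNSET


def finalize(result, select_first=False, default_value=UNSET):
    """Apply select_first, then fall back to default_value if needed."""
    if select_first and isinstance(result, list):
        result = result[0] if result else UNSET
    # Only a missing path or None triggers the default; False, 0, "" pass through.
    if (result is UNSET or result is None) and default_value is not UNSET:
        return default_value
    return result


print(finalize([{"name": "Alice"}], select_first=True))  # {'name': 'Alice'}
print(finalize([], select_first=True))                   # UNSET
print(finalize(UNSET, default_value=0))                  # 0
print(finalize(False, default_value=True))               # False
```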

cast

When the result is a dict, cast passes it to the given type before returning.

q.get("hits?role=staff", select_first=True, cast=StaffMember)
# StaffMember instance
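As a sketch of what casting a dict result involves, the example below builds a dataclass from a matched dict. `StaffMember` is defined here only for illustration, `apply_cast` is a hypothetical helper, and unpacking the dict as keyword arguments is an assumption; the library may construct the type differently (for example, by passing the dict itself to an @apimodel class).

```python
from dataclasses import dataclass


@dataclass
class StaffMember:
    name: str
    role: str
    score: int


def apply_cast(result, cast=None):
    """Build `cast` from a dict result; leave other results untouched."""
    if cast is not None and isinstance(result, dict):
        return cast(**result)  # assumption: keys map directly to fields
    return result


first = {"name": "Alice", "role": "staff", "score": 92}
member = apply_cast(first, cast=StaffMember)
print(member.name, member.score)  # Alice 92
```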

Known limitations

JsonInjester is intentionally scoped and simple. A few things it does not do:

  • Integer list indexing: "hits.0.name" is not supported. Use a filter or pluck to reach list elements.
  • Cross-key queries: record.query operates on a single loaded payload. It does not scan multiple records or touch the backend.
  • Non-ASCII keys: unquoted non-ASCII key names raise a parse error. Wrap them in double quotes: '"héros".name'.

For the complete selector reference see QUERY.md.


Documentation

Topic                                               File
Cache, CacheRecord, TTL, typed objects              CACHE.md
ApiWrapper, ApiHTTPError, SSEEvent                  API_WRAPPER.md
@apimodel, aliases, timestamps                      APIMODEL.md
JsonInjester / record.query selector syntax         QUERY.md
Storage backends, RequestLogger, SQLite internals   STORAGE.md

Full docs and examples: https://github.com/BrandonBahret/PyperCache/tree/master/docs

License

MIT
