Manage, retrieve, and deduplicate simulation results

entropic

Simulation-agnostic run cache. Manage, retrieve, and deduplicate simulation results without caring about what your simulation does or how it stores data.

entropic handles the mapping from parameters to result file. It doesn't touch what's inside your result files; that's your business.

Install

pip install entropic

Requires Python 3.10+ and TinyDB (installed automatically).

Quickstart

from entropic import Store

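# file_suffix matches the runner's output format; np.save appends ".npy"
# to any path that does not already end with it
store = Store("./results", "./runs.json", file_suffix=".npy")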

# Define a runner: receives (params, result_path), writes results to result_path
def my_simulation(params, result_path):
    import numpy as np
    data = np.random.randn(params["n"], params["steps"])
    np.save(result_path, data)

# Run or retrieve from cache
record = store.run_or_retrieve(
    params={"n": 100, "steps": 5000, "dt": 0.01},
    runner=my_simulation,
)
print(record.result_path)   # ./results/1769854174.763568_a3f8c1d2e4b6f7a8.npy
print(record.params)         # {"n": 100, "steps": 5000, "dt": 0.01}
print(record.metadata)       # {"elapsed_seconds": 0.042}

# Second call with same params → instant cache hit, no re-run
record = store.run_or_retrieve(
    params={"n": 100, "steps": 5000, "dt": 0.01},
    runner=my_simulation,
)

Core API

Store

store = Store(
    results_dir="./results",     # where result files live
    db_path="./entropic.json",   # TinyDB metadata index
    file_suffix=".h5",           # extension for auto-generated filenames
    index=None,                  # custom IndexBackend (default: TinyDB)
)

store.run_or_retrieve(params, runner, **metadata) → RunRecord

The main workhorse. Returns a cached result if one exists for the given params, otherwise calls runner(params, result_path) and caches the result.

record = store.run_or_retrieve(
    params={"n": 50, "method": "rk4"},
    runner=my_sim,
    git_sha="abc123",  # optional metadata
)
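
The run-or-retrieve behaviour described above can be sketched without entropic at all. The following is a minimal illustration, not the library's implementation: `run_or_retrieve_sketch`, `toy_runner`, the plain-dict cache, and the `run_N.json` file naming are all hypothetical stand-ins.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Callable

Runner = Callable[[dict[str, Any], Path], None]


def run_or_retrieve_sketch(
    cache: dict[str, Path],
    results_dir: Path,
    params: dict[str, Any],
    runner: Runner,
) -> Path:
    """Return the cached result path for params, running the runner on a miss."""
    # Canonical key: sorted keys make the lookup independent of dict order
    key = json.dumps(params, sort_keys=True)
    if key in cache:
        return cache[key]  # cache hit: the runner is not called
    result_path = results_dir / f"run_{len(cache)}.json"  # hypothetical naming
    runner(params, result_path)
    cache[key] = result_path
    return result_path


calls: list[dict[str, Any]] = []


def toy_runner(params: dict[str, Any], result_path: Path) -> None:
    """Hypothetical runner: records the call and writes a tiny JSON result."""
    calls.append(params)
    result_path.write_text(json.dumps({"total": params["n"] * 2}))


cache: dict[str, Path] = {}
results_dir = Path(tempfile.mkdtemp())
first = run_or_retrieve_sketch(cache, results_dir, {"n": 50, "method": "rk4"}, toy_runner)
# Same parameters in a different key order: served from the cache
second = run_or_retrieve_sketch(cache, results_dir, {"method": "rk4", "n": 50}, toy_runner)
```

Here the second call returns the same path as the first and `toy_runner` runs only once.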

store.run(params, runner, **metadata) → RunRecord

Always runs the simulation, even if a cached result exists. Useful for re-running with the same parameters (e.g., stochastic simulations).

store.retrieve(params) → RunRecord | None

Look up a cached run by exact parameter match. Returns None on cache miss.

store.register(params, result_path, **metadata) → RunRecord

Manually register an externally-produced result file. Use this when you run simulations outside the library and want to index them for later retrieval.

store.register(
    params={"n": 50, "method": "euler"},
    result_path="./results/my_external_run.h5",
)

store.list(where=None) → list[RunRecord]

List all runs, optionally filtered by partial parameter match. This is how you query by a subset of parameters — e.g., all runs with a specific grid size regardless of other settings.

all_runs = store.list()
rk4_runs = store.list(where={"method": "rk4"})
specific = store.list(where={"method": "rk4", "n": 50})
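
To make "partial parameter match" concrete, here is a rough sketch of the filtering idea on plain dicts. The `matches` helper is hypothetical, and it assumes a run matches when every key in `where` is present with an equal value; the library's actual query goes through its index backend.

```python
from typing import Any, Optional


def matches(params: dict[str, Any], where: Optional[dict[str, Any]]) -> bool:
    """True if every key/value pair in `where` also appears in `params`."""
    if not where:
        return True  # no filter: everything matches
    return all(key in params and params[key] == value for key, value in where.items())


# Illustrative parameter sets standing in for stored runs
runs = [
    {"method": "rk4", "n": 50, "dt": 0.01},
    {"method": "rk4", "n": 100, "dt": 0.01},
    {"method": "euler", "n": 50, "dt": 0.1},
]

rk4_runs = [p for p in runs if matches(p, {"method": "rk4"})]
specific = [p for p in runs if matches(p, {"method": "rk4", "n": 50})]
everything = [p for p in runs if matches(p, None)]
```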

store.delete(params, remove_file=False) → bool

Delete a run record by exact parameter match. Optionally removes the result file from disk.

RunRecord

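Frozen dataclass returned by the Store methods that create or look up runs (list() returns a list of them, retrieve() may return None, and delete() returns a bool).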

record.params        # dict — the simulation parameters
record.result_path   # Path — path to the result file
record.params_hash   # str — 16-char hex hash of params
record.created_at    # str — ISO 8601 timestamp
record.metadata      # dict — user-defined extras (elapsed_seconds auto-added)
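
As a reminder of what "frozen" means in practice, the sketch below defines a stand-in dataclass with the same field names. `RunRecordSketch` and its example values are hypothetical and are not the library's class; the point is only that attributes of a frozen dataclass cannot be reassigned.

```python
from dataclasses import FrozenInstanceError, dataclass, field
from pathlib import Path
from typing import Any


@dataclass(frozen=True)
class RunRecordSketch:
    """Stand-in with the documented field names; not entropic's RunRecord."""

    params: dict[str, Any]
    result_path: Path
    params_hash: str
    created_at: str
    metadata: dict[str, Any] = field(default_factory=dict)


record = RunRecordSketch(
    params={"n": 50},
    result_path=Path("results/example.h5"),  # illustrative path
    params_hash="0123456789abcdef",          # illustrative 16-char hex string
    created_at="2024-01-01T00:00:00+00:00",  # illustrative ISO 8601 timestamp
)

try:
    record.params_hash = "changed"  # rejected: the dataclass is frozen
    reassigned = True
except FrozenInstanceError:
    reassigned = False
```

Note that freezing only blocks attribute reassignment. The dicts held in `params` and `metadata` are still ordinary mutable dicts.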

How it works

Parameters are stored as flat fields in a TinyDB JSON file, plus a deterministic SHA-256 hash for fast exact lookups. This gives you both:

  • O(1) exact match via retrieve() / run_or_retrieve() (hash lookup)
  • Flexible partial queries via list(where=...) (field-by-field TinyDB search)

Parameter hashing normalizes values before hashing: dict keys are sorted, floats are rounded to 12 digits (avoiding IEEE 754 noise), enums are converted to their .value, and everything is serialized to canonical JSON.
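
The normalization steps above might look roughly like the following. This is a sketch under stated assumptions, not the library's code: `normalize` and `params_hash` are hypothetical names, "12 digits" is read as 12 decimal places, and the 16-character result is assumed to be a truncated SHA-256 hex digest.

```python
import enum
import hashlib
import json
from typing import Any


def normalize(value: Any) -> Any:
    """Recursively convert a parameter value into a canonical, JSON-friendly form."""
    if isinstance(value, enum.Enum):
        return normalize(value.value)  # enums collapse to their .value
    if isinstance(value, float):
        return round(value, 12)  # assumption: 12 decimal places
    if isinstance(value, dict):
        return {str(k): normalize(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [normalize(v) for v in value]
    return value


def params_hash(params: dict[str, Any]) -> str:
    """Hash the canonical JSON form of params (keys sorted, compact separators)."""
    canonical = json.dumps(normalize(params), sort_keys=True, separators=(",", ":"))
    # Assumption: the 16-char hash is the first 16 hex characters of SHA-256
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


class Method(enum.Enum):
    RK4 = "rk4"


# Key order, floating-point noise, and enum-vs-value no longer matter
h1 = params_hash({"dt": 0.1 + 0.2, "method": Method.RK4})
h2 = params_hash({"method": "rk4", "dt": 0.3})
```

With this normalization, `0.1 + 0.2` and `0.3` hash identically, which is the kind of IEEE 754 noise the rounding is meant to absorb.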

Custom index backends

The default TinyDB backend works well for local workflows. For larger-scale use (remote databases, shared teams), implement the IndexBackend protocol:

from entropic import Store
from entropic.index import IndexBackend
from entropic.record import RunRecord

# Skeleton only: each method body must be implemented against your database
class PostgresIndex:
    def __init__(self, conn_string: str) -> None: ...
    def find_by_hash(self, params_hash: str) -> RunRecord | None: ...
    def find_by_params(self, params: dict) -> list[RunRecord]: ...
    def insert(self, record: RunRecord) -> None: ...
    def all(self) -> list[RunRecord]: ...
    def delete_by_hash(self, params_hash: str) -> bool: ...

store = Store("./results", index=PostgresIndex(conn_string="..."))
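
For a feel of what a complete backend involves, here is a dict-backed sketch that fills in the same five method names. It is illustrative only: `InMemoryIndex` and `RecordStandIn` are hypothetical, the stand-in record carries just two fields so the example runs without entropic installed, `find_by_params` is assumed to do a partial match, and the index keeps a single record per hash, which is a simplification.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class RecordStandIn:
    """Minimal stand-in for a run record; not entropic's RunRecord."""

    params: dict[str, Any]
    params_hash: str


class InMemoryIndex:
    """Dict-backed index exposing the five method names listed above."""

    def __init__(self) -> None:
        self._records: dict[str, RecordStandIn] = {}

    def find_by_hash(self, params_hash: str) -> Optional[RecordStandIn]:
        return self._records.get(params_hash)

    def find_by_params(self, params: dict[str, Any]) -> list[RecordStandIn]:
        # Assumption: partial match, every given key must be present and equal
        return [
            record
            for record in self._records.values()
            if all(k in record.params and record.params[k] == v for k, v in params.items())
        ]

    def insert(self, record: RecordStandIn) -> None:
        # Simplification: a later insert with the same hash replaces the earlier one
        self._records[record.params_hash] = record

    def all(self) -> list[RecordStandIn]:
        return list(self._records.values())

    def delete_by_hash(self, params_hash: str) -> bool:
        # True only if a record was actually removed
        return self._records.pop(params_hash, None) is not None


index = InMemoryIndex()
index.insert(RecordStandIn({"n": 50, "method": "rk4"}, "hash-a"))
index.insert(RecordStandIn({"n": 50, "method": "euler"}, "hash-b"))
n50 = index.find_by_params({"n": 50})
removed = index.delete_by_hash("hash-a")
removed_again = index.delete_by_hash("hash-a")
```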

Runner contract

A runner is any callable with this signature:

from pathlib import Path
from typing import Any

def runner(params: dict[str, Any], result_path: Path) -> None:
    # 1. Use `params` to configure your simulation
    # 2. Write results to `result_path` (any format you want)
    # 3. Return nothing; entropic handles the rest
    ...

The library generates result_path for you (timestamp + hash + suffix). You just write to it.
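
A filename of the "timestamp + hash + suffix" shape could be assembled as in the sketch below. `make_result_path` is a hypothetical helper, and the exact format (Unix time with six decimals, an underscore, then the hash) is inferred from the example path in the Quickstart rather than taken from the library's source.

```python
import time
from pathlib import Path
from typing import Optional


def make_result_path(
    results_dir: Path,
    params_hash: str,
    file_suffix: str,
    now: Optional[float] = None,
) -> Path:
    """Build <results_dir>/<unix timestamp>_<hash><suffix>."""
    # `now` can be injected to make the result reproducible in tests
    timestamp = time.time() if now is None else now
    return results_dir / f"{timestamp:.6f}_{params_hash}{file_suffix}"


path = make_result_path(
    Path("results"),
    "a3f8c1d2e4b6f7a8",
    ".npy",
    now=1769854174.763568,
)
```

Putting the timestamp first means a plain directory listing sorts runs chronologically, while the hash keeps runs with different parameters visually distinct.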

Development

git clone https://github.com/jpvanegasc/entropic.git
cd entropic
uv sync --group dev
uv run pytest tests/ -v

License

MIT
