Manage, retrieve, and deduplicate simulation results

entropic

Entropic is a minimal, file-based run cache for Python-driven simulations and scripts. By hashing your input parameters, it automatically identifies duplicate runs and skips unnecessary computation. It is completely agnostic to your simulation engine, lightweight by design, and built to manage locally run research workflows without getting in your way.

Install

pip install entropic
# or
uv add entropic

Quickstart

import numpy as np
from entropic import Store

store = Store("./results", "./runs.json")

# Define a runner: receives (params, result_path), writes results to result_path
def my_simulation(params, result_path):
    data = np.random.randn(params["n"], params["steps"])
    np.save(result_path, data)

# Run or retrieve from cache
record = store.run_or_retrieve(
    params={"n": 100, "steps": 5000, "dt": 0.01},
    runner=my_simulation,
)
print(record.result_path)   # ./results/1769854174.763568_a3f8c1d2e4b6f7a8.npy
print(record.params)         # {"n": 100, "steps": 5000, "dt": 0.01}
print(record.metadata)       # {"elapsed_seconds": 0.042}

# Second call with same params → instant cache hit, no re-run
record = store.run_or_retrieve(
    params={"n": 100, "steps": 5000, "dt": 0.01},
    runner=my_simulation,
)

Core API

Store

store = Store(
    results_dir="./results",     # where result files live
    db_path="./entropic.json",   # TinyDB metadata index
    file_suffix=".h5",           # extension for auto-generated filenames
    index=None,                  # custom IndexBackend (default: TinyDB)
)

store.run_or_retrieve(params, runner, **metadata) → RunRecord

The main workhorse. Returns a cached result if one exists for the given params, otherwise calls runner(params, result_path) and caches the result.

record = store.run_or_retrieve(
    params={"n": 50, "method": "rk4"},
    runner=my_sim,
    git_sha="abc123",  # optional metadata
)
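
The control flow behind this call can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not entropic's actual implementation: `params_key`, `run_or_retrieve_sketch`, and the in-memory `cache` dict are hypothetical names used only for illustration, whereas the library keeps its index in TinyDB and returns a RunRecord.

```python
import hashlib
import json
from pathlib import Path
from typing import Any, Callable, Dict, Tuple

Runner = Callable[[Dict[str, Any], Path], None]


def params_key(params: Dict[str, Any]) -> str:
    """Hypothetical cache key: hash of the params serialized with sorted keys."""
    canonical = json.dumps(params, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def run_or_retrieve_sketch(
    cache: Dict[str, Path],
    results_dir: Path,
    params: Dict[str, Any],
    runner: Runner,
) -> Tuple[Path, bool]:
    """Return (result_path, was_cached); call the runner only on a cache miss."""
    key = params_key(params)
    if key in cache:
        return cache[key], True  # cache hit: the runner is not called
    result_path = results_dir / f"{key}.json"
    runner(params, result_path)  # cache miss: the runner writes the result
    cache[key] = result_path
    return result_path, False
```

Because the key is derived from the parameters alone, two calls whose dicts differ only in key order resolve to the same entry in this sketch.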

store.run(params, runner, **metadata) → RunRecord

Always runs the simulation, even if a cached result exists. Useful for re-running with the same parameters (e.g., stochastic simulations).

store.retrieve(params) → RunRecord | None

Look up a cached run by exact parameter match. Returns None on cache miss.

store.register(params, result_path, **metadata) → RunRecord

Manually register an externally-produced result file. Use this when you run simulations outside the library and want to index them for later retrieval.

store.register(
    params={"n": 50, "method": "euler"},
    result_path="./results/my_external_run.h5",
)

store.list(where=None) → list[RunRecord]

List all runs, optionally filtered by partial parameter match. This is how you query by a subset of parameters — e.g., all runs with a specific grid size regardless of other settings.

all_runs = store.list()
rk4_runs = store.list(where={"method": "rk4"})
specific = store.list(where={"method": "rk4", "n": 50})
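
To make "partial parameter match" concrete, here is a small pure-Python sketch of the matching rule the description above implies: a run matches when every key in `where` appears in its params with an equal value. The helpers `matches` and `filter_runs` are hypothetical and operate on plain dicts; the library itself delegates this search to its index backend.

```python
from typing import Any, Dict, List, Optional


def matches(params: Dict[str, Any], where: Optional[Dict[str, Any]]) -> bool:
    """True if every key in `where` is present in `params` with an equal value."""
    if not where:
        return True  # no filter: every run matches
    return all(key in params and params[key] == value for key, value in where.items())


def filter_runs(
    runs: List[Dict[str, Any]], where: Optional[Dict[str, Any]] = None
) -> List[Dict[str, Any]]:
    """Keep only the parameter dicts that satisfy the partial match."""
    return [params for params in runs if matches(params, where)]


runs = [
    {"method": "rk4", "n": 50},
    {"method": "rk4", "n": 100},
    {"method": "euler", "n": 50},
]
rk4_runs = filter_runs(runs, {"method": "rk4"})
specific = filter_runs(runs, {"method": "rk4", "n": 50})
```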

store.sweep(params_iter, runner, **metadata) → list[RunRecord]

Run or retrieve results for each parameter set in an iterable. Reuses cached results where possible.

records = store.sweep(
    [{"n": 10, "dt": dt} for dt in [0.01, 0.005, 0.001]],
    runner=my_simulation,
)
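
For sweeps over more than one parameter, the iterable can be generated rather than written out by hand. The generator below is a sketch using only the standard library; `param_grid` is a hypothetical helper, not part of entropic.

```python
import itertools
from typing import Any, Dict, Iterator, Sequence


def param_grid(**axes: Sequence[Any]) -> Iterator[Dict[str, Any]]:
    """Yield one params dict per combination of the given axis values."""
    names = list(axes)
    for combo in itertools.product(*(axes[name] for name in names)):
        yield dict(zip(names, combo))


# 2 values of n times 3 values of dt gives 6 parameter sets
grid = list(param_grid(n=[10, 20], dt=[0.01, 0.005, 0.001]))
```

Since sweep takes an iterable of parameter sets, something like `store.sweep(param_grid(n=[10, 20], dt=[0.01, 0.005, 0.001]), runner=my_simulation)` should fit the signature shown above.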

store.delete(params, remove_file=False) → bool

Delete a run record by exact parameter match. Optionally removes the result file from disk.

RunRecord

Frozen dataclass describing a single run. Every Store method except delete() returns one, a list of them, or None on a cache miss.

record.params        # dict — the simulation parameters
record.result_path   # Path — path to the result file
record.params_hash   # str — 16-char hex hash of params
record.created_at    # str — ISO 8601 timestamp
record.metadata      # dict — user-defined extras (elapsed_seconds auto-added)
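
As a reminder of what "frozen" means in practice, the sketch below uses a stand-in dataclass with the same field names. `RunRecordSketch` is hypothetical and only mirrors the fields listed above; the field values are placeholders, and the real class may differ in defaults and details.

```python
from dataclasses import FrozenInstanceError, dataclass, field
from pathlib import Path
from typing import Any, Dict


@dataclass(frozen=True)
class RunRecordSketch:
    """Illustrative stand-in mirroring the documented fields (not entropic's class)."""

    params: Dict[str, Any]
    result_path: Path
    params_hash: str
    created_at: str
    metadata: Dict[str, Any] = field(default_factory=dict)


record = RunRecordSketch(
    params={"n": 50, "method": "rk4"},
    result_path=Path("./results/example.h5"),  # placeholder path
    params_hash="a3f8c1d2e4b6f7a8",            # placeholder hash
    created_at="2025-01-01T00:00:00+00:00",    # placeholder timestamp
)

try:
    record.params_hash = "changed"  # attribute assignment is blocked
except FrozenInstanceError:
    reassignment_blocked = True
else:
    reassignment_blocked = False
```

Freezing a dataclass only prevents rebinding attributes. The dicts held in params and metadata remain ordinary mutable dicts, so treat them as read-only by convention.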

How it works

Parameters are stored as flat fields in a TinyDB JSON file, alongside a deterministic SHA-256-based hash of the normalized parameters (exposed as the 16-character params_hash) for fast exact lookups. This gives you both:

  • O(1) exact match via retrieve() / run_or_retrieve() (hash lookup)
  • Flexible partial queries via list(where=...) (field-by-field TinyDB search)

Parameter hashing normalizes values before hashing: dict keys are sorted, floats are rounded to 12 digits (avoiding IEEE 754 noise), enums are converted to their .value, and everything is serialized to canonical JSON.
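
The normalization steps above can be sketched with the standard library alone. This is an approximation for illustration, not the library's code: `normalize` and `params_hash` are hypothetical functions, "12 digits" is assumed to mean 12 decimal places via `round`, and truncating the SHA-256 hex digest to 16 characters is an assumption based on the documented params_hash length.

```python
import enum
import hashlib
import json
from typing import Any, Dict


def normalize(value: Any) -> Any:
    """Recursively convert a value into a canonical, JSON-friendly form."""
    if isinstance(value, enum.Enum):
        return normalize(value.value)  # enums hash like their underlying value
    if isinstance(value, float):
        return round(value, 12)  # assumption: 12 decimal places
    if isinstance(value, dict):
        return {str(key): normalize(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [normalize(item) for item in value]
    return value


def params_hash(params: Dict[str, Any], length: int = 16) -> str:
    """Hash the canonical JSON form; sort_keys makes key order irrelevant."""
    canonical = json.dumps(normalize(params), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:length]


class Method(enum.Enum):
    RK4 = "rk4"
```

With this sketch, `{"dt": 0.1 + 0.2}` and `{"dt": 0.3}` produce the same hash, as do `{"method": Method.RK4}` and `{"method": "rk4"}`.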

Custom index backends

The default TinyDB backend works well for local workflows. For larger-scale use (remote databases, shared teams), implement the IndexBackend protocol:

from entropic import Store
from entropic.index import IndexBackend  # the protocol PostgresIndex must satisfy
from entropic.record import RunRecord

class PostgresIndex:
    def __init__(self, conn_string: str) -> None:
        # Connection handling is up to the implementation
        self.conn_string = conn_string

    def find_by_hash(self, params_hash: str) -> RunRecord | None: ...
    def find_by_params(self, params: dict) -> list[RunRecord]: ...
    def insert(self, record: RunRecord) -> None: ...
    def all(self) -> list[RunRecord]: ...
    def delete_by_hash(self, params_hash: str) -> bool: ...

store = Store("./results", index=PostgresIndex(conn_string="..."))
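
For a sense of what a complete backend involves, here is a hedged in-memory sketch of the five methods. `RecordStub` is a stand-in for RunRecord so the example runs without entropic installed, and two behaviors are assumptions rather than documented facts: find_by_params is treated as a partial match, and find_by_hash returns the most recently inserted record when several share a hash.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class RecordStub:
    """Stand-in for entropic's RunRecord, reduced to the fields used here."""

    params: Dict[str, Any]
    params_hash: str


def _is_subset(where: Dict[str, Any], params: Dict[str, Any]) -> bool:
    """True if every key/value pair in `where` also appears in `params`."""
    return all(key in params and params[key] == value for key, value in where.items())


class InMemoryIndex:
    """Hypothetical backend that keeps records in a list (nothing is persisted)."""

    def __init__(self) -> None:
        self._records: List[RecordStub] = []

    def find_by_hash(self, params_hash: str) -> Optional[RecordStub]:
        # Assumption: prefer the most recently inserted record for a given hash.
        for record in reversed(self._records):
            if record.params_hash == params_hash:
                return record
        return None

    def find_by_params(self, params: Dict[str, Any]) -> List[RecordStub]:
        # Assumption: partial match on the supplied key/value pairs.
        return [r for r in self._records if _is_subset(params, r.params)]

    def insert(self, record: RecordStub) -> None:
        self._records.append(record)

    def all(self) -> List[RecordStub]:
        return list(self._records)  # copy, so callers cannot mutate the index

    def delete_by_hash(self, params_hash: str) -> bool:
        before = len(self._records)
        self._records = [r for r in self._records if r.params_hash != params_hash]
        return len(self._records) < before  # True only if something was removed
```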

Runner contract

A runner is any callable with this signature:

from pathlib import Path
from typing import Any

def runner(params: dict[str, Any], result_path: Path) -> None:
    # 1. Use `params` to configure your simulation
    # 2. Write results to `result_path` (any format you want)
    ...

The library generates result_path for you (timestamp + hash + suffix). You just write to it.
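
As one concrete instance of the contract, the runner below integrates exponential decay with explicit Euler steps and writes the trajectory as JSON. It is only an illustration: `decay_runner` and the parameter names `y0`, `rate`, `dt`, and `steps` are hypothetical, and any callable that writes to result_path would satisfy the contract equally well.

```python
import json
from pathlib import Path
from typing import Any, Dict


def decay_runner(params: Dict[str, Any], result_path: Path) -> None:
    """Hypothetical runner: explicit-Euler integration of dy/dt = -rate * y."""
    y = params["y0"]
    trajectory = [y]
    for _ in range(params["steps"]):
        y -= params["rate"] * y * params["dt"]
        trajectory.append(y)
    # The contract only requires that results end up at result_path.
    Path(result_path).write_text(json.dumps({"trajectory": trajectory}))
```

JSON is used here only to keep the example dependency-free; the format written to result_path is entirely up to the runner.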

Logging

entropic uses a NullHandler by default (no output). To see what the library is doing:

import logging
logging.getLogger("entropic").addHandler(logging.StreamHandler())
logging.getLogger("entropic").setLevel(logging.INFO)
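
If timestamps and log levels are useful, a formatter can be attached to the handler. The helper below is a sketch built on the standard logging module; `enable_entropic_logging` is a hypothetical name, and only the logger name "entropic" comes from the library.

```python
import logging


def enable_entropic_logging(level: int = logging.INFO) -> logging.Logger:
    """Attach one formatted StreamHandler to the "entropic" logger."""
    logger = logging.getLogger("entropic")
    # Avoid stacking duplicate handlers if this is called more than once.
    if not any(isinstance(h, logging.StreamHandler) for h in logger.handlers):
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s: %(message)s")
        )
        logger.addHandler(handler)
    logger.setLevel(level)
    return logger
```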

Development

git clone https://github.com/jpvanegasc/entropic.git
cd entropic
uv sync --group dev
uv run pytest tests/ -v

License

MIT
