Automatic, dependency-aware memoization for Python — a modern pure-Python reimplementation of IncPy (Guo & Engler, ISSTA 2011).

Maintainers

puppyum

Project description

rote

Automatic, dependency-aware memoization for Python research scripts. No interpreter fork, no decorators required.

rote is a pure-Python reimplementation of IncPy (Guo & Engler, ISSTA 2011) on contemporary CPython (≥3.12). Same goal as the original: observe a script at runtime, find the function calls that are pure and long-running, and persist their results across runs. The implementation is new, built on sys.monitoring (PEP 669) and audit hooks (PEP 578), so no patched interpreter is needed.
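To make the audit-hook idea concrete, here is a minimal sketch of how PEP 578 hooks can flag impure I/O during a single call. It is not rote's code: the names `IMPURE_EVENTS`, `run_watched`, `pure_sum`, and `append_log` are hypothetical, and the set of events treated as impure is an assumption chosen for illustration.

```python
import os
import sys
import tempfile

# Events treated as "impure" in this sketch; the set is an assumption,
# not rote's actual list.
IMPURE_EVENTS = {"subprocess.Popen", "socket.connect", "os.system"}

_active = False    # True only while a watched call is running
_violations = []   # impure events recorded during the watched call


def _hook(event, args):
    if not _active:
        return
    if event in IMPURE_EVENTS:
        _violations.append(event)
    elif event == "open" and len(args) > 1:
        # The "open" event carries (path, mode, flags); flag append-mode opens.
        mode = args[1]
        if isinstance(mode, str) and "a" in mode:
            _violations.append("open(append)")


sys.addaudithook(_hook)  # audit hooks cannot be removed once installed


def run_watched(fn, *args, **kwargs):
    """Run fn and return (result, list of impure events seen during the call)."""
    global _active
    _violations.clear()
    _active = True
    try:
        result = fn(*args, **kwargs)
    finally:
        _active = False
    return result, list(_violations)


def pure_sum(n):
    return sum(range(n))


def append_log(path):
    with open(path, "a") as fh:
        fh.write("x\n")
    return True
```

A caller would treat a non-empty event list as a reason not to store the result. Because hooks are permanent for the life of the interpreter, the sketch gates them with a flag rather than trying to uninstall them.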

There's a companion site that walks through the design, the speedups, and where rote diverges from the paper. If you're reading this for the first time, start there.

Why

You change one line in analyze.py, save, re-run. Plain Python re-does the 90 seconds of feature extraction, the 30 seconds of model training, and the 2 seconds of plotting, all to look at one tweaked plot. That re-work is what IncPy was built to remove in 2011. It's still the problem.

Install

rote 0.1.0 is published on PyPI, so pip install rote works. To install the latest source from GitHub instead:

# Plain pip
pip install "git+https://github.com/puppyum/rote.git"
pip install "rote[all] @ git+https://github.com/puppyum/rote.git"   # plus pyarrow, numpy, safetensors

# uv (recommended for research workflows)
uv add "git+https://github.com/puppyum/rote.git"

Local development:

git clone https://github.com/puppyum/rote.git
cd rote
uv venv --python 3.13 && source .venv/bin/activate
uv pip install -e ".[dev,all]"

Requires Python 3.12 or later. Licensed under Apache-2.0.

Use

Three ways, ordered by how much you have to opt in.

Zero-config, paper-style

Prefix your script invocation:

rote run analyze.py

The CLI AST-wraps every top-level function in your script and in any helper modules it imports. Run the script a second time after a downstream edit; only the edited function and the calls whose cached results depended on it re-execute.
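As a rough illustration of what wrapping every top-level function can look like, the sketch below adds a decorator to each top-level def using the standard-library `ast` module. This is a stand-in: rote itself rewrites through libcst, and the names `wrap_top_level_functions`, `memo`, and `SOURCE` are hypothetical.

```python
import ast

SOURCE = '''
def load(path):
    return path.upper()

def main():
    return load("data.csv")
'''


def wrap_top_level_functions(source, decorator="memo"):
    """Return source with @decorator added to every top-level def."""
    tree = ast.parse(source)
    for node in tree.body:  # tree.body holds only top-level statements
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            node.decorator_list.insert(0, ast.Name(id=decorator, ctx=ast.Load()))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


wrapped = []


def memo(fn):
    """Stand-in decorator that only records which functions were wrapped."""
    wrapped.append(fn.__name__)
    return fn


namespace = {"memo": memo}
exec(compile(wrap_top_level_functions(SOURCE), "<wrapped>", "exec"), namespace)
```

Nested functions and methods are left alone here because only `tree.body` is visited; a real tool would also need to preserve comments and formatting, which is one reason to prefer a concrete-syntax-tree library.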

Decorator

When you want to be explicit:

import rote

@rote.cache
def build_features(df):
    ...

Inside a notebook or REPL

import rote
with rote.auto():
    result = my_pipeline(data)

In Jupyter, %load_ext rote makes every cell a memoization candidate.

What gets cached

A function call is memoized when all of these hold:

- It ran for at least min_duration_s (default 1 s). Below that, the cache write costs more than re-running.
- No impure I/O happened during the call. Network, subprocess, file appends, exec/eval, and stdlib non-determinism sources (time.time(), random.random(), uuid.uuid4(), os.environ) all disqualify it.
- No argument mutated. Arguments are fingerprinted on entry and re-checked on exit.
- The function's source, every function it transitively calls, and every file it read are unchanged from the cached version.
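The argument-mutation check in the list above can be sketched as a fingerprint taken before and after the call. This is an illustrative sketch, not rote's fingerprinting scheme: `fingerprint`, `call_checking_mutation`, and `pop_last` are hypothetical names, and hashing a pickle is an assumption that only works for picklable arguments.

```python
import hashlib
import pickle


def fingerprint(obj):
    """Hash a pickled snapshot of obj; assumes obj is picklable."""
    return hashlib.sha256(pickle.dumps(obj, protocol=4)).hexdigest()


def call_checking_mutation(fn, *args):
    """Run fn(*args) and report whether any positional argument changed."""
    before = [fingerprint(a) for a in args]
    result = fn(*args)
    after = [fingerprint(a) for a in args]
    return result, before != after


def pop_last(xs):
    """Deliberately mutates its argument."""
    return xs.pop()
```

A call that returns `True` in the second slot would be disqualified from caching, since replaying the stored return value would skip the side effect on the argument.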

If any check fails, the cache misses and the function runs. A cached value that can't be proven safe never gets returned; the tests/correctness/ suite includes 36 perturbation tests and 60 differential tests that fail loudly if a cached value drifts from a fresh run.

The serializer dispatches by type: Arrow IPC for DataFrames, numpy.save for arrays, safetensors for Torch tensors, msgpack for primitives, cloudpickle as a last resort. Rationale in docs/DECISIONS.md.
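Type-dispatched serialization can be sketched with `functools.singledispatch`. To stay dependency-free, the example substitutes JSON for the msgpack path and pickle for the cloudpickle fallback; those substitutions, the tag strings, and the function names are assumptions for illustration, not rote's format.

```python
import json
import pickle
from functools import singledispatch


@singledispatch
def serialize(value):
    """Fallback for unregistered types: pickle (stand-in for cloudpickle)."""
    return "pickle", pickle.dumps(value)


@serialize.register(int)
@serialize.register(float)
@serialize.register(str)
@serialize.register(list)
@serialize.register(dict)
def _(value):
    # Stand-in for the msgpack path. JSON coerces non-string dict keys,
    # so a real implementation would need a stricter check here.
    return "json", json.dumps(value, separators=(",", ":")).encode("utf-8")


@serialize.register(bytes)
def _(value):
    return "raw", value


def deserialize(tag, payload):
    """Invert serialize() using the tag stored next to the payload."""
    if tag == "json":
        return json.loads(payload.decode("utf-8"))
    if tag == "raw":
        return payload
    return pickle.loads(payload)
```

Storing the tag alongside the payload lets the reader pick the matching decoder without inspecting the bytes.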

Measured performance

Apple Silicon, Python 3.13. Warm-hit timings are medians of 20 iterations; the cross-process and pipeline numbers are medians of 5 runs.

Per-function warm-hit cost against joblib.Memory:

Workload	joblib warm	rote warm	speedup
2 M-term Leibniz	96 µs	31 µs	3.09×
Basel sum	101 µs	30 µs	3.37×
400×400 NumPy QR	253 µs	33 µs	7.68×
200K-char bag-of-words	93 µs	31 µs	2.97×
200×200 matrix inverse	104 µs	49 µs	2.14×

Geomean across the five workloads: 3.48× faster than joblib.Memory.
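The geomean figure can be reproduced from the speedup column of the table above with the standard library:

```python
from statistics import geometric_mean

# Speedup column from the warm-hit table (rote vs joblib.Memory).
speedups = [3.09, 3.37, 7.68, 2.97, 2.14]

geomean = geometric_mean(speedups)
print(f"{geomean:.2f}x")  # prints 3.48x
```

The geometric mean is the usual choice for averaging ratios because a single large speedup (the QR workload here) pulls it up less than it would an arithmetic mean.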

On the paper-style multi-stage pipeline (parse → aggregate → format), with an edit to the final stage and everything in one process: plain Python re-runs the whole thing in 264 ms; rote skips the upstream stages and finishes the warm run in 6.3 ms, about 42× faster than the cold pipeline. joblib.Memory is faster on the same benchmark (1.4 ms warm) because it keys purely on argument values, whereas rote content-hashes the intermediate files on every hit so that an mtime-preserving edit cannot return a stale result.

The tradeoff at the level you actually live with — edit, save, rerun, fresh Python process each time:

Run	wall-clock	vs plain
plain Python (whole pipeline)	1.83 s	—
`rote` warm (fresh interpreter)	0.38 s	4.8×
`joblib` warm (fresh interpreter)	0.19 s	9.6×

A persistent stat → content-hash table in the cache store is what keeps rote's file-dep validation cheap across process boundaries: each warm subprocess does a stat() per dependency and reuses the stored hash unless the (size, mtime_ns, ctime_ns) tuple changes. joblib still wins here because it skips content validation outright. Full numbers, the correctness/speed tradeoff, and a serializer breakdown live in docs/BENCHMARKS.md.
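The stat-keyed lookup described above can be sketched as follows. The class is an in-memory stand-in for the persistent table, and `StatHashTable`, `stat_key`, `content_hash`, and the `rehashes` counter are hypothetical names; the choice of SHA-256 is also an assumption.

```python
import hashlib
import os
import tempfile


def stat_key(path):
    """The (size, mtime_ns, ctime_ns) tuple used to detect a changed file."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns, st.st_ctime_ns)


def content_hash(path):
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()


class StatHashTable:
    """In-memory stand-in for a persistent stat -> content-hash table."""

    def __init__(self):
        self._rows = {}     # path -> (stat_key, content_hash)
        self.rehashes = 0   # how many times file contents were actually read

    def hash_for(self, path):
        key = stat_key(path)
        row = self._rows.get(path)
        if row is not None and row[0] == key:
            return row[1]   # stat unchanged: reuse the stored hash
        self.rehashes += 1
        digest = content_hash(path)
        self._rows[path] = (key, digest)
        return digest
```

On a warm run the common case costs one `stat()` per dependency, and the file is only re-read when its size or timestamps differ from the stored row.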

Test suite: 381 tests pass, including 60 differential and 36 perturbation tests. On the corpus/realistic/ subset (five multi-second scripts), auto-mode eliminates 100% of cold compute on warm re-run. mypy --strict and ruff clean. CI runs Linux, macOS, and Windows on Python 3.12 and 3.13.

Public API

Name	Purpose
`rote.cache`	Decorator. The explicit escape hatch.
`rote.auto()`	Context manager. Every call inside the block is a candidate.
`rote.invalidate(target=None)`	Drop entries. `target` is a function, a qualname string, or `None` for everything.
`rote.clear()`	Wipe all tiers (in-memory + SQLite + blobs).
`rote.configure(**kwargs)`	Override defaults (cache dir, `min_duration_s`, fsync, telemetry, ...).
`rote.stats()`	Hits, misses, time saved, invalidation reasons.
`rote.graph()`	A `networkx.DiGraph` of observed caller → callee edges.
`rote run <script>`	CLI: run a script under auto-mode.
`rote status`	CLI: print stats for the cache in the CWD.
`rote clear`	CLI: wipe the cache in the CWD.

Layout

src/rote/         the package (13 modules, ~4K lines)
tests/            unit / property / integration / correctness suites
docs/             architecture, decisions log, benchmarks, evaluation
bench/            workload + serializer microbenchmarks
corpus/           30 fast scripts for differential tests, plus a realistic/ subset for coverage
examples/         demos used by the integration tests

Architecture in detail: docs/architecture.md. Every paper deviation logged: docs/DECISIONS.md. Recent changes: CHANGELOG.md.

Limitations

- Python 3.12+ only. sys.monitoring (PEP 669) is the load-bearing primitive; there's no fallback for older interpreters.
- Functions doing real I/O are skipped. Network reads, append-mode file writes, and subprocess calls all disqualify a call. The system is built for compute-heavy steps that take a data file in and return a value out.
- First run pays an AST-transform cost. Auto-mode rewrites your script through libcst once per source change; the rewrite is cached on disk after that.
- The 1-second default threshold is conservative. Sub-second calls aren't memoized unless you lower it explicitly with rote.configure(min_duration_s=0.05).
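The effect of the duration threshold can be seen in a small sketch: a memoizer that stores a result only when the call took at least `min_duration_s`. The decorator name `cache_if_slow` and its in-memory `store` attribute are hypothetical and exist only for this illustration; they are not part of rote's API.

```python
import time
from functools import wraps


def cache_if_slow(min_duration_s=1.0):
    """Memoize results only for calls that took at least min_duration_s."""
    def decorator(fn):
        store = {}

        @wraps(fn)
        def wrapper(*args):
            if args in store:
                return store[args]
            start = time.perf_counter()
            result = fn(*args)
            elapsed = time.perf_counter() - start
            if elapsed >= min_duration_s:
                store[args] = result  # slow enough to be worth keeping
            return result

        wrapper.store = store  # exposed for inspection in this sketch
        return wrapper
    return decorator


@cache_if_slow(min_duration_s=0.05)
def slow_double(x):
    time.sleep(0.1)
    return x * 2


@cache_if_slow(min_duration_s=60.0)
def fast_increment(x):
    return x + 1
```

With these thresholds, `slow_double(2)` is stored after its first call while `fast_increment(1)` never is, which mirrors why lowering the threshold is an explicit opt-in.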

License

Apache-2.0. See LICENSE.

Citing IncPy

If you use rote in academic work, cite the original paper:

Guo, P. J., & Engler, D. (2011). Using automatic persistent memoization to
facilitate data analysis scripting. Proceedings of the 2011 International
Symposium on Software Testing and Analysis (ISSTA '11), 287–297.

Project details


Release history

This version

0.1.0

May 20, 2026

Download files

Source Distribution

rote-0.1.0.tar.gz (249.5 kB)

Uploaded May 20, 2026 Source

Built Distribution

rote-0.1.0-py3-none-any.whl (52.8 kB)

Uploaded May 20, 2026 Python 3

File details

Details for the file rote-0.1.0.tar.gz.

File metadata

Download URL: rote-0.1.0.tar.gz
Upload date: May 20, 2026
Size: 249.5 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for rote-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`227e25f2cca04b7f28f0c7e876a0927e237baf91519d95ac806ba2dbdc111230`
MD5	`1fe48bfbf695e1f5ffa214113c796451`
BLAKE2b-256	`78b211e35a45e84191b5f45dd6bc0da1dd1967bb1865f58829197817dd58ac5c`

Provenance

The following attestation bundles were made for rote-0.1.0.tar.gz:

Publisher: release.yml on puppyum/rote

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: rote-0.1.0.tar.gz
- Subject digest: 227e25f2cca04b7f28f0c7e876a0927e237baf91519d95ac806ba2dbdc111230
- Sigstore transparency entry: 1576176713
- Sigstore integration time: May 20, 2026
Source repository:
- Permalink: puppyum/rote@8bb488e416930343e80178fe75833b40cf386b94
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/puppyum
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@8bb488e416930343e80178fe75833b40cf386b94
- Trigger Event: push

File details

Details for the file rote-0.1.0-py3-none-any.whl.

File metadata

Download URL: rote-0.1.0-py3-none-any.whl
Upload date: May 20, 2026
Size: 52.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for rote-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ffb1857d3a04ea6d6a2c84e6183a0c4f306aee5579d1f814c5ee62c4a1e49505`
MD5	`bafb6dfb1394a9c1dcba8b66dc0050ae`
BLAKE2b-256	`2b596dc85646d14d8ad12bd6a9125e1db155ee97fcb7618231b627ac35919ad2`

Provenance

The following attestation bundles were made for rote-0.1.0-py3-none-any.whl:

Publisher: release.yml on puppyum/rote

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: rote-0.1.0-py3-none-any.whl
- Subject digest: ffb1857d3a04ea6d6a2c84e6183a0c4f306aee5579d1f814c5ee62c4a1e49505
- Sigstore transparency entry: 1576176721
- Sigstore integration time: May 20, 2026
Source repository:
- Permalink: puppyum/rote@8bb488e416930343e80178fe75833b40cf386b94
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/puppyum
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@8bb488e416930343e80178fe75833b40cf386b94
- Trigger Event: push
