A primitive for online sequential aggregation in Python
Project description
livefold
live + fold — incremental folds over a mutable sequence.
A primitive for online sequential aggregation in Python. Maintain a mutable numeric sequence; query exact aggregates over any range in O(√n); plug in any associative reducer (any monoid).
When to reach for it
| Need | Use |
|---|---|
| Immutable series, range aggregates | Prefix sums |
| Frequent point updates, log-time queries | Segment tree / Fenwick tree |
| Fixed-width rolling windows | pandas.rolling() / polars.rolling() |
| Mutable series, arbitrary range queries, multi-fold | livefold |
Common shapes: request latencies, sensor readings, trade prices, telemetry events.
Quickstart
pip install livefold
from livefold import LiveFold
lf = LiveFold([1, 2, 3, 4, 5, 6], folds={"sum": sum, "max": max, "min": min})
lf.append(7)
lf.query(0, 5)
# → {"sum": 21, "max": 6, "min": 1}
# Mutate freely; aggregates stay current
lf[2] = -1
lf.query(2, 5)
# → {"sum": 9, "max": 6, "min": -1}
Performance
At n = 10⁷, livefold's median range query is 69 µs vs naive Python's 29 ms (~400× faster), and median append cost is ~2 µs across all n while every other backend with a competitive query path (numpy, pandas) degrades linearly on every append. livefold's amortized append is O(√n) — boundary-crossing rebuilds are rare (~one per √n appends) but real — so the median characterizes typical streaming latency, not the rare rebuild spike.
Full methodology, append benchmarks, comparison against four backends, and the reproduction script: benchmarks/.
Examples
Two runnable Streamlit demos in examples/:
system_metrics/— livepsutil-driven CPU/memory dashboard with arbitrary-range aggregate queries. Runs entirely offline.crypto_ticks/— synthetic BTC/USD tick stream with high/low/avg-price queries. Includes a drop-in recipe for real Binance ticks.
API
LiveFold(data: Iterable, folds: dict[str, Callable])
| Member | Returns | Notes |
|---|---|---|
lf.append(x) |
None |
Amortized O(√n); worst-case O(n) on perfect-square crossings |
lf.query(left, right) |
dict[str, Any] |
O(√n); inclusive bounds |
lf.blocks |
list[list] |
Underlying √n-sized blocks |
lf.folded_values |
dict[str, list] |
Per-fold, per-block aggregates |
lf.insert / pop / extend / remove / sort / ... |
— | Standard list methods; blocks and folds updated in place |
LiveFold subclasses list, so it's a drop-in for any code that expected a plain list — until you start calling query.
Time-indexed queries: TimeIndexedLiveFold
For monotonic streams where you want to query by time instead of index, TimeIndexedLiveFold carries a parallel timestamp per element and exposes query_time_range — still O(√n).
from livefold import TimeIndexedLiveFold
lf = TimeIndexedLiveFold(
[1, 2, 3],
folds={"sum": sum, "max": max},
timestamps=[1.0, 2.0, 3.0],
)
lf.append(4, timestamp=4.0)
lf.append(5, timestamp=5.0)
lf.query_time_range(2.0, 4.0)
# → {"sum": 9, "max": 4}
If you omit timestamps / timestamp, time.time() is used by default. The class is generic over the timestamp type — anything orderable works (float, int, datetime, ...). Pass explicit timestamps for any non-float type:
from datetime import datetime
from livefold import TimeIndexedLiveFold
lf = TimeIndexedLiveFold[datetime](
[1, 2, 3],
folds={"sum": sum},
timestamps=[datetime(2025, 1, 1), datetime(2025, 6, 1), datetime(2026, 1, 1)],
)
lf.query_time_range(datetime(2025, 5, 1), datetime(2026, 6, 1))
# → {"sum": 5}
TimeIndexedLiveFold is append-only by design. Operations that would break time ordering raise MonotonicityError:
insert,sort,reverse— would break the ordering invariant- slice
__setitem__, slice__delitem__— would desync data and timestamps append/extendwith a timestamp earlier than the last stored one+/+=with anything other than anotherTimeIndexedLiveFold(useextend(values, timestamps=...)instead)
Integer indexing (lf[i] = x, del lf[i]), pop, remove, and clear work normally and keep the parallel timestamp list in sync. + and += between two TimeIndexedLiveFold instances concatenate timestamps and re-check monotonicity.
How it works
LiveFold splits its underlying list into ⌊√n⌋ blocks of size √n, precomputes the configured folds for each block, and updates them incrementally on mutation. A query(left, right) walks at most two partial blocks plus the precomputed folds for whole-block spans in between — touching roughly 2√n elements per fold regardless of n. Mo's algorithm with mutability and a dict-shaped output.
append is amortized O(√n). Most appends just push onto the last block at O(1), but each time n crosses a perfect square, block_size = isqrt(n) increments and the whole structure re-blocks — an O(n) cost. The gap between consecutive perfect squares is 2k + 1 (linear in k = √n, not geometric), so over n appends total rebuild work sums to Σ_{k=1}^{√n} O(k²) = O(n^(3/2)) — i.e. O(√n) per append amortized. Boundary crossings happen only ~once per √n appends, though, so the median append latency stays in the low µs range across all n (this is what the benchmarks show); the asymptotic only shows up in mean or total cost.
TimeIndexedLiveFold layers a parallel monotonically non-decreasing timestamp list on top. query_time_range(start, end) calls bisect_left/bisect_right to map timestamps to indices in O(log n), then routes through the same √n-decomposed query path — so overall query cost stays O(√n).
For the full derivation, complexity analysis, and other implementation details, see the corresponding blog post.
Note: the blog post predates the rebrand from
pysquaggand uses the old singularaggregator_function=API. The math and structural choices are unchanged; only the package name and thefolds={"name": fn, ...}dict shape have evolved.
Fold contract
A fold is a single-argument callable fn(items) -> result that:
- Accepts an iterable of elements (or, when called internally to combine block results, an iterable of prior fold results) and returns a single value.
- Is associative:
fn([fn([a, b]), fn([c, d])]) == fn([a, b, c, d]). This letsquerycombine precomputed block-level folds with the partial folds at each end. - Returns a value of the same shape it accepts as elements — i.e., feeding the result back through
fntogether with other results must work.lenis a common-but-broken choice: it returns anintregardless of input shape, solen([len(block_a), len(block_b)])gives2, not the total element count.
Examples that satisfy the contract: sum, max, min, math.prod, math.gcd (via functools.reduce), bitwise or/and/xor, "".join, and any mergeable sketch (t-digest, HyperLogLog, Count-Min, Welford) wrapped in a fold-shaped callable. Commutativity is not required — string concatenation, matrix multiplication, and other ordered monoids work too.
Constraints
- Not thread-safe. Single-process, single-thread workloads only.
Contributing
See CONTRIBUTING.md.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file livefold-0.1.0.tar.gz.
File metadata
- Download URL: livefold-0.1.0.tar.gz
- Upload date:
- Size: 407.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.0 {"installer":{"name":"uv","version":"0.10.0","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
1d518c4852f7cb2a1c80fb585d04e65c97e848de0b220b6c17a3f6cde2352bd3
|
|
| MD5 |
136da404b9db1d7cd215d7302029aa0c
|
|
| BLAKE2b-256 |
f19ed17bb76cb28d9bd9a42130b0f26c420266bb985b0f6487855807a2dc87c2
|
File details
Details for the file livefold-0.1.0-py3-none-any.whl.
File metadata
- Download URL: livefold-0.1.0-py3-none-any.whl
- Upload date:
- Size: 9.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.0 {"installer":{"name":"uv","version":"0.10.0","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
222e7bf341daef863328efac2faf1a6e39d5a028da0c560f8da2b4c1f8dcccf4
|
|
| MD5 |
e65fa226b4b390807946ece5c06e6fb7
|
|
| BLAKE2b-256 |
61a278ebf90b4caabeac06de01191129a9ad3ab72706eaad1562f35ffd673135
|