Skip to main content

A primitive for online sequential aggregation in Python

Project description

livefold

live + fold — incremental folds over a mutable sequence. ("live" as in "live broadcast" — rhymes with "five", not "give".)

Build Status Supported Versions license

A primitive for online sequential aggregation in Python. Maintain a mutable numeric sequence; query exact aggregates over any range in O(√n); plug in any associative reducer (any monoid).

When to reach for it

Need Use
Immutable series, range aggregates Prefix sums
Frequent point updates, log-time queries Segment tree / Fenwick tree
Fixed-width rolling windows pandas.rolling() / polars.rolling()
Mutable series, arbitrary range queries, multi-fold livefold

Common shapes: request latencies, sensor readings, trade prices, telemetry events.

Quickstart

pip install livefold
from livefold import LiveFold

lf = LiveFold([1, 2, 3, 4, 5, 6], folds={"sum": sum, "max": max, "min": min})

lf.append(7)
lf.query(0, 5)
# → {"sum": 21, "max": 6, "min": 1}

# Mutate freely; aggregates stay current
lf[2] = -1
lf.query(2, 5)
# → {"sum": 9, "max": 6, "min": -1}

Performance

Query latency vs collection size

At n = 10⁷, livefold's median range query is 69 µs vs naive Python's 29 ms (~400× faster), and median append cost is ~2 µs across all n while every other backend with a competitive query path (numpy, pandas) degrades linearly on every append. livefold's amortized append is O(√n) — boundary-crossing rebuilds are rare (~one per √n appends) but real — so the median characterizes typical streaming latency, not the rare rebuild spike.

Full methodology, append benchmarks, comparison against four backends, and the reproduction script: benchmarks/.

Examples

Two runnable Streamlit demos in examples/:

  • system_metrics/ — live psutil-driven CPU/memory dashboard with arbitrary-range aggregate queries. Runs entirely offline.
  • crypto_ticks/ — synthetic BTC/USD tick stream with high/low/avg-price queries. Includes a drop-in recipe for real Binance ticks.

API

LiveFold(data: Iterable, folds: dict[str, Callable])
Member Returns Notes
lf.append(x) None Amortized O(√n); worst-case O(n) on perfect-square crossings
lf.query(left, right) dict[str, Any] O(√n); inclusive bounds
lf.blocks list[list] Underlying √n-sized blocks
lf.folded_values dict[str, list] Per-fold, per-block aggregates
lf.insert / pop / extend / remove / sort / ... Standard list methods; blocks and folds updated in place

LiveFold subclasses list, so it's a drop-in for any code that expected a plain list — until you start calling query.

Time-indexed queries: TimeIndexedLiveFold

For monotonic streams where you want to query by time instead of index, TimeIndexedLiveFold carries a parallel timestamp per element and exposes query_time_range — still O(√n).

from livefold import TimeIndexedLiveFold

lf = TimeIndexedLiveFold(
    [1, 2, 3],
    folds={"sum": sum, "max": max},
    timestamps=[1.0, 2.0, 3.0],
)
lf.append(4, timestamp=4.0)
lf.append(5, timestamp=5.0)

lf.query_time_range(2.0, 4.0)
# → {"sum": 9, "max": 4}

If you omit timestamps / timestamp, time.time() is used by default. The class is generic over the timestamp type — anything orderable works (float, int, datetime, ...). Pass explicit timestamps for any non-float type:

from datetime import datetime
from livefold import TimeIndexedLiveFold

lf = TimeIndexedLiveFold[datetime](
    [1, 2, 3],
    folds={"sum": sum},
    timestamps=[datetime(2025, 1, 1), datetime(2025, 6, 1), datetime(2026, 1, 1)],
)
lf.query_time_range(datetime(2025, 5, 1), datetime(2026, 6, 1))
# → {"sum": 5}

TimeIndexedLiveFold is append-only by design. Operations that would break time ordering raise MonotonicityError:

  • insert, sort, reverse — would break the ordering invariant
  • slice __setitem__, slice __delitem__ — would desync data and timestamps
  • append / extend with a timestamp earlier than the last stored one
  • + / += with anything other than another TimeIndexedLiveFold (use extend(values, timestamps=...) instead)

Integer indexing (lf[i] = x, del lf[i]), pop, remove, and clear work normally and keep the parallel timestamp list in sync. + and += between two TimeIndexedLiveFold instances concatenate timestamps and re-check monotonicity.

How it works

Query animation

LiveFold splits its underlying list into ⌊√n⌋ blocks of size √n, precomputes the configured folds for each block, and updates them incrementally on mutation. A query(left, right) walks at most two partial blocks plus the precomputed folds for whole-block spans in between — touching roughly 2√n elements per fold regardless of n. Mo's algorithm with mutability and a dict-shaped output.

Append re-block animation

append is amortized O(√n). Most appends just push onto the last block at O(1), but each time n crosses a perfect square, block_size = isqrt(n) increments and the whole structure re-blocks — an O(n) cost. The gap between consecutive perfect squares is 2k + 1 (linear in k = √n, not geometric), so over n appends total rebuild work sums to Σ_{k=1}^{√n} O(k²) = O(n^(3/2)) — i.e. O(√n) per append amortized. Boundary crossings happen only ~once per √n appends, though, so the median append latency stays in the low µs range across all n (this is what the benchmarks show); the asymptotic only shows up in mean or total cost.

TimeIndexedLiveFold layers a parallel monotonically non-decreasing timestamp list on top. query_time_range(start, end) calls bisect_left/bisect_right to map timestamps to indices in O(log n), then routes through the same √n-decomposed query path — so overall query cost stays O(√n).

For the full derivation, complexity analysis, and other implementation details, see the corresponding blog post.

Fold contract

A fold is a single-argument callable fn(items) -> result that:

  1. Accepts an iterable of elements (or, when called internally to combine block results, an iterable of prior fold results) and returns a single value.
  2. Is associative: fn([fn([a, b]), fn([c, d])]) == fn([a, b, c, d]). This lets query combine precomputed block-level folds with the partial folds at each end.
  3. Returns a value of the same shape it accepts as elements — i.e., feeding the result back through fn together with other results must work. len is a common-but-broken choice: it returns an int regardless of input shape, so len([len(block_a), len(block_b)]) gives 2, not the total element count.

Examples that satisfy the contract: sum, max, min, math.prod, math.gcd (via functools.reduce), bitwise or/and/xor, "".join, and any mergeable sketch (t-digest, HyperLogLog, Count-Min, Welford) wrapped in a fold-shaped callable. Commutativity is not required — string concatenation, matrix multiplication, and other ordered monoids work too.

Constraints

  • Not thread-safe. Single-process, single-thread workloads only.

Contributing

See CONTRIBUTING.md.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

livefold-0.1.1.tar.gz (149.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

livefold-0.1.1-py3-none-any.whl (9.0 kB view details)

Uploaded Python 3

File details

Details for the file livefold-0.1.1.tar.gz.

File metadata

  • Download URL: livefold-0.1.1.tar.gz
  • Upload date:
  • Size: 149.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.0 {"installer":{"name":"uv","version":"0.10.0","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for livefold-0.1.1.tar.gz
Algorithm Hash digest
SHA256 17e54b0d3a664f6e78c38231a996d079af22b40b5c8eb35b0483e3654f25857b
MD5 cd912f68d70c0ab30e4995e7cd687371
BLAKE2b-256 2f5c0e76101bbca86e97a2d9b52dd2abba66991642b3b81c924c6a1e708f66be

See more details on using hashes here.

File details

Details for the file livefold-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: livefold-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 9.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.0 {"installer":{"name":"uv","version":"0.10.0","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for livefold-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 2cd13b1050eb6492d56f095f2b5e29171864a8491917b2d321694894ee7616a0
MD5 da89ead1eb7e69a7b12fcd4c8c5f3b14
BLAKE2b-256 75ddec5f3eaa3d9f0c62a245b51d262c219cf3094a4089a304f56b7c869dc07a

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page