Interval annotation system
lacing
A standoff, interval-keyed annotation system. Pythonic core: a
MutableMapping[TimeInterval, list[Annotation]] facade with rational time,
ELAN-style tier stereotypes, and Allen's interval algebra. Designed for
time-based media (audio, video, speech, music) but generalizes to any 1-D
interval domain.
Status: Phase 0–1. Core data model, in-memory + SQLite stores, five round-trip adapters (Praat TextGrid, WebVTT, W3C Web Annotation,
.annot SQLite, ELAN EAF), inter-annotator agreement metrics, and a lacing CLI (convert, query, validate, list-formats). Server and frontend are on the roadmap (see misc/docs/Lacing Development Roadmap.md).
Install
pip install lacing # core only
pip install 'lacing[textgrid]' # + Praat TextGrid support (praatio)
pip install 'lacing[eaf]' # + ELAN EAF support (pympi-ling)
pip install 'lacing[postgres]' # + PostgresStore (psycopg + GiST + EXCLUDE)
30-second tour
from lacing.adapters import textgrid, webvtt, web_annotation # registers each
from lacing.adapters import load, dump
# Load a Praat TextGrid → an in-memory store keyed by interval
store = load("speech.TextGrid", rate=1000)
# Query overlaps using Allen's relations
from lacing.time import RationalTime, TimeInterval
window = TimeInterval(RationalTime(500, 1000), RationalTime(1500, 1000))
for ann in store.intersects(window):
    print(ann.tier, ann.body["text"])
for ann in store.during(window):  # strictly inside the window
    ...
# Save out as WebVTT
dump(store, "speech.vtt", format="webvtt")
# Or as W3C Web Annotation JSON-LD
dump(store, "speech.jsonld", format="web_annotation")
What's in the core
lacing/
├── time.py RationalTime + TimeInterval — rational, half-open, never float
├── model.py Annotation envelope + Reference union + Provenance (PROV-O subset)
├── tier.py Tier + 5 ELAN tier stereotypes + constraint validator
├── allen.py 13 Allen relations + intersects + relate + composition
├── store/
│ ├── base.py IntervalAnnotationStore (MutableMapping facade)
│ ├── memory.py MemoryStore over `intervaltree`
│ ├── sqlite.py SqliteStore — persistent backend + .annot file format
│ └── postgres.py PostgresStore — int8range + GiST + per-tier EXCLUDE
├── adapters/
│ ├── textgrid.py Praat .TextGrid (interval + point tiers)
│ ├── webvtt.py .vtt subtitles/captions
│ ├── web_annotation.py W3C Web Annotation Data Model (JSON-LD)
│ ├── annot.py .annot SQLite portable file format (lossless)
│ └── eaf.py ELAN EAF (4 stereotypes verbatim)
├── cli.py `lacing` CLI: convert, query, validate, list-formats
└── quality.py Cohen's κ, Krippendorff's α, interval IoU, boundary IoU
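To make the tier constraint validator more concrete, here is a minimal standalone sketch of what checking child annotations against a parent could look like. It does not use lacing's API: `check_children` and the plain `(start, end)` tuples are hypothetical, and the rules assume ELAN's documented semantics (INCLUDED_IN children stay inside the parent and may leave gaps; TIME_SUBDIVISION children must tile the parent with no gaps).

```python
def check_children(parent, children, stereotype):
    """Return a list of problems; an empty list means the children are valid.

    parent and each child are half-open (start, end) pairs in integer ticks.
    stereotype is "INCLUDED_IN" or "TIME_SUBDIVISION" (hypothetical string form).
    """
    problems = []
    ordered = sorted(children)

    # Both stereotypes: every child must lie inside the parent interval.
    for start, end in ordered:
        if start < parent[0] or end > parent[1]:
            problems.append(f"child [{start}, {end}) leaves the parent")

    # Neighbouring children may never overlap; gaps are only an error
    # when the tier subdivides the parent.
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start < prev_end:
            problems.append(f"children overlap at {next_start}")
        elif stereotype == "TIME_SUBDIVISION" and next_start > prev_end:
            problems.append(f"gap between {prev_end} and {next_start}")

    # A subdivision must also start and end exactly where the parent does.
    if stereotype == "TIME_SUBDIVISION" and ordered:
        if ordered[0][0] != parent[0] or ordered[-1][1] != parent[1]:
            problems.append("children do not span the parent")

    return problems


utterance = (0, 1000)
tiled = check_children(utterance, [(0, 400), (400, 1000)], "TIME_SUBDIVISION")
gappy = check_children(utterance, [(0, 400), (600, 1000)], "TIME_SUBDIVISION")
loose = check_children(utterance, [(0, 400), (600, 1000)], "INCLUDED_IN")
```

The same pair of children is rejected as a subdivision but accepted as an inclusion, which is the practical difference between the two stereotypes.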
Design rules in one breath
- Time is rational: RationalTime(value: int, rate: int). Wire format {v, r}. Never floats.
- Standoff: annotations reference media by (asset_id, interval); source is immutable.
- One envelope, typed body: Annotation.body: dict validated by body_schema_uri (semver).
- Allen's algebra is the public predicate API: never write ad-hoc overlap checks.
- ELAN tier stereotypes verbatim: NONE, TIME_SUBDIVISION, INCLUDED_IN, SYMBOLIC_SUBDIVISION, SYMBOLIC_ASSOCIATION.
- PROV-O provenance inline on every annotation: was_generated_by, was_attributed_to, was_derived_from, generated_at_time.
- MIT/BSD/Apache licenses only.
The full reasoning lives in misc/docs/ — four design docs
covering annotation systems generally, backend architecture, frontend UI,
and an OSS deep-dive of what to build on. The synthesized plan is in
misc/docs/Lacing Development Roadmap.md.
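The "time is rational" and half-open rules are easiest to see in a few lines of code. The sketch below is not lacing's implementation: `SketchTime` and `SketchInterval` are hypothetical stand-ins built on the standard library's `fractions.Fraction`, and only the `(value, rate)` pair and the `{v, r}` wire shape come from the rules above.

```python
from __future__ import annotations

from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class SketchTime:
    """Hypothetical stand-in for RationalTime: `value` ticks at `rate` ticks per second."""

    value: int
    rate: int

    def as_fraction(self) -> Fraction:
        # Exact seconds; Fraction stays in integer arithmetic throughout.
        return Fraction(self.value, self.rate)

    def to_wire(self) -> dict[str, int]:
        # The {v, r} shape follows the wire format named in the design rules.
        return {"v": self.value, "r": self.rate}


@dataclass(frozen=True)
class SketchInterval:
    """Half-open [start, end): the start is included, the end is not."""

    start: SketchTime
    end: SketchTime

    def contains_point(self, t: SketchTime) -> bool:
        return self.start.as_fraction() <= t.as_fraction() < self.end.as_fraction()


half_second = SketchTime(500, 1000)
same_instant = SketchTime(24000, 48000)  # 0.5 s expressed at an audio sample rate
window = SketchInterval(SketchTime(0, 1000), half_second)

# Different rates, same instant: the comparison is exact, with no rounding.
rates_agree = half_second.as_fraction() == same_instant.as_fraction()
# The end of a half-open interval is not part of it.
end_is_inside = window.contains_point(half_second)
```

Because the end point is excluded, two back-to-back intervals such as [0, 0.5) and [0.5, 1.0) share no instant, so adjacent annotations never count as overlapping.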
Concrete recipes
Build annotations programmatically
from uuid import uuid4
from lacing import (
    Annotation, MediaRef, MemoryStore, Provenance,
    RationalTime, TimeInterval, Tier,
)
store = MemoryStore()
store.add_tier(Tier("words"))
store.add(Annotation(
    id=uuid4(),
    tier="words",
    reference=MediaRef(
        asset_id="blake3:abc123",
        interval=TimeInterval.from_seconds("0.0", "0.5", rate=1000),
    ),
    body={"text": "hello"},
    body_schema_uri="annot://schema/word/v1",
    provenance=Provenance(
        was_generated_by="user:thor",
        was_attributed_to="thor",
        generated_at_time=RationalTime.zero(1000),
    ),
))
Query with Allen's relations
from lacing.allen import AllenRelation
from lacing.time import RationalTime, TimeInterval
w = TimeInterval(RationalTime(0, 1000), RationalTime(500, 1000))
list(store.intersects(w)) # any overlap
list(store.during(w)) # strictly inside w
list(store.contains(w)) # strictly contains w
list(store.relate(w, [AllenRelation.MEETS])) # ends at w.start
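For readers new to Allen's algebra, the following standalone sketch shows how the 13 relations fall out of endpoint comparisons. It is an illustration only, not lacing's `allen.py`: `allen_relation`, the lowercase relation names, and the use of plain integer `(start, end)` pairs are all assumptions made for the example.

```python
from __future__ import annotations


def allen_relation(a: tuple[int, int], b: tuple[int, int]) -> str:
    """Name the Allen relation of a to b; both are (start, end) with start < end."""
    a_start, a_end = a
    b_start, b_end = b
    if a_start >= a_end or b_start >= b_end:
        raise ValueError("intervals must have positive length")

    # Disjoint or touching cases first.
    if a_end < b_start:
        return "before"
    if a_end == b_start:
        return "meets"
    if b_end < a_start:
        return "after"
    if b_end == a_start:
        return "met_by"

    # From here on the intervals share at least one instant.
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started_by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished_by"
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    return "overlaps" if a_start < b_start else "overlapped_by"


w = (1000, 1500)
touching = allen_relation((500, 1000), w)    # "meets": ends exactly at w's start
inside = allen_relation((1100, 1200), w)     # "during": strictly inside w
straddling = allen_relation((900, 1100), w)  # "overlaps": starts before w, ends inside it
```

Exactly one of the 13 names applies to any pair of intervals, which is why a single `relate` call with a list of relations can replace hand-written overlap checks. Under half-open intervals, "meets" and "met_by" share no instant, so a plausible reading of `intersects` is "any relation other than before, after, meets, met_by"; treat that as an assumption rather than a statement of lacing's behavior.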
Persist annotations
from lacing.store import SqliteStore
# Open or create a .annot file (SQLite under the hood)
store = SqliteStore("project.annot")
store.add_tier(...)
store.add(...) # writes go straight to disk
store.set_meta("project", "demo")
# Same MutableMapping + Allen-relation interface as MemoryStore
for ann in store.intersects(window):
    ...
store.close()
The .annot file is the recommended portable handoff format — single-file
SQLite, Git-trackable, lossless round-trip with MemoryStore.
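To give a feel for why a single SQLite file works well for interval data, here is a minimal sketch using only the standard library's `sqlite3`. The table and column names are hypothetical and are not the actual .annot schema; the point is that integer tick columns make the half-open overlap test a two-comparison `WHERE` clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a real project would pass a file path instead
conn.execute(
    """
    CREATE TABLE annotation (
        id          INTEGER PRIMARY KEY,
        tier        TEXT    NOT NULL,
        start_ticks INTEGER NOT NULL,
        end_ticks   INTEGER NOT NULL CHECK (end_ticks > start_ticks),
        body        TEXT    NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO annotation (tier, start_ticks, end_ticks, body) VALUES (?, ?, ?, ?)",
    [
        ("words", 0, 500, "hello"),
        ("words", 500, 900, "world"),
        ("words", 2000, 2400, "again"),
    ],
)


def overlapping(conn, start, end):
    """Bodies of annotations sharing at least one instant with [start, end)."""
    rows = conn.execute(
        "SELECT body FROM annotation "
        "WHERE start_ticks < ? AND end_ticks > ? "
        "ORDER BY start_ticks",
        (end, start),
    )
    return [body for (body,) in rows]


hits = overlapping(conn, 400, 600)        # ["hello", "world"]
after_hello = overlapping(conn, 500, 600)  # ["world"]: "hello" ends where the window starts
```

Storing times as integers rather than floats keeps the comparison exact, so an annotation ending at tick 500 never leaks into a window that starts at tick 500.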
For multi-user / production scale, the same facade is available over PostgreSQL:
from lacing.store import PostgresStore
from lacing.tier import Tier
store = PostgresStore("postgresql://localhost/myproject", rate=1000)
# Per-tier non-overlap is enforced declaratively by the database — try to
# add an overlapping annotation in this tier and Postgres rejects the insert.
store.add_tier(Tier("speakers"), enforce_no_overlap=True)
The Postgres backend uses int8range + GiST (sub-millisecond overlap
queries at million-row scale) and exposes the same Allen-relation
methods. Times are normalized to a project-wide rate stored in meta.
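Normalizing to a project-wide rate means re-expressing every rational time as an integer tick count, which is what an integer range column can hold. The sketch below shows one way that conversion could work. `to_project_ticks` is a hypothetical helper, and raising on inexact conversions is an assumption: lacing may round or handle such cases differently.

```python
from fractions import Fraction


def to_project_ticks(value: int, rate: int, project_rate: int) -> int:
    """Re-express value/rate seconds as an integer tick count at project_rate."""
    ticks = Fraction(value, rate) * project_rate
    if ticks.denominator != 1:
        # Assumption: refuse rather than silently round an inexact time.
        raise ValueError(
            f"{value}/{rate} s is not representable at rate {project_rate}"
        )
    return ticks.numerator


# 24000 samples at 48 kHz is exactly 0.5 s, i.e. 500 ticks at rate 1000.
start_ticks = to_project_ticks(24000, 48000, 1000)
end_ticks = to_project_ticks(3, 2, 1000)  # 1.5 s -> 1500 ticks

# One sample at 48 kHz is 1/48 of a millisecond, so it has no exact
# representation at rate 1000 and the helper raises.
try:
    to_project_ticks(1, 48000, 1000)
    inexact_rejected = False
except ValueError:
    inexact_rejected = True
```

The resulting pair `(start_ticks, end_ticks)` maps naturally onto a half-open integer range, so choosing a project rate high enough for the finest source material avoids inexact conversions altogether.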
CLI
After pip install -e . the lacing command is on your PATH:
lacing list-formats # show registered adapters
lacing convert speech.TextGrid speech.annot # convert between formats
lacing query speech.annot --start 1.0 --end 5.0 --rate 1000 # JSON-lines
lacing validate speech.annot # parse + summary
Inter-annotator agreement
from lacing.quality import cohen_kappa, krippendorff_alpha, boundary_iou
# Two annotators on a categorical task
kappa = cohen_kappa(["A", "B", "A", "B"], ["A", "A", "A", "B"])
# Three annotators with missing data
alpha = krippendorff_alpha([
    ["A", "B", None, "C"],
    ["A", "B", "B", "C"],
    ["A", "A", "B", "C"],
])
# Compare two segmentations
score = boundary_iou(
    [a.interval for a in store_a.by_tier("speakers")],
    [a.interval for a in store_b.by_tier("speakers")],
)
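As a worked check on the two-annotator example above, here is a standalone sketch of Cohen's κ computed from its textbook definition, κ = (p_o − p_e) / (1 − p_e). `cohen_kappa_sketch` is a hypothetical name and is not `lacing.quality.cohen_kappa`, whose return type and edge-case handling may differ; exact fractions are used here only so the arithmetic is easy to follow.

```python
from collections import Counter
from fractions import Fraction


def cohen_kappa_sketch(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items, as an exact Fraction."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)

    # p_o: fraction of items on which the raters agree.
    observed = Fraction(sum(a == b for a, b in zip(labels_a, labels_b)), n)

    # p_e: agreement expected by chance from each rater's label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(Fraction(counts_a[k] * counts_b[k], n * n) for k in counts_a)

    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Same labels as the example above:
# p_o = 3/4, p_e = (2*3 + 2*1)/16 = 1/2, so kappa = (3/4 - 1/2) / (1 - 1/2) = 1/2.
kappa = cohen_kappa_sketch(["A", "B", "A", "B"], ["A", "A", "A", "B"])
```

A value of 1/2 says the annotators agree halfway between chance level (0) and perfect agreement (1), even though their raw agreement is 75%.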
License
MIT.