lairs

A read/write dataset client for the Layers format, built on didactic.

Python 3.14+ · License: MIT

Tutorial · Guides · Concepts · API · Development


lairs is a Python client for reading and writing data in the Layers format. It downloads pub.layers.* records from ATProto Personal Data Servers, validates them against models generated from the Layers lexicons, holds them in memory or in a local content-addressed store, and exposes them through a datasets-like API with tooling for the modalities Layers carries: audio, video, and time-series signals. On the write side it constructs records, uploads media blobs, and publishes records in bulk to the authenticated user's own repository, with the local store doubling as schema-aware version control.
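To make "content-addressed store" concrete, here is a minimal sketch of the idea, not the lairs implementation. The `ContentStore` class and its methods are hypothetical names, and it hashes canonical JSON with SHA-256 purely for illustration (ATProto itself addresses records by CID). The point is that the key is derived from the content, so identical records are stored once.

```python
import hashlib
import json


class ContentStore:
    """Toy in-memory content-addressed store (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, record: dict[str, object]) -> str:
        # Canonicalise the record so logically equal records share one key.
        data = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str) -> dict[str, object]:
        # Look the record up by the hash of its content.
        return json.loads(self._blobs[key])

    def __len__(self) -> int:
        return len(self._blobs)


store = ContentStore()
first = store.put({"text": "hello", "lang": "en"})
second = store.put({"lang": "en", "text": "hello"})  # same content, different key order
```

Because both calls produce the same key, the store holds a single entry, which is what makes deduplication and versioning of unchanged records cheap.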

The mental model: Hugging Face datasets plus git, for decentralised linguistic annotation.

lairs is built on didactic, which is built on panproto. Every structured value in lairs is a didactic model. The project never uses dataclasses, pydantic, or ad-hoc classes for its data, and type hints never use Any.

The ATProto lexicons are the single source of truth. The pub.layers.* models are not written by hand. They are generated from the vendored lexicons and committed to the repository. Updating to a new Layers version is a re-vendor, a regeneration, and a drift check (lairs gen --check).
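A drift check of this kind can be sketched as "regenerate, then diff against what is committed". The example below is an assumption-laden illustration, not what `lairs gen --check` actually runs: `render_model` and `check_drift` are hypothetical helpers, and the "lexicon" is a toy dict rather than a real ATProto lexicon.

```python
import difflib


def render_model(lexicon: dict[str, object]) -> str:
    """Hypothetical generator: render a toy lexicon dict as model source text."""
    name = str(lexicon["name"])
    fields = lexicon["fields"]
    assert isinstance(fields, dict)
    lines = [f"class {name}:"]
    # Sort fields so the output is deterministic and diffs are stable.
    lines += [f"    {field}: {typ}" for field, typ in sorted(fields.items())]
    return "\n".join(lines) + "\n"


def check_drift(lexicon: dict[str, object], committed: str) -> list[str]:
    """Return a unified diff; an empty list means the committed text is current."""
    fresh = render_model(lexicon)
    return list(
        difflib.unified_diff(
            committed.splitlines(),
            fresh.splitlines(),
            "committed",
            "generated",
            lineterm="",
        )
    )


lexicon = {"name": "Expression", "fields": {"text": "str"}}
committed = render_model(lexicon)
updated = {"name": "Expression", "fields": {"text": "str", "lang": "str"}}
```

A CI job would exit non-zero whenever `check_drift` returns a non-empty diff, which forces regenerated models to be committed alongside any lexicon update.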

Installation

The core install carries no integration dependencies. Each integration is an optional extra, discovered at runtime through entry points, so importing lairs never imports an integration's dependency.

pip install lairs                 # core
pip install "lairs[hf]"           # HuggingFace datasets and Hub
pip install "lairs[torch]"        # PyTorch exporter
pip install "lairs[audio]"        # audio decoding
pip install "lairs[conllu]"       # the CoNLL-U codec

Usage

import lairs

corpus = lairs.load_corpus(
    "at://did:plc:abc/pub.layers.corpus.corpus/ud-en",
    source="pds",
)
print(len(corpus.expressions))
print(corpus.expressions[0].text)
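The first argument above is an AT URI, which names a record by repository authority, collection, and record key. The helper below is a minimal sketch of how such a URI splits apart; `AtUri` and `parse_at_uri` are hypothetical names, not part of the lairs API, and the parser handles only the three-part record form.

```python
from typing import NamedTuple


class AtUri(NamedTuple):
    authority: str   # the repository, here a DID
    collection: str  # the lexicon NSID of the record type
    rkey: str        # the record key within that collection


def parse_at_uri(uri: str) -> AtUri:
    """Split an at:// record URI into authority, collection, and record key."""
    prefix = "at://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an AT URI: {uri!r}")
    parts = uri[len(prefix):].split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected at://authority/collection/rkey, got {uri!r}")
    return AtUri(*parts)


parsed = parse_at_uri("at://did:plc:abc/pub.layers.corpus.corpus/ud-en")
```

Here `parsed.authority` is the account's DID, `parsed.collection` is the `pub.layers.*` record type, and `parsed.rkey` identifies the corpus within that collection.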

The lairs command vendors lexicons, regenerates models, and pulls, materialises, publishes, and inspects corpora:

lairs gen --check          # fail if the committed models drift from the lexicons
lairs pull did:plc:abc     # ingest an account's records into a local repository
lairs materialize <uri>    # build Arrow and Parquet views
lairs publish --repo ... --revision v0.1 --to did:plc:abc   # dry-run plan by default

Documentation

The documentation follows the Diátaxis structure: a tutorial, task-oriented guides, conceptual explanation, and an API reference rendered from the source docstrings. Build it locally with:

uv run --group docs mkdocs serve

Development

uv sync
uv run ruff format --check lairs tests
uv run ruff check lairs tests
uv run ty check
uv run pytest                    # unit tests only
uv run pytest --run-integration  # include integration tests (docker, network, extras)

See CONTRIBUTING.md for the full contribution guide and the Development section of the documentation for testing, code generation, and the release process. All participants are expected to follow the Code of Conduct.

Changelog

Notable changes are recorded in CHANGELOG.md.

License

lairs is released under the MIT License.
