lairs

A read/write dataset client for the Layers format, built on didactic.

Python 3.14+ · License: MIT


lairs is a Python client for reading and writing data in the Layers format. It downloads pub.layers.* records from ATProto Personal Data Servers, validates them against models generated from the Layers lexicons, holds them in memory or in a local content-addressed store, and exposes them through a datasets-like API with tooling for the modalities Layers carries: audio, video, and time-series signals. On the write side it constructs records, uploads media blobs, and publishes records in bulk to the authenticated user's own repository, with the local store doubling as schema-aware version control.
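To make "content-addressed store" concrete, here is a minimal sketch of the idea, not lairs's actual storage layer. The `ContentStore` class and its methods are hypothetical. ATProto itself addresses records by CID; this sketch swaps in SHA-256 over canonical JSON to stay within the standard library.

```python
import hashlib
import json


class ContentStore:
    """Hypothetical in-memory store whose keys are hashes of the stored bytes."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, record: dict[str, object]) -> str:
        # Serialise canonically so equal records always produce the same key.
        data = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
        key = hashlib.sha256(data).hexdigest()
        # Storing the same content twice is a no-op: the key already exists.
        self._blobs.setdefault(key, data)
        return key

    def get(self, key: str) -> dict[str, object]:
        return json.loads(self._blobs[key])

    def __len__(self) -> int:
        return len(self._blobs)
```

Because the key is derived from the content, identical records deduplicate automatically and any change to a record yields a new key, which is the property that lets a store of this kind double as version control.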

The mental model: datasets and git for decentralised linguistic annotation.

lairs is built on didactic, which is built on panproto. Every structured value in lairs is a didactic model. The project never uses dataclasses, pydantic, or ad-hoc classes for its data, and type hints never use Any.

The ATProto lexicons are the single source of truth. The pub.layers.* models are not written by hand. They are generated from the vendored lexicons and committed to the repository. Updating to a new Layers version is a re-vendor, a regeneration, and a drift check (lairs gen --check).
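The drift check can be pictured as "regenerate, then diff against what is committed". The sketch below illustrates that idea only; it is not how `lairs gen --check` is implemented. `render_model` is a hypothetical stand-in for the real generator, and `drift` is an assumed helper name.

```python
import difflib


def render_model(nsid: str) -> str:
    """Hypothetical stand-in generator; real output would be a didactic model."""
    return f"# generated from {nsid}; do not edit\nNSID = {nsid!r}\n"


def drift(committed: str, generated: str) -> list[str]:
    """Return a unified diff; an empty list means the committed text is current."""
    return list(
        difflib.unified_diff(
            committed.splitlines(),
            generated.splitlines(),
            fromfile="committed",
            tofile="generated",
            lineterm="",
        )
    )
```

A check command built this way would exit non-zero whenever `drift(...)` returns any lines, so CI fails if someone edits a generated file by hand or forgets to regenerate after re-vendoring.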

Installation

The core install carries no integration dependencies. Each integration is an optional extra, discovered at runtime through entry points, so importing lairs never imports an integration's dependency.

pip install lairs                 # core
pip install "lairs[hf]"           # Hugging Face datasets and Hub
pip install "lairs[torch]"        # PyTorch exporter
pip install "lairs[audio]"        # audio decoding
pip install "lairs[conllu]"       # the CoNLL-U codec

Usage

import lairs

corpus = lairs.load_corpus(
    "at://did:plc:abc/pub.layers.corpus.corpus/ud-en",
    source="pds",
)
print(len(corpus.expressions))
print(corpus.expressions[0].text)
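The first argument above is an AT URI, which names a record by repository (a DID or handle), collection (an NSID such as `pub.layers.corpus.corpus`), and record key. The sketch below shows how such a URI decomposes. `AtUri` and `parse_at_uri` are hypothetical helpers written for this example, not part of lairs, and they handle only full record URIs with all three parts.

```python
from typing import NamedTuple


class AtUri(NamedTuple):
    authority: str   # DID or handle of the repository
    collection: str  # NSID of the record type
    rkey: str        # record key within the collection


def parse_at_uri(uri: str) -> AtUri:
    """Split a full record AT URI into its three components."""
    prefix = "at://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an AT URI: {uri!r}")
    parts = uri[len(prefix):].split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected at://authority/collection/rkey, got {uri!r}")
    return AtUri(*parts)
```

For the URI in the usage example, the authority is `did:plc:abc`, the collection is `pub.layers.corpus.corpus`, and the record key is `ud-en`.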

The lairs command vendors lexicons, regenerates models, and pulls, materialises, publishes, and inspects corpora:

lairs gen --check          # fail if the committed models drift from the lexicons
lairs pull did:plc:abc     # ingest an account's records into a local repository
lairs materialize <uri>    # build Arrow and Parquet views
lairs publish --repo ... --revision v0.1 --to did:plc:abc   # dry-run plan by default
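A dry-run plan separates deciding what to publish from performing the writes. The following sketch shows the general pattern under simple assumptions: records are compared by content digest, and `plan_publish` is a hypothetical function, not the logic behind `lairs publish`.

```python
from typing import Literal

Action = Literal["create", "update", "skip"]


def plan_publish(local: dict[str, str], remote: dict[str, str]) -> dict[str, Action]:
    """Decide per record key what a publish would do; performs no I/O.

    Both arguments map a record key to a content digest (an assumption
    made for this sketch).
    """
    plan: dict[str, Action] = {}
    for rkey, digest in local.items():
        if rkey not in remote:
            plan[rkey] = "create"   # record does not exist remotely yet
        elif remote[rkey] != digest:
            plan[rkey] = "update"   # record exists but its content differs
        else:
            plan[rkey] = "skip"     # identical content, nothing to send
    return plan
```

Printing the plan and stopping gives the dry run; applying it is a separate, explicit step, which makes bulk writes to a user's repository easier to review beforehand.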

Documentation

The documentation follows the Diátaxis structure: a tutorial, task-oriented guides, conceptual explanation, and an API reference rendered from the source docstrings. Build it locally with:

uv run --group docs mkdocs serve

Development

uv sync
uv run ruff format --check lairs tests
uv run ruff check lairs tests
uv run ty check
uv run pytest                    # unit tests only
uv run pytest --run-integration  # include integration tests (docker, network, extras)

See CONTRIBUTING.md for the full contribution guide and the Development section of the documentation for testing, code generation, and the release process. All participants are expected to follow the Code of Conduct.

Changelog

Notable changes are recorded in CHANGELOG.md.

License

lairs is released under the MIT License.
