Official Python SDK for the Entertainment Identifier Registry (EIDR)

eidr — Official Python SDK for the Entertainment Identifier Registry

eidr is the official Python SDK for the Entertainment Identifier Registry. It provides typed, ergonomic access to EIDR's Content, Party, and Video Service ID registries over the native XML REST API.

Status: 0.1.0rc1 — first release candidate. The public API contract documented in STABILITY.md takes effect at this release; remaining changes before 1.0.0 are limited to release-blocking defect corrections. See CHANGELOG.md for what's new and the STABILITY.md "Breaking changes by version" section if upgrading from a beta release.

What's implemented today

  • Codec layer — XML ↔ intermediate dict ↔ JSON conversion following the MovieLabs MDDF JSON Encoding Best Practice. Supports both infoset round-trip and canonical output (W3C Canonical XML 2.0 for XML; RFC 8785 JCS for JSON, when rfc8785 is installed).
  • Typed records: ContentRecord, PartyRecord, ServiceRecord wrappers over the codec dict, with structured property accessors, pre-submission validation per the EIDR submission profile, and a root-element-driven eidr.parse() dispatcher.
  • EIDR IDs: EIDRID value type with ISO 7064 Mod 37,36 check-character validation and lossless conversions between canonical DOI form, URN form, and bare suffix.
  • Credentials — five sources, all unified under Credentials.load(): EIDR XML config file, JSON file, AWS Secrets Manager (with [aws] extra), environment variables, and direct construction.
  • Sync HTTP client: eidr.Client, gated on the [client] extra. Wraps transport, authentication, retry policy, and response parsing. Operations:
    • Content reads: resolve, query, graph_traversal, status_lookup.
    • Content writes: register, match, modify, delete, promote, alias, add_relationship, remove_relationship, replace_relationship. Each supports immediate=, async tokens, and optional polling with wait_timeout=.
    • Video Service: service_query (read), create_service, modify_service, delete_service, alias_service, set_service_parent, service_children.
    • Party: party_query (read), create_party, modify_party, delete_party, alias_party, activate_party, deactivate_party, change_party_password.
    • Virtual fields: virtual_fields(asset_id) returning a VirtualFields value object with full/self_defined/alias serialized views.
  • Async HTTP client: eidr.AsyncClient, same operations as Client but with async def methods backed by httpx.AsyncClient. Use with async with. AsyncToken mirrors Token with awaitable poll/wait/operation_result. The sync-under-load fallback case (registry deferring an immediate=True Create/Modify to async under load, per REST API §2.1.1) surfaces as a returned Token/AsyncToken instead of a record, so callers should handle both return types; Token.to_async(async_client) bridges a sync-fallback token into an async workflow when needed.
  • Tracing — built-in support-diagnostics facility. TraceSink protocol with LoggerSink, FileSink, and ListSink implementations. Sensitive headers (Authorization, Cookie, etc.) redacted at capture time. Body size limits and "safe support bundle" body-redaction modes available via TransportConfig.
  • Typed Digital sub-model: eidr.models.digital, gated on the [digital] extra. Auto-generated from the bundled XSD by xsdata-pydantic at SDK-development time and shipped pre-built; end users never run the codegen. Provides typed access to a Manifestation's <Digital> sub-block (audio / video / subtitle / interactive tracks; container-level packaging metadata) via manifestation.digital_typed() / set_digital() accessors. The raw codec dict on manifestation.digital is unchanged for pass-through and dedup workflows that compare whole-block equality. The codec dict layer remains the source of truth; the typed view is transient (each call materializes a fresh pydantic instance from the dict).
  • Type-hinted throughout (PEP 561 py.typed); mypy --strict clean across all 48 source files.
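
The check-character rule named in the EIDR IDs item above can be sketched with the standard library alone. This is a minimal illustration, not the SDK's EIDRID implementation: the function names are hypothetical, the check is assumed to be computed over the 20 hexadecimal digits of a content ID suffix (hyphens removed), and the URN layout is assumed to be urn:eidr:<prefix>:<suffix>. Under those assumptions it reproduces the check characters of the two content IDs quoted elsewhere in this README.

```python
import string

# Character values for ISO 7064 Mod 37,36: digits 0-9, then letters A-Z.
ALPHABET = string.digits + string.ascii_uppercase


def mod_37_36_check(payload: str) -> str:
    """Return the ISO 7064 Mod 37,36 check character for `payload`."""
    p = 36
    for ch in payload.upper():
        s = (p + ALPHABET.index(ch)) % 36 or 36  # a zero sum wraps to 36
        p = (s * 2) % 37
    return ALPHABET[(37 - p) % 36]


def is_valid(doi: str) -> bool:
    """Validate a DOI-form ID such as '10.5240/XXXX-XXXX-XXXX-XXXX-XXXX-C'."""
    _prefix, suffix = doi.split("/", 1)
    body, check = suffix.rsplit("-", 1)
    # Assumption: only the hex digits of the suffix feed the checksum.
    return mod_37_36_check(body.replace("-", "")) == check.upper()


def to_urn(doi: str) -> str:
    """DOI form to URN form (layout assumed, see the note above)."""
    prefix, suffix = doi.split("/", 1)
    return f"urn:eidr:{prefix}:{suffix}"
```

With these helpers, is_valid("10.5240/0000-02ED-1DCE-6AAF-99F7-M") is True, and changing the final character makes it False.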

Implementation status

The SDK has reached functional and ergonomic parity with the Java SDK on the core public registry-operation surface as of 0.1.0b3 (M13). Java-only surfaces excluded by design — batch operations, user/ACL admin, UserOverride/impersonation — are listed under "Out of scope" below.

Read paths (all fully typed):

  • resolve() / resolve_party() / resolve_service() — single-record lookups. Returns typed ContentRecord / Party / Service.
  • query() — content asset query (XPath-style filter strings). Page-by-page access to a typed QueryResults.
  • iter_query() / iter_query_ids() — auto-paging iterators over asset-query results, yielding ContentRecord and ID strings respectively.
  • find_parties_by_name(), find_parties_from_catalog(), find_services_by_name(), find_services_from_catalog() — typed query helpers returning PartyQueryResults / ServiceQueryResults.
  • party_query() / service_query() — escape hatch for advanced query bodies; takes raw XML and returns ParsedResponse.
  • graph_traversal() / service_parent() / service_children() — service-graph reads.
  • modification_base() — fetch a record body suitable for use as a modify() starting point. Requires creation_type. Use modification_base_auto(asset_id) to resolve-then-infer when the type isn't known statically.
  • iter_status_by_user() / iter_status_by_registrant() / iter_status_superparty() / iter_status_by_token() — auto-paging iterators yielding OperationStatusEntry.
  • virtual_fields(): search-index virtual fields (full / self-defined / alias views).

Write paths:

  • register() / modify() / delete() / alias() / promote() for content; create_party() / modify_party() / change_party_password() for parties; create_service() / modify_service() / delete_service() for services. Each is single-operation; batch is out of scope.
  • add_relationship() / replace_relationship() / remove_relationship() for asset relationships.

Async surface: every method has an exact async mirror on AsyncClient. The auto-paging iterators return AsyncIterator for use with async for.

Out of scope

The following surfaces are intentionally not exposed:

  • Batch operations (registerBatchFromXML, etc.) — the single-operation API is the public surface; callers wanting batches drop to the codec layer.
  • User admin and ACL admin (/user/*, AdminAcl.*) — internal Operations machinery, not for the public SDK.
  • Impersonation / UserOverride (per-operation user tokens with forced dedup flags) — Superparty-only feature, deferred pending a clean public surface design. Targeted post-1.0.

Roadmap to 1.0

Items planned but not blocking 1.0:

  • Schema validation expansion (SchemaSource to a full lxml URI resolver) — targeted for 1.1.

See the M9 cover letter for the full Python↔Java SDK variance table (kept current through M14). Subsequent cover letters narrate what changed in each milestone and link back to the variance table.

Installation

pip install eidr               # codec + records + IDs (no network)
pip install 'eidr[client]'     # adds the HTTP client (httpx)
pip install 'eidr[aws]'        # adds AWS Secrets Manager support
pip install 'eidr[digital]'    # adds the typed Digital sub-model (pydantic)
pip install 'eidr[client,aws]' # HTTP client and AWS support together

Quick start

Resolve

from eidr import Client, Credentials, registries

with Client(
    registries.SANDBOX2,
    Credentials.load(),  # or from_eidr_xml, from_json, from_aws_secret, etc.
) as client:
    record = client.resolve("10.5240/0000-02ED-1DCE-6AAF-99F7-M")
    print(record.id, record.resource_name)

Register a new record (synchronous)

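from eidr.models.content import ContentRecord

# `client` is an open Client, as in the Resolve example above;
# `my_record_bytes` holds the XML of the record to register.
record = ContentRecord.from_xml(my_record_bytes)
created = client.register(record, immediate=True)
# Under load the registry may defer and return a Token instead;
# see "Sync-under-load fallback" below.
print("Assigned ID:", created.id)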

Register with deferred polling (sync Client, registry-async write)

token = client.register(record, immediate=False)
# Persist token.value if you need to resume later.
result = token.operation_result(timeout=120)
if result.status.name == "SUCCESS":
    print("Registered:", result.id)
elif result.status.name == "PENDING":
    print("Still pending; sub-tokens:", result.sub_tokens)
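
The waiting behind a call such as operation_result(timeout=120) amounts to bounded polling. The sketch below is a generic illustration of that pattern, not the SDK's implementation: poll_until_settled and its check callback are hypothetical names, and only the SUCCESS and PENDING status names come from the example above.

```python
import time
from typing import Callable


def poll_until_settled(
    check: Callable[[], str],
    timeout: float,
    interval: float = 1.0,
) -> str:
    """Call `check` until it stops returning "PENDING" or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        status = check()
        if status != "PENDING":
            return status
        if time.monotonic() + interval > deadline:
            # Out of time: report PENDING and let the caller decide
            # whether to persist the token and resume later.
            return status
        time.sleep(interval)
```

Using time.monotonic() rather than wall-clock time keeps the deadline stable if the system clock is adjusted while the loop is waiting.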

Async workflow with AsyncClient

For programs using asyncio, AsyncClient mirrors Client's API with async def methods. Every operation — resolve, register, match, modify, delete, promote, alias, relationship ops, query, graph_traversal, party_query, service_query, status_lookup — has an async counterpart with the same signature. Returned Tokens become AsyncTokens whose poll/wait/operation_result methods are awaitable.

import asyncio
from eidr import AsyncClient, Credentials, registries

async def main():
    async with AsyncClient(
        registries.SANDBOX2,
        Credentials.load(),
    ) as client:
        # Read
        record = await client.resolve("10.5240/0000-02ED-1DCE-6AAF-99F7-M")

        # Registry-async write (AsyncToken returned; awaitable poll).
        # new_record is a ContentRecord prepared beforehand.
        token = await client.register(new_record, immediate=False)
        result = await token.operation_result(timeout=120)
        print(f"{result.status.name}: {result.id}")

asyncio.run(main())

Sync-under-load fallback. Per EIDR REST API §2.1.1, an immediate=True Create or Modify can be deflected to async by the registry when dedupe can't complete within the response window. Both Client.register and AsyncClient.register handle this: immediate=True will usually return a ContentRecord directly, but may return a Token/AsyncToken if the registry deferred. Deflection does not apply to match(), which always resolves inline.

Production callers should always handle both return types. The recommended idiom:

from eidr import Client, Token, ContentRecord, registries

with Client(registries.SANDBOX2, creds) as client:
    result = client.register(record, immediate=True)

    if isinstance(result, Token):
        # Registry deferred under load. Persist token.value if you
        # want to survive a process restart, then poll.
        op = result.operation_result(timeout=120)
        registered = op.record  # ContentRecord, or None on failure
    else:
        # Registry handled it inline.
        registered = result  # ContentRecord

    if registered is not None:
        print("Registered:", registered.id)

Equivalent for AsyncClient:

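from eidr import AsyncClient, AsyncToken, ContentRecord, registries

# Wrapped in a coroutine (illustrative name) because `async with`
# is only valid inside `async def`.
async def register_record(record, creds):
    async with AsyncClient(registries.SANDBOX2, creds) as client:
        result = await client.register(record, immediate=True)

        if isinstance(result, AsyncToken):
            # Registry deferred under load; await the outcome.
            op = await result.operation_result(timeout=120)
            registered = op.record  # ContentRecord, or None on failure
        else:
            registered = result  # ContentRecord

    return registered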

The SDK does not auto-wrap this. Auto-wrapping would hide the fallback case from callers who legitimately want to know whether their immediate registration completed inline (faster, no extra round-trips) versus deferred (caller may want to release the worker slot, queue the polling, etc.). The isinstance check is two lines of boilerplate for a meaningful semantic distinction.
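
Callers who repeat the check in many places can fold it into a small helper of their own while still keeping the inline-versus-deferred distinction. The sketch below is one possible shape, assuming only the operation_result(timeout=...) method and record attribute shown in the idiom above; the settle name and its token_type parameter are hypothetical and not part of the SDK.

```python
from typing import Any, Optional, Tuple


def settle(
    result: Any,
    token_type: type,
    timeout: float = 120.0,
) -> Tuple[Optional[Any], bool]:
    """Return (record, was_deferred) for either kind of register() result."""
    if isinstance(result, token_type):
        # Deferred: wait for the operation; record is None on failure.
        op = result.operation_result(timeout=timeout)
        return op.record, True
    # Inline: the write already returned the record itself.
    return result, False
```

A sync call site would pass Token as token_type; the boolean lets the caller log or schedule differently when the registry deferred the write.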

If you already have a sync Token (e.g., from a legacy sync call site) and want to await its completion from an async workflow, Token.to_async(async_client) converts it into an AsyncToken without re-issuing the write. The reverse conversion is not offered — running asyncio.run() from within sync code is almost always a sign of something wrong elsewhere.

Query

expr = "/FullMetadata/BaseObjectData/ReferentType IS Movie"
results = client.query(expr, page_number=1, page_size=50)
for record in results.records:
    print(record.id, record.resource_name)
print(f"Page 1 of ~{(results.total_matches + 49) // 50}")
if results.has_more_pages:
    # Reuse the same expression for the next page.
    next_page = client.query(expr, page_number=2, page_size=50)
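
Walking every page by hand follows a simple loop, which is what an auto-paging iterator hides. The sketch below shows that loop in isolation under stated assumptions: iter_all and the FetchPage callback are hypothetical stand-ins (the callback returns one page of items plus the total match count), and nothing here calls the SDK.

```python
from typing import Callable, Iterator, List, Tuple

# Hypothetical page fetcher: (page_number, page_size) -> (items, total_matches)
FetchPage = Callable[[int, int], Tuple[List[str], int]]


def iter_all(fetch_page: FetchPage, page_size: int = 50) -> Iterator[str]:
    """Yield every item across pages, requesting pages only as needed."""
    page_number = 1
    while True:
        items, total_matches = fetch_page(page_number, page_size)
        yield from items
        page_count = -(-total_matches // page_size)  # ceiling division
        if not items or page_number >= page_count:
            return
        page_number += 1
```

Because this is a generator, a consumer that stops early never triggers the remaining page requests.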

Graph traversal

from eidr import GraphTraversalType

descendants = client.graph_traversal(
    GraphTraversalType.FIND_DESCENDANTS,
    series_id,
    referent_type_filter="TV",
)

Tracing for support diagnostics

from eidr import FileSink

with Client(..., tracing=FileSink("/tmp/eidr-trace.log")) as client:
    client.resolve("10.5240/...")
# Trace file now contains every HTTP request/response with sensitive
# headers redacted. For sharing with third parties, also enable
# trace_redact_bodies=True via TransportConfig.
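
Redacting at capture time means the secret is masked before the trace entry is stored, so nothing sensitive is ever written. The sketch below illustrates the idea with a toy in-memory sink; it is not the SDK's TraceSink code, and the header set, the [REDACTED] marker, and the ListTrace class are assumptions made for the example.

```python
from typing import Dict, List, Mapping

# Assumed set of sensitive header names; the SDK's actual list may differ.
SENSITIVE_HEADERS = frozenset({"authorization", "cookie", "set-cookie"})


def redact_headers(headers: Mapping[str, str]) -> Dict[str, str]:
    """Copy `headers`, masking sensitive values (case-insensitive match)."""
    return {
        name: "[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }


class ListTrace:
    """Toy in-memory sink that stores only redacted request metadata."""

    def __init__(self) -> None:
        self.entries: List[Dict[str, object]] = []

    def record(self, method: str, url: str, headers: Mapping[str, str]) -> None:
        # Redact before storing so secrets never reach the trace.
        self.entries.append(
            {"method": method, "url": url, "headers": redact_headers(headers)}
        )
```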

Video Service writes

from eidr.models.service import ServiceRecord

with Client(registry, creds) as client:
    # Create a new service. The registry assigns the ID.
    new_svc = ServiceRecord.from_xml(b"""<?xml version="1.0"?>
        <Service xmlns="http://www.eidr.org/schema">
          <ServiceName>
            <DisplayName>My Streaming Service</DisplayName>
            <SortName>My Streaming Service</SortName>
          </ServiceName>
          <Active>true</Active>
        </Service>""")
    created = client.create_service(new_svc)
    # created is a ServiceRecord with the registry-assigned ID.

    # Modify (full record body required — the registry replaces all content)
    updated = client.modify_service(updated_svc)

    # Simple ops (return None on success)
    client.alias_service("10.5239/AAAA-BBBB", target_id="10.5239/CCCC-DDDD")
    client.set_service_parent("10.5239/AAAA-BBBB", parent_id="10.5239/PPPP-QQQQ")
    client.delete_service("10.5239/AAAA-BBBB")

    # service_children returns the parsed envelope
    response = client.service_children("10.5239/AAAA-BBBB")

Party writes

The Superparty gate. Party-administration operations (create_party, modify_party, delete_party, alias_party, activate_party, deactivate_party, change_party_password) are restricted by the EIDR registry to a single hard-coded Party — the "Superparty" with ID 10.5237/superparty. Unlike Service writes, there is no Role mechanism that lets the registry delegate this authority. The SDK enforces this client-side via a strict-by-default gate: invoking any of the seven destructive Party methods with a non-Superparty party_id raises EIDRSDKPolicyError before any HTTP traffic.

Configurable kwargs on Client / AsyncClient:

  • superparty_id: str = "10.5237/superparty" — the ID the gate requires. The default is correct for all current EIDR registries (production, sandbox1, sandbox2, sandbox2-mirror).
  • enforce_superparty_gate: bool = True — set to False to bypass the gate. Useful only for testing the SDK itself or for the unusual case where the caller is the Superparty under a non-default ID.

Read operations (resolve_party, party_query) are never gated — they're open to any caller.

If you bypass the gate (or override the Superparty ID incorrectly), the registry will reject your request server-side with EIDRAuthorizationError. The gate exists to make the failure earlier and clearer — it is not a security boundary, just a usability one.
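
The gate described above is a plain comparison made before any request is built. A minimal sketch of that check follows, assuming the two keyword options behave as documented; PolicyError is a stand-in for EIDRSDKPolicyError and check_superparty_gate is a hypothetical name, so this is not the SDK's code.

```python
class PolicyError(Exception):
    """Stand-in for the SDK's EIDRSDKPolicyError."""


def check_superparty_gate(
    party_id: str,
    *,
    superparty_id: str = "10.5237/superparty",
    enforce: bool = True,
) -> None:
    """Raise before any request is sent if the caller is not the Superparty."""
    if enforce and party_id != superparty_id:
        raise PolicyError(
            f"Party-administration operations require {superparty_id}; "
            f"caller is {party_id}"
        )
```

Failing here costs no network round-trip, which is the usability benefit the gate is meant to provide; the registry's own authorization check still applies when the gate is bypassed.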

from eidr.models.party import PartyRecord

# The Superparty has its own credentials; ordinary clients will
# trip the gate immediately. Only Superparty-credentialed callers
# should use these operations in production.
with Client(registry, superparty_creds) as client:
    new_party = PartyRecord.from_xml(b"""<?xml version="1.0"?>
        <Party xmlns="http://www.eidr.org/schema">
          <PartyName>
            <DisplayName>Example Org</DisplayName>
            <SortName>Example Org</SortName>
          </PartyName>
        </Party>""")
    created = client.create_party(new_party, password="initial-password")

    # Modify a party (no password — use change_party_password for that)
    client.modify_party(updated_party)

    # Activate / deactivate / delete / alias (return None)
    client.activate_party("10.5237/AAAA-BBBB")
    client.deactivate_party("10.5237/AAAA-BBBB")
    client.alias_party("10.5237/AAAA-BBBB", target_id="10.5237/CCCC-DDDD")
    client.delete_party("10.5237/AAAA-BBBB")

    # Change a password (the password traverses the wire in the URL
    # query string — treat the URL as sensitive).
    client.change_party_password("10.5237/AAAA-BBBB", "new-password")

Virtual fields retrieval

vf = client.virtual_fields("10.5240/7791-8534-2C23-9030-8610-5")
# vf.id is always present
# vf.full / vf.self_defined / vf.alias are each str | None,
# carrying serialized record content.
if vf.full is not None:
    full_record = ContentRecord.from_xml(vf.full.encode("utf-8"))

File-based (codec-only) mode

from eidr.codecs import xml, json

# XML bytes in, JSON dict out
json_dict = xml.to_json_dict(xml_bytes)

# JSON dict in, canonical XML bytes out (signing-ready)
xml_bytes = json.to_xml_canonical(json_dict, method="c14n2")
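
Canonical output matters because two documents with the same content can differ byte-for-byte, which breaks hashing and signing. The sketch below demonstrates the effect using only the Python standard library, not the SDK's codec modules: xml.etree.ElementTree.canonicalize implements C14N 2.0, while canonical_json is a hypothetical helper that only approximates RFC 8785 (sorted keys, no insignificant whitespace, UTF-8) and does not cover its number-formatting and key-ordering edge cases. The Record element is an invented placeholder.

```python
import hashlib
import json
from xml.etree.ElementTree import canonicalize

# Same content, different surface form (attribute order, empty-element syntax).
xml_a = '<Record b="2" a="1"/>'
xml_b = '<Record a="1" b="2"></Record>'


def canonical_json(obj: object) -> bytes:
    """Approximate JCS: sorted keys, compact separators, UTF-8 bytes."""
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def digest(data: bytes) -> str:
    """SHA-256 hex digest, as a stand-in for a signing step."""
    return hashlib.sha256(data).hexdigest()


# Both XML spellings canonicalize to the same text, so their digests match.
same_xml = canonicalize(xml_a) == canonicalize(xml_b)
# Key order in the source dict no longer affects the digest.
same_json = digest(canonical_json({"b": 2, "a": 1})) == digest(
    canonical_json({"a": 1, "b": 2})
)
```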

Typed Digital sub-model (requires [digital] extra)

For programmatic inspection or construction of a Manifestation's <Digital> block — the audio / video / subtitle tracks and container-level packaging metadata — use the typed pydantic v2 model exposed at eidr.models.digital. The codec dict layer remains the source of truth; the typed view is transient (each digital_typed() call returns a fresh instance):

from eidr import Client, registries
from eidr.models.digital import (
    DigitalTracks, Track, DigitalAssetAudio,
)
# (also: DigitalAssetAudioLanguageType, etc., re-exported from the
#  full generated tree via `from eidr.models.digital import *`)

with Client(registries.SANDBOX2, creds) as client:
    record = client.resolve("10.5240/...some-manifestation-ID...")

    manifest = record.creation_block  # ManifestationInfo
    typed = manifest.digital_typed()  # → DigitalTracks | None
    if typed is not None:
        for entry in typed.track_or_container:
            if isinstance(entry, Track) and isinstance(
                entry.choice, DigitalAssetAudio
            ):
                print(entry.choice.language.value, entry.choice.type_value)

To write back, build a typed DigitalTracks and call set_digital:

from eidr.models import _digital_generated as gen

new_audio = gen.DigitalAssetAudioDataType(
    type_value="primary",
    language=gen.DigitalAssetAudioLanguageType(value="fr"),
)
new_track = gen.DigitalAssetMetadataType(choice=new_audio)
new_tracks = DigitalTracks(track_or_container=[new_track])

manifest.set_digital(new_tracks)  # codec dict updated under the hood

The raw codec dict at manifest.digital is unchanged from M5 and remains the right tool for whole-block equality comparisons (the common dedup case). The typed view is for programmatic field access where the schema-driven validation pays off.

Performance note: digital_typed() round-trips through XML serialization (codec dict → XML bytes → pydantic via xsdata-pydantic), which adds ~5-15 ms per call depending on manifestation size. For high-frequency Manifestation processing (e.g., walking thousands of records to extract track-level fields), prefer the codec-dict accessors (record.data, manifest.digital), which are pure dict access and ~100× faster. The typed view is the correctness-first interface; the codec dict is the performance-first one. This characteristic is documented in STABILITY.md and won't change without a major-version bump.

Documentation

  • API reference under construction; the source modules carry full docstrings — pydoc eidr.client is a useful starting point.
  • The M6_COVER_LETTER.md in this drop documents the current scope, design findings, and reviewer-attention items in detail.

Relationship to other EIDR tools

  • EIDR Java SDK: the long-standing reference SDK. This Python library targets the same REST API with Python-native ergonomics.
  • eidr-cli (planned): command-line tools built on this library.

Contributing

Issues and pull requests welcome at github.com/EIDR-ID/eidr-python-sdk.

Development setup

The full developer install pulls in every optional extra so all tests, type checks, and example scripts work out of the box:

git clone https://github.com/EIDR-ID/eidr-python-sdk.git
cd eidr-python-sdk
pip install -e '.[client,digital,aws,dev,docs]'

If you skip an extra you'll see import errors when you run the parts of the codebase that need it — for example, [digital] brings in pydantic and xsdata-pydantic, which the typed Digital sub-model and a handful of unit tests require. [client] brings in httpx, which most non-codec tests assume. [dev] brings in the test/lint toolchain (pytest, ruff, mypy).

Test suite

pytest                       # full unit suite (fast, no network)
pytest -m integration        # live-sandbox tests (requires creds)
pytest -m ""                 # everything, including integration

The default pytest invocation skips live-registry tests via -m "not integration" in pyproject.toml. To run them, set three environment variables and invoke with -m integration:

export EIDR_TEST_USER_ID="10.5238/yourusername"
export EIDR_TEST_PARTY_ID="10.5237/AAAA-BBBB"
export EIDR_TEST_PASSWORD="your-sandbox-password"
pytest -m integration
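
A test helper can read those three variables and report whether live tests are runnable at all. The sketch below is one way to do that with the standard library; SandboxCreds and load_sandbox_creds are hypothetical names and this is not the SDK test suite's actual fixture code. Only the variable names come from the commands above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

REQUIRED_VARS = (
    "EIDR_TEST_USER_ID",
    "EIDR_TEST_PARTY_ID",
    "EIDR_TEST_PASSWORD",
)


@dataclass(frozen=True)
class SandboxCreds:
    user_id: str
    party_id: str
    password: str


def load_sandbox_creds(
    env: Mapping[str, str] = os.environ,
) -> Optional[SandboxCreds]:
    """Return credentials if all three variables are set and non-empty."""
    values = [env.get(name, "") for name in REQUIRED_VARS]
    if not all(values):
        return None  # caller can skip the integration tests
    return SandboxCreds(*values)
```

Accepting the environment as a parameter keeps the helper testable without touching the real process environment.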

Integration coverage is intentionally minimal and non-destructive: anonymous resolve, authenticated resolve, status-lookup with a synthetic token (verifies error mapping), one Match (the registry recognizes it as a duplicate of itself, so no state changes), and one ID-only query. No register/modify/delete in the automatic lane — those mutate state and require dedicated test-party infrastructure.

Pipeline gates

ruff check src tests         # lint
ruff format --check src tests  # formatting
mypy src/eidr                # strict type-check
pytest                       # unit tests

License

Apache 2.0 — see LICENSE.

Copyright © 2026 Entertainment Identifier Registry.
