Reference vocabulary and pydantic data model for media cataloguing.

mediavocab

Reference vocabulary and pydantic data model for cataloguing media works: movies, music, books, comics, games, podcasts, audio dramas, radio, sound effects, and ambient soundscapes — all in a single shared schema.

mediavocab is a foundation library. It defines the vocabulary (enums, genre constants) and the structural models (Work, Release, Entity, Credit, Membership, Appearance). Application logic — provider clients, resolvers, playback, UI — lives outside this package.

Install

pip install mediavocab

The only runtime dependency is pydantic>=2. The taxonomy/ and text/ subpackages import nothing beyond the stdlib, so they are safe to vendor in minimal environments.

Quickstart

from mediavocab import MediaType, Work, Release, VariantKind
from mediavocab.helpers import make_movie, make_release
from mediavocab.text import score, work_hash

work = make_movie("Blade Runner", year=1982, runtime=117 * 60.0,
                  director="Ridley Scott")
theatrical = make_release(work, "file:///library/blade-runner/theatrical.mkv")
directors  = make_release(work, "file:///library/blade-runner/directors.mkv",
                          variant_kind=VariantKind.DIRECTORS)

print(work_hash(work))                           # stable identity hash
print(score(work, work))                         # 1.0 (self-match)
print(work.model_dump_json())                    # pydantic JSON

More walked-through examples live in examples/, covering albums, band lineups, radio stations, IoT device routing, work comparison, and the NOT_MEDIA classifier sentinel.

What's in the box

Module	Contents
`mediavocab.taxonomy`	`MediaType`, `VariantKind`, `EntityKind`, `RelationRole`, `CreditSection`, `MembershipStatus`, `ReleaseStatus`, `StreamMode`, `WorkRelationKind`, `PlaybackModality`, plus `GENRE_*` string constants. Zero deps.
`mediavocab.models`	`Work`, `Release`, `Appearance`, `WorkRelation`, `ReleaseRelation`, `Entity`, `EntityRef`, `Membership`, `Credit`, `Programme`, `Schedule`, `License`. Pydantic v2.
`mediavocab.text`	Normalisation, fuzzy matching, work comparison/scoring, ISO 639/3166 helpers. Stdlib only.
`mediavocab.helpers`	Convenience builders and classifier predicates. Non-normative.

Design highlights

A type earns its place by changing the schema. SOUND_EFFECT, AMBIENT_SOUNDS, AUDIO_DRAMA, MUSIC_VIDEO, etc. are each catalogued against different external databases or matched with different runtime tolerances.
Devices are entities, not works. EntityKind.DEVICE represents physical playback endpoints (smart speakers, smart plugs, cast targets). The Work keeps its own type (RADIO, MOVIE, MUSIC, and so on); the device is only where the consumer routes playback.
NOT_MEDIA is a terminal sentinel for the classifier — distinct from GENERIC, which is a transient "type unknown, may resolve" state.
Work is canonical, Release is the manifestation. A director's cut is a different Release of the same Work. A bootleg is a different Release of the same Work. The Work's identity hash never depends on Release metadata.
PlaybackModality is orthogonal to MediaType. The modality (AUDIO, VIDEO, TEXT, or INTERACTIVE) routes resolver dispatch by playback intent. A Signals(modality=AUDIO) query never touches video-only providers, even if medium=GENERIC. Declare modality: ClassVar[Set[PlaybackModality]] on each provider; an empty set means the provider is universal.
Genre is a free List[str] with canonical spellings in mediavocab.taxonomy.genre. ASMR, ambient, anime, adult, etc. are genre tags applied across multiple media types — not types of their own.
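The modality rule above can be sketched with stand-in classes. This is an illustration under stated assumptions, not the resolver's actual code: `PlaybackModality` here is a local stand-in for the mediavocab enum, `Signals` is reduced to a single field, and the provider classes and the `eligible` function are hypothetical. Treating a missing modality as "no filtering" is also an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import ClassVar, List, Optional, Set, Type


class PlaybackModality(Enum):  # local stand-in for the mediavocab enum of the same name
    AUDIO = "audio"
    VIDEO = "video"
    TEXT = "text"
    INTERACTIVE = "interactive"


@dataclass
class Signals:  # hypothetical query object; only `modality` is modelled
    modality: Optional[PlaybackModality] = None


class Provider:
    # An empty set means the provider is universal
    modality: ClassVar[Set[PlaybackModality]] = set()


class VideoOnlyProvider(Provider):
    modality = {PlaybackModality.VIDEO}


class AudioProvider(Provider):
    modality = {PlaybackModality.AUDIO}


class UniversalProvider(Provider):
    pass  # inherits the empty set


def eligible(providers: List[Type[Provider]], signals: Signals) -> List[Type[Provider]]:
    """Keep providers that are universal or that declare the requested modality."""
    if signals.modality is None:  # assumption: no declared intent means no filtering
        return list(providers)
    return [p for p in providers if not p.modality or signals.modality in p.modality]


PROVIDERS = [VideoOnlyProvider, AudioProvider, UniversalProvider]
audio_query = Signals(modality=PlaybackModality.AUDIO)
print([p.__name__ for p in eligible(PROVIDERS, audio_query)])
# ['AudioProvider', 'UniversalProvider']
```

The filter looks only at the declared modality, never at the media type, which is what keeps the two axes orthogonal.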

See docs/ for full reference and pattern guides.
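The Work/Release split described in the design highlights can be illustrated with plain dataclasses. This is a conceptual sketch only: the `Work` and `Release` classes below are simplified stand-ins rather than the pydantic models, and `identity_hash` is a hypothetical function that does not reproduce the algorithm behind `mediavocab.text.work_hash`. The point it shows is that the hash is derived from Work fields alone, so every Release of a Work shares it.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Work:  # simplified stand-in, not the mediavocab model
    title: str
    year: int
    media_type: str


@dataclass(frozen=True)
class Release:  # simplified stand-in, not the mediavocab model
    work: Work
    uri: str
    variant_kind: str = "theatrical"


def identity_hash(work: Work) -> str:
    """Hash only Work-level fields; Release metadata never enters the key."""
    key = "|".join([work.media_type, work.title.casefold().strip(), str(work.year)])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


work = Work("Blade Runner", 1982, "movie")
theatrical = Release(work, "file:///library/blade-runner/theatrical.mkv")
directors = Release(work, "file:///library/blade-runner/directors.mkv",
                    variant_kind="directors")

# Both releases point at the same Work, so they resolve to the same identity
print(identity_hash(theatrical.work) == identity_hash(directors.work))  # True
```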

Workspace position

mediavocab sits at the bottom of the stack. Every other package in this workspace depends on it:

                          mediavocab
                              ▲
        ┌───────────┬─────────┼─────────┬───────────┐
        │           │         │         │           │
      tutubo   pyfanedit   pymetal   pyo*…       py_bandcamp / nuvem-de-som
        ▲           ▲         ▲                       ▲
        └────────┬──┴─────────┴───────────────────────┘
                 │
              metadatarr  ◄── canonical resolver, ships every provider above
                 ▲
                 │
           media-archivist  ◄── source-DB orchestrator + sidecars + CLI/server

mediavocab: vocabulary + structural models (this package).
tutubo, pyfanedit, pymetal, py_bandcamp, nuvem_de_som, radiosoma, tunein, audiobooker: API clients / scrapers. Each emits mediavocab.Work / Release / Entity directly.
metadatarr: cross-source resolver framework. Bundles every first-party scraper as a hard runtime dep (no extras juggling) and ships ~24 providers under metadatarr.resolve.providers.
media-archivist: local source-DB indexer / canonicalizer / CLI / web server. Consumes metadatarr's resolver.
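The layering above amounts to a dependency graph, which can be written down and ordered with the standard library. This is a rough sketch based only on the package names listed here; the exact dependency edges are assumptions read off the diagram, not taken from any package metadata.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# API clients / scrapers named in the list above
SCRAPERS = ["tutubo", "pyfanedit", "pymetal", "py_bandcamp",
            "nuvem_de_som", "radiosoma", "tunein", "audiobooker"]

# Map each package to the set of packages it depends on (assumed edges)
depends_on = {name: {"mediavocab"} for name in SCRAPERS}
depends_on["metadatarr"] = set(SCRAPERS) | {"mediavocab"}
depends_on["media-archivist"] = {"metadatarr", "mediavocab"}

# static_order() yields every package after all of its dependencies
order = list(TopologicalSorter(depends_on).static_order())
print(order[0], "...", order[-1])  # mediavocab ... media-archivist
```

Any valid install or build order must start with mediavocab and end with media-archivist; the scrapers in between can come in any order.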

Testing

pip install -e ".[test]"
pytest -q

License

Apache 2.0. See LICENSE.

Release history

0.1.1: May 7, 2026
0.1.1a1 (pre-release, this version): May 7, 2026

