
One data model, many formats: read, validate, and write JSON, YAML, TOML, XML, and OML.


Omnist


Omnist ("omni-structure") is one canonical data model for JSON, YAML, TOML, XML, and its own native OML (Omnist Markup Language) — read any of them into a single tree, validate it against a schema, compare schema versions, and write it back out to any of the others.

from omnist import parse_schema, doc

s = parse_schema('''
    record Member { "name": string, "role": string }
    record Team   { "name": string, "members" [1,]: Member }
    root Team
''')

s.validate(doc({"name": "Platform",
                "members": [{"name": "Ann", "role": "dev"}]})).ok    # True

Why Omnist

If your service handles config or payloads in more than one format, you usually get a different library — and a different mental model — for each. Omnist gives you one model and one schema language over it, grounded in a small, self-contained formal model (inspired by Lee & Cheung, CIKM 2010):

  • A Document is a tree — an ordered list of labeled edges. Arrays are just repeated labels, so the same Document represents JSON, YAML, TOML, XML (including its interleaved repeated elements), and OML — Omnist's own format, the only one with zero loss in either direction.
  • A Schema is named record definitions (closed named fields, each with a cardinality), where every field's type is always exactly one fixed scalar (optionally nullable) or one Ref to a named record — referenced by name for reuse and recursion. Validate a Document, compare two schemas for backward-compatibility, or infer a schema from examples.
  • Restrictive by default — a schema guarantees structure; there are no structureless escape hatches, and scalar types are never composed.
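To make the "arrays are just repeated labels" idea concrete, here is a minimal standard-library sketch. It is not Omnist's implementation: the helper name `to_edges` is hypothetical, and the sketch ignores nested arrays and silently drops empty ones.

```python
from typing import Any


def to_edges(value: Any) -> Any:
    """Turn a JSON-like value into a nested, ordered list of (label, child) edges."""
    if isinstance(value, dict):
        edges = []
        for label, child in value.items():
            if isinstance(child, list):
                # An array becomes one edge per item, all sharing the same label.
                edges.extend((label, to_edges(item)) for item in child)
            else:
                edges.append((label, to_edges(child)))
        return edges
    return value  # scalars are leaves and stay as they are


print(to_edges({"id": 1, "tags": ["a", "b"]}))
# [('id', 1), ('tags', 'a'), ('tags', 'b')]
```

Because order is kept and labels may repeat, the same shape can also hold XML-style interleaved elements, which a plain dict cannot.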

The model is defined formally in docs/design/model.md; see the quickstart for the shortest possible example, or the user guide for the practical tour.
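As a rough illustration of what "closed named fields, each with a cardinality" means in practice, the sketch below checks a flat edge list against a hand-written table. The table layout, the `validate` helper, and the message wording are assumptions made for this example; Omnist's own schema objects and error text may differ.

```python
from collections import Counter

# Hypothetical mini-schema: label -> (Python type, min count, max count or None for unbounded)
SCHEMA = {
    "id": (int, 1, 1),
    "tags": (str, 0, None),
}


def validate(edges, schema):
    """Return a list of error strings; an empty list means the edges conform."""
    errors = []
    counts = Counter(label for label, _ in edges)
    for label, value in edges:
        if label not in schema:
            # Records are closed: a label the schema does not name is an error.
            errors.append(f"at $.{label}: unexpected field")
            continue
        expected = schema[label][0]
        # bool is a subclass of int in Python, so reject it explicitly for int fields.
        wrong_bool = expected is int and isinstance(value, bool)
        if wrong_bool or not isinstance(value, expected):
            errors.append(
                f"at $.{label}: expected {expected.__name__}, "
                f"got {type(value).__name__} ({value!r})"
            )
    for label, (_, lo, hi) in schema.items():
        n = counts[label]
        if n < lo or (hi is not None and n > hi):
            bound = f"[{lo},{'' if hi is None else hi}]"
            errors.append(f"at $.{label}: expected {bound} occurrences, got {n}")
    return errors


print(validate([("id", "x"), ("tags", "a")], SCHEMA))
# ["at $.id: expected int, got str ('x')"]
```

Counting labels is all it takes to check cardinality, since an array is simply a label that occurs more than once.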

A 60-second tour

from omnist import Doc, parse_schema, infer, doc, read_json

# OML is omnist's own format -- see docs/formats/oml.md
Doc.from_oml('id: 1\ntags: "a"\ntags: "b"\n').to_oml()

# converting between formats is just: read one, write another
Doc.from_json('{"id": 1, "tags": ["a", "b"]}').to_yaml()

# describe a shape and check data against it; errors carry exact paths
s = parse_schema('record R { "id": integer, "tags" [0,]: string }\nroot R')
print(s.validate(doc({"id": "x", "tags": ["a"]})))
#   invalid:
#     at $.id: expected integer, got string ('x')

# learn a schema from examples
print(infer([doc({"id": 1, "tags": ["a"]})]).to_dsl())
#   record Root {
#       "id": integer,
#       "tags": string,
#   }
#   root Root

# is a schema change backward-compatible? (operations are Schema methods)
v1 = parse_schema('record R { "host": string }\nroot R')
v2 = parse_schema('record R { "host": string, "port" [0,1]: integer }\nroot R')
v1.compatible_with(v2)        # True -- adding an optional field is safe

# schema-directed deserialization: upgrade leaves to match the schema
s2 = parse_schema('record R { "d": date }\nroot R')
read_json('{"d": "2024-01-01"}', schema=s2)   # [('d', datetime.date(2024, 1, 1))]

Run the full demo: python3 examples/canonical_model.py.
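The compatibility check in the tour can be pictured with a much simpler rule for flat records. The sketch below assumes that "backward-compatible" means every document valid under the old schema is also valid under the new one, and it handles only scalar fields; the function names and the tuple layout are hypothetical, and this is not the algorithm Omnist uses.

```python
# Hypothetical flat schema: label -> (type name, min count, max count or None for unbounded)
ABSENT = ("", 0, 0)  # a label the (closed) record does not mention: exactly zero occurrences


def interval_contains(new_lo, new_hi, old_lo, old_hi):
    """True if [old_lo, old_hi] lies inside [new_lo, new_hi]; None means unbounded."""
    if new_lo > old_lo:
        return False
    if new_hi is None:
        return True
    return old_hi is not None and old_hi <= new_hi


def old_docs_still_valid(old, new):
    """True if every document accepted by `old` is also accepted by `new`."""
    for label in old.keys() | new.keys():
        o_type, o_lo, o_hi = old.get(label, ABSENT)
        n_type, n_lo, n_hi = new.get(label, ABSENT)
        if not interval_contains(n_lo, n_hi, o_lo, o_hi):
            return False
        # Types only matter when the old schema can actually produce the field.
        if o_hi != 0 and o_type != n_type:
            return False
    return True


v1 = {"host": ("string", 1, 1)}
v2 = {"host": ("string", 1, 1), "port": ("integer", 0, 1)}
print(old_docs_still_valid(v1, v2))  # True: adding an optional field is safe
print(old_docs_still_valid(v2, v1))  # False: old documents may carry "port"
```

Treating a missing field as the interval [0,0] lets one containment test cover added, removed, and changed fields alike.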

Installation

Requires Python 3.11+ (it uses the standard-library tomllib). The core and JSON support have no dependencies.

pip install omnist                      # core + JSON
pip install "omnist[all]"               # + pyyaml, tomli_w, defusedxml (quoted so shells like zsh don't expand the brackets)

Or from a checkout:

git clone https://github.com/tomlee/omnist.git
cd omnist
python3 -m venv .venv && source .venv/bin/activate
pip install .                    # core + JSON
pip install pyyaml tomli_w defusedxml   # YAML / writing TOML / hardened XML
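To see which of the optional packages are importable in the current environment, a small standard-library check like the one below can help. The helper name `environment_report` is made up for this example; `yaml`, `tomli_w`, and `defusedxml` are the import names of the three packages listed above.

```python
import sys
from importlib.util import find_spec

# Import name -> what the install notes above say it is used for
OPTIONAL = {
    "yaml": "YAML",
    "tomli_w": "writing TOML",
    "defusedxml": "hardened XML",
}


def environment_report():
    """Map each requirement to True/False without importing the packages."""
    report = {"python>=3.11": sys.version_info >= (3, 11)}
    for module, purpose in OPTIONAL.items():
        report[f"{module} ({purpose})"] = find_spec(module) is not None
    return report


for name, ok in environment_report().items():
    print(f"{'ok' if ok else 'missing':8} {name}")
```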

Documentation

Full index: docs/.

  • Quickstart — the shortest possible example: one OML snippet, one schema, validate(), infer().
  • User guide — the practical tour: documents, OML (the native format), the schema DSL, the Python builder, validation, operations, other codecs, inference.
  • OML — Omnist's own format, designed alongside the model so every Document round-trips with zero adjustments, and how it maps onto the Python Document.
  • The Schema model & DSL — Omnist's other central feature: record definitions, cardinality, the Python builder, and the comparison/inference operations.
  • API reference — every public name, with signatures.
  • Schema-directed deserialization — what changes (and what doesn't) about a Document's Python types when a schema is, vs. isn't, passed to a reader.
  • A real-life example — one order schema validated against documents in JSON, YAML, TOML, and XML, plus a compatibility check.
  • Formats — how each format maps to the model and its caveats (OML · JSON · YAML · TOML · XML).
  • Model spec — the formal Document and Schema models, self-contained and plain (no paper required).
  • Glossary — one definition per term used across the docs and code, grouped by concept area.
  • Testing — the test suite layout, coverage tooling and target, the fuzzing approach, and what CI runs.
  • Repo layout — how the repo itself is organized: omnist/canonical/*.py module responsibilities, the docs page map, and the test file map.

Status

Omnist is alpha (v0.2.0), built around a small, self-contained formalism; the public API may still change before a stable release.

Feedback and bug reports are welcome: https://github.com/tomlee/omnist/issues. See SECURITY.md for the trust model if you parse untrusted input.

License

Apache-2.0 — see LICENSE and NOTICE.

Background

The model is inspired by Lee & Cheung, "XML Schema Computations: Schema Compatibility Testing and Subschema Extraction" (CIKM 2010), simplified for the JSON family of formats. You don't need the paper to use Omnist — the model spec is self-contained.
