
Schema Annotations for Linked Avro Data (SALAD)

Schema Salad

Salad is a schema language for describing JSON or YAML structured linked data documents. A Salad schema defines the rules for preprocessing, structural validation, and hyperlink checking of the documents it describes. Salad supports rich data modeling with inheritance, template specialization, object identifiers, object references, documentation generation, code generation, and transformation to RDF. Salad provides a bridge between document and record oriented data modeling and the Semantic Web.

The Schema Salad library is Python 3.6+ only.

Installation

pip3 install schema_salad

If you intend to use the schema-salad-tool --codegen=python feature, please include the [pycodegen] extra:

pip3 install schema_salad[pycodegen]

To install from source:

git clone https://github.com/common-workflow-language/schema_salad
cd schema_salad
pip3 install .
# or pip3 install .[pycodegen] if needed

Commands

Schema Salad can be used as a command line tool or imported as a Python module:

$ schema-salad-tool
usage: schema-salad-tool [-h] [--rdf-serializer RDF_SERIALIZER] [--skip-schemas]
                      [--strict-foreign-properties] [--print-jsonld-context]
                      [--print-rdfs] [--print-avro] [--print-rdf] [--print-pre]
                      [--print-index] [--print-metadata] [--print-inheritance-dot]
                      [--print-fieldrefs-dot] [--codegen language] [--codegen-target CODEGEN_TARGET]
                      [--codegen-examples directory] [--codegen-package dotted.package]
                      [--codegen-copyright copyright_string] [--print-oneline]
                      [--print-doc] [--strict | --non-strict]
                      [--verbose | --quiet | --debug] [--only ONLY] [--redirect REDIRECT]
                      [--brand BRAND] [--brandlink BRANDLINK] [--brandstyle BRANDSTYLE]
                      [--brandinverse] [--primtype PRIMTYPE] [--version]
                      [schema] [document]

$ python
>>> import schema_salad

Validate a schema:

$ schema-salad-tool myschema.yml

Validate a document using a schema:

$ schema-salad-tool myschema.yml mydocument.yml

Generate HTML documentation:

$ schema-salad-tool --print-doc myschema.yml > myschema.html
$ # or
$ schema-salad-doc myschema.yml > myschema.html

Get JSON-LD context:

$ schema-salad-tool --print-jsonld-context myschema.yml mydocument.yml

Convert a document to JSON-LD:

$ schema-salad-tool --print-pre myschema.yml mydocument.yml > mydocument.jsonld

Generate Python classes for loading/generating documents described by the schema (requires the [pycodegen] extra):

$ schema-salad-tool --codegen=python myschema.yml > myschema.py

Display the inheritance relationships between classes as a Graphviz ‘dot’ file and render it as SVG:

$ schema-salad-tool --print-inheritance-dot myschema.yml | dot -Tsvg > myschema.svg
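
The commands above can also be driven from a Python script. The following is a minimal sketch, not part of Schema Salad itself: the helper names `salad_command` and `run_salad` are hypothetical, and the only flags it passes are ones shown in the usage text above. It shells out to the command line tool rather than using the library's Python API.

```python
import shutil
import subprocess
from typing import List, Optional


def salad_command(schema: str, document: Optional[str] = None,
                  flag: Optional[str] = None) -> List[str]:
    """Build an argv list for schema-salad-tool (hypothetical helper)."""
    argv = ["schema-salad-tool"]
    if flag:
        argv.append(flag)          # e.g. "--print-doc" or "--print-pre"
    argv.append(schema)
    if document:
        argv.append(document)
    return argv


def run_salad(argv: List[str]) -> Optional[subprocess.CompletedProcess]:
    """Run the command if the tool is on PATH; return None otherwise."""
    if shutil.which(argv[0]) is None:
        return None
    # stdout/stderr are captured as text so the caller can inspect them.
    return subprocess.run(argv, stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE,
                          universal_newlines=True, check=False)


# Equivalent to: schema-salad-tool myschema.yml mydocument.yml
validate_argv = salad_command("myschema.yml", "mydocument.yml")
```

Calling `run_salad(validate_argv)` returns the completed process, whose `returncode` and `stdout` can then be checked by the calling script.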

Quick Start

Let’s say you have a ‘basket’ record that can contain items measured either by weight or by count. Here’s an example:

basket:
  - product: bananas
    price: 0.39
    per: pound
    weight: 1
  - product: cucumbers
    price: 0.79
    per: item
    count: 3

We want to validate that all the expected fields are present, that the measurement is known, and that “count” is not a fractional value. Here is an example schema to do that:

- name: Product
  doc: |
    The base type for a product.  This is an abstract type, so it
    can't be used directly, but can be used to define other types.
  type: record
  abstract: true
  fields:
    product: string
    price: float

- name: ByWeight
  doc: |
    A product, sold by weight.  Products may be sold by pound or by
    kilogram.  Weights may be fractional.
  type: record
  extends: Product
  fields:
    per:
      type:
        type: enum
        symbols:
          - pound
          - kilogram
      jsonldPredicate: '#per'
    weight: float

- name: ByCount
  doc: |
    A product, sold by count.  The count must be an integer value.
  type: record
  extends: Product
  fields:
    per:
      type:
        type: enum
        symbols:
          - item
      jsonldPredicate: '#per'
    count: int

- name: Basket
  doc: |
    A basket of products.  The 'documentRoot' field indicates it is a
    valid starting point for a document.  The 'basket' field will
    validate subtypes of 'Product' (ByWeight and ByCount).
  type: record
  documentRoot: true
  fields:
    basket:
      type:
        type: array
        items: Product

You can check the schema and document in schema_salad/tests/basket_schema.yml and schema_salad/tests/basket.yml:

$ schema-salad-tool basket_schema.yml basket.yml
Document `basket.yml` is valid
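
To make the three checks concrete, here is a hand-written Python sketch of what the example schema asks for: required fields, a known "per" symbol, and an integer "count". It is an illustration only and is not how Schema Salad implements validation. The names `RECORD_TYPES`, `matches`, `record_errors`, and `validate_basket` are hypothetical, unknown extra fields are ignored, and integers are assumed to be acceptable where a float is expected (as with "weight: 1" in the example document).

```python
from typing import Any, Dict, List

# Fields inherited from the abstract 'Product' record.
PRODUCT_FIELDS = {"product": str, "price": float}

# Concrete subtypes: inherited fields plus their own. A set stands for an enum.
RECORD_TYPES = {
    "ByWeight": {**PRODUCT_FIELDS, "per": {"pound", "kilogram"}, "weight": float},
    "ByCount": {**PRODUCT_FIELDS, "per": {"item"}, "count": int},
}


def matches(value: Any, expected: Any) -> bool:
    """Check one value against a primitive type or a set of enum symbols."""
    if isinstance(expected, set):
        return isinstance(value, str) and value in expected
    if isinstance(value, bool):        # bool is a subclass of int in Python
        return False
    if expected is float:              # assumption: ints are accepted as floats
        return isinstance(value, (int, float))
    return isinstance(value, expected)


def record_errors(item: Dict[str, Any], fields: Dict[str, Any]) -> List[str]:
    """List the problems found when reading 'item' as one record type."""
    errors = []
    for name, expected in fields.items():
        if name not in item:
            errors.append(f"missing field '{name}'")
        elif not matches(item[name], expected):
            errors.append(f"invalid value for '{name}': {item[name]!r}")
    return errors


def validate_basket(doc: Dict[str, Any]) -> List[str]:
    """Return error strings for a basket document; an empty list means valid."""
    items = doc.get("basket")
    if not isinstance(items, list):
        return ["missing or non-array field 'basket'"]
    errors = []
    for index, item in enumerate(items):
        # An item is valid if it satisfies at least one concrete subtype.
        if not any(not record_errors(item, f) for f in RECORD_TYPES.values()):
            errors.append(f"basket[{index}] is neither ByWeight nor ByCount")
    return errors


basket = {"basket": [
    {"product": "bananas", "price": 0.39, "per": "pound", "weight": 1},
    {"product": "cucumbers", "price": 0.79, "per": "item", "count": 3},
]}
print(validate_basket(basket))  # → []
```

Changing the cucumber count to 2.5, or its "per" value to an unlisted symbol, makes `validate_basket` report an error for that item.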

Documentation

See the specification and the metaschema (salad schema for itself). For an example application of Schema Salad see the Common Workflow Language.

Rationale

The JSON data model is a popular way to represent structured data. It is attractive because of its relative simplicity and is a natural fit with the standard types of many programming languages. However, this simplicity comes at a cost: basic JSON lacks expressive features useful for working with complex data structures and document formats, such as schemas, object references, and namespaces.

JSON-LD is a W3C standard providing a way to describe how to interpret a JSON document as Linked Data by means of a “context”. JSON-LD provides a powerful solution for representing object references and namespaces in JSON based on standard web URIs, but is not itself a schema language. Without a schema providing a well defined structure, it is difficult to process an arbitrary JSON-LD document as idiomatic JSON because there are many ways to express the same data that are logically equivalent but structurally distinct.
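
As a toy illustration of "logically equivalent but structurally distinct", the sketch below maps short keys to URIs through a context dictionary. It is a deliberate simplification and not the JSON-LD expansion algorithm, which does considerably more; the function name `expand_keys` and the example.com URIs are hypothetical.

```python
from typing import Any, Dict


def expand_keys(doc: Dict[str, Any], context: Dict[str, str]) -> Dict[str, Any]:
    """Replace each top-level key with the URI the context maps it to, if any."""
    return {context.get(key, key): value for key, value in doc.items()}


# Hypothetical context: short terms on the left, full URIs on the right.
context = {
    "product": "http://example.com/vocab#product",
    "price": "http://example.com/vocab#price",
}

# Two documents with different keys that describe the same data.
doc_short = {"product": "bananas", "price": 0.39}
doc_full = {
    "http://example.com/vocab#product": "bananas",
    "http://example.com/vocab#price": 0.39,
}

same = expand_keys(doc_short, context) == expand_keys(doc_full, context)
```

Here `same` is `True`: both documents carry the same information once interpreted through the context, yet code that reads them as plain JSON would have to handle two different sets of keys.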

Several schema languages exist for describing and validating JSON data, such as JSON Schema and the Apache Avro data serialization system; however, none of them understands linked data. As a result, to take full advantage of JSON-LD to build the next generation of linked data applications, one must maintain a separate JSON schema, JSON-LD context, RDF schema, and human documentation, despite the significant overlap of content and the obvious need for these documents to stay synchronized.

Schema Salad is designed to address this gap. It provides a schema language and processing rules for describing structured JSON content permitting URI resolution and strict document validation. The schema language supports linked data through annotations that describe the linked data interpretation of the content, enables generation of JSON-LD context and RDF schema, and production of RDF triples by applying the JSON-LD context. The schema language also provides for robust support of inline documentation.



