Data Contract CLI

The datacontract CLI is an open-source command-line tool for working with data contracts. It natively supports the Open Data Contract Standard to lint data contracts, connect to data sources and execute schema and quality tests, and export to different formats. The tool is written in Python. It can be used as a standalone CLI tool, in a CI/CD pipeline, or directly as a Python library.

📖 Full documentation: docs.datacontract.com

Quick links: Quickstart · Commands · Best Practices · Custom Export and Import · Development Setup

Getting started

Let's look at this data contract: https://datacontract.com/orders-v1.odcs.yaml

The contract has a servers section with the endpoint details of a Postgres database, a schema section that defines the structure and semantics of the data, and service levels and quality attributes that describe the expected freshness and number of rows.

This data contract contains all the information needed to connect to the database and check that the actual data meets the defined schema specification and quality expectations. We can use this information to test whether the actual data product complies with the data contract.

Let's use uv to install the CLI (or use the Docker image):

$ uv tool install --python python3.11 --upgrade 'datacontract-cli[all]'

Now, let's run the tests:

$ export DATACONTRACT_POSTGRES_USERNAME=datacontract_cli.egzhawjonpfweuutedfy
$ export DATACONTRACT_POSTGRES_PASSWORD=jio10JuQfDfl9JCCPdaCCpuZ1YO
$ datacontract test https://datacontract.com/orders-v1.odcs.yaml

# returns:
Testing https://datacontract.com/orders-v1.odcs.yaml
Server: production (type=postgres, host=aws-1-eu-central-2.pooler.supabase.com, port=6543, database=postgres, schema=dp_orders_v1)
╭────────┬──────────────────────────────────────────────────────────┬─────────────────────────┬─────────╮
│ Result │ Check                                                    │ Field                   │ Details │
├────────┼──────────────────────────────────────────────────────────┼─────────────────────────┼─────────┤
│ passed │ Check that field 'line_item_id' is present               │ line_items.line_item_id │         │
│ passed │ Check that field line_item_id has type UUID              │ line_items.line_item_id │         │
│ passed │ Check that field line_item_id has no missing values      │ line_items.line_item_id │         │
│ passed │ Check that field 'order_id' is present                   │ line_items.order_id     │         │
│ passed │ Check that field order_id has type UUID                  │ line_items.order_id     │         │
│ passed │ Check that field 'price' is present                      │ line_items.price        │         │
│ passed │ Check that field price has type INTEGER                  │ line_items.price        │         │
│ passed │ Check that field price has no missing values             │ line_items.price        │         │
│ passed │ Check that field 'sku' is present                        │ line_items.sku          │         │
│ passed │ Check that field sku has type TEXT                       │ line_items.sku          │         │
│ passed │ Check that field sku has no missing values               │ line_items.sku          │         │
│ passed │ Check that field 'customer_id' is present                │ orders.customer_id      │         │
│ passed │ Check that field customer_id has type TEXT               │ orders.customer_id      │         │
│ passed │ Check that field customer_id has no missing values       │ orders.customer_id      │         │
│ passed │ Check that field 'order_id' is present                   │ orders.order_id         │         │
│ passed │ Check that field order_id has type UUID                  │ orders.order_id         │         │
│ passed │ Check that field order_id has no missing values          │ orders.order_id         │         │
│ passed │ Check that unique field order_id has no duplicate values │ orders.order_id         │         │
│ passed │ Check that field 'order_status' is present               │ orders.order_status     │         │
│ passed │ Check that field order_status has type TEXT              │ orders.order_status     │         │
│ passed │ Check that field 'order_timestamp' is present            │ orders.order_timestamp  │         │
│ passed │ Check that field order_timestamp has type TIMESTAMPTZ    │ orders.order_timestamp  │         │
│ passed │ Check that field 'order_total' is present                │ orders.order_total      │         │
│ passed │ Check that field order_total has type INTEGER            │ orders.order_total      │         │
│ passed │ Check that field order_total has no missing values       │ orders.order_total      │         │
╰────────┴──────────────────────────────────────────────────────────┴─────────────────────────┴─────────╯
🟢 data contract is valid. Run 25 checks. Took 3.938887 seconds.

Voilà, the CLI tested that the YAML itself is valid, all records comply with the schema, and all quality attributes are met.
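The same test run can be scripted, for example from a scheduler. The following is a minimal sketch, not part of the CLI: the helper names `build_test_command`, `build_env`, and `run_contract_test` are hypothetical, and it assumes the `datacontract` executable is on the PATH and that the CLI exits with a non-zero status when checks fail (verify this for your version). Only the `datacontract test <contract>` invocation and the two environment variable names come from the example above.

```python
import os
import subprocess
from typing import Dict, List, Optional


def build_test_command(contract: str) -> List[str]:
    """Return the argv for `datacontract test <contract>`."""
    return ["datacontract", "test", contract]


def build_env(username: str, password: str,
              base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Copy the environment and add the Postgres credentials used above."""
    env = dict(os.environ if base is None else base)
    env["DATACONTRACT_POSTGRES_USERNAME"] = username
    env["DATACONTRACT_POSTGRES_PASSWORD"] = password
    return env


def run_contract_test(contract: str, username: str, password: str) -> int:
    """Run the test command and return the process exit code.

    Assumption: a non-zero exit code signals failed checks.
    """
    completed = subprocess.run(
        build_test_command(contract),
        env=build_env(username, password),
        check=False,  # let the caller decide how to react to failures
    )
    return completed.returncode
```

Passing the credentials through `env` keeps them out of the command line, where they could show up in process listings.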

We can also use the data contract metadata to export in many formats, e.g., to generate a SQL DDL:

$ datacontract export sql https://datacontract.com/orders-v1.odcs.yaml

# returns:
-- Data Contract: orders
-- SQL Dialect: postgres
CREATE TABLE orders (
  order_id None not null primary key,
  customer_id text not null,
  order_total integer not null,
  order_timestamp None,
  order_status text
);
CREATE TABLE line_items (
  line_item_id None not null primary key,
  sku text not null,
  price integer not null,
  order_id None
);

Or generate an HTML export:

$ datacontract export html --output orders-v1.odcs.html https://datacontract.com/orders-v1.odcs.yaml
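
To export one contract to several formats in a single step, the commands can be generated programmatically. This is a sketch under assumptions: `export_commands` and the `EXTENSIONS` mapping are hypothetical names chosen for illustration, the command shape follows the HTML example above, and it assumes `--output` is accepted for every format listed here.

```python
from pathlib import Path
from typing import List

# Output file extension per export format (illustrative choice, not defined by the CLI).
EXTENSIONS = {"html": ".html", "sql": ".sql", "jsonschema": ".json"}


def export_commands(contract: str, out_dir: str, formats: List[str]) -> List[List[str]]:
    """Build one `datacontract export` argv per requested format."""
    # "orders-v1.odcs.yaml" -> "orders-v1"
    stem = Path(contract).name.split(".")[0]
    commands = []
    for fmt in formats:
        target = Path(out_dir) / f"{stem}{EXTENSIONS[fmt]}"
        commands.append(
            ["datacontract", "export", fmt, "--output", str(target), contract]
        )
    return commands
```

Each returned list can be passed to `subprocess.run(...)` as is.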

Usage

# create a new data contract from example and write it to odcs.yaml
$ datacontract init odcs.yaml

# edit the data contract in the Data Contract Editor (web UI)
$ datacontract edit odcs.yaml

# lint the odcs.yaml and stop after the first validation error (default).
$ datacontract lint odcs.yaml

# show a changelog between two data contracts
$ datacontract changelog v1.odcs.yaml v2.odcs.yaml

# execute schema and quality checks (define credentials as environment variables)
$ datacontract test odcs.yaml

# generate dbt tests from a contract into your dbt project, then run them
# (omit the contract to sync/test every *.odcs.yaml in the project)
$ datacontract dbt sync orders.odcs.yaml --project-dir ./warehouse
$ datacontract dbt test orders.odcs.yaml --project-dir ./warehouse

# export data contract as html (other formats: avro, dbt-models, dbt-sources, dbt-staging-sql, jsonschema, odcs, rdf, sql, sodacl, terraform, ...)
$ datacontract export html odcs.yaml --output odcs.html

# import sql (other formats: avro, glue, bigquery, jsonschema, excel ...)
$ datacontract import sql --source my-ddl.sql --dialect postgres --output odcs.yaml

# import from Excel template
$ datacontract import excel --source odcs.xlsx --output odcs.yaml

# export to Excel template  
$ datacontract export excel --output odcs.xlsx odcs.yaml

Programmatic (Python)

from datacontract.data_contract import DataContract

data_contract = DataContract(data_contract_file="odcs.yaml")
run = data_contract.test()
if not run.has_passed():
    print("Data quality validation failed.")
    # Abort pipeline, alert, or take corrective actions...
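
In a pipeline step, the test result usually has to become a process exit code. The following is a minimal sketch of that mapping. It relies only on the `has_passed()` method shown above; `RunLike`, `exit_code_for`, and `gate` are hypothetical names, and the `Protocol` is a stand-in so the snippet runs without the library installed.

```python
import sys
from typing import Protocol


class RunLike(Protocol):
    """Anything that offers the `has_passed()` method used in the example above."""

    def has_passed(self) -> bool: ...


def exit_code_for(run: RunLike) -> int:
    """Map a test run to a process exit code: 0 on success, 1 on failure."""
    return 0 if run.has_passed() else 1


def gate(run: RunLike) -> None:
    """Stop the current pipeline step with a non-zero exit code if the run failed."""
    code = exit_code_for(run)
    if code != 0:
        print("Data quality validation failed.", file=sys.stderr)
        sys.exit(code)
```

With the library installed, this would be called as `gate(DataContract(data_contract_file="odcs.yaml").test())`.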

How to

Installation

Choose the most appropriate installation method for your needs:

uv

The preferred way to install is uv:

uv tool install --python python3.11 --upgrade 'datacontract-cli[all]'

uvx

If you have uv installed, you can run datacontract-cli directly without installing:

uv run --with 'datacontract-cli[all]' datacontract --version

pip

Python 3.10, 3.11, and 3.12 are supported. We recommend using Python 3.11.

python3 -m pip install 'datacontract-cli[all]'
datacontract --version
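
Before installing with pip, it can help to confirm that the interpreter is one of the versions listed above. A small sketch follows; `is_supported` and `SUPPORTED_MINORS` are hypothetical names, and the version list simply mirrors the sentence above, so it may be out of date for other releases.

```python
import sys

# Minor versions of Python 3 listed as supported above.
SUPPORTED_MINORS = (10, 11, 12)


def is_supported(major: int, minor: int) -> bool:
    """Return True if the given Python version is one of the listed versions."""
    return major == 3 and minor in SUPPORTED_MINORS


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    status = "supported" if is_supported(major, minor) else "not listed as supported"
    print(f"Python {major}.{minor} is {status}")
```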

pip with venv

Typically it is better to install the application in a virtual environment for your projects:

cd my-project
python3.11 -m venv venv
source venv/bin/activate
pip install 'datacontract-cli[all]'
datacontract --version

pipx

pipx installs into an isolated environment.

pipx install 'datacontract-cli[all]'
datacontract --version

Docker

You can also use our Docker image to run the CLI tool. It is also convenient for CI/CD pipelines.

docker pull datacontract/cli
docker run --rm -v "${PWD}:/home/datacontract" datacontract/cli

You can create an alias for the Docker command to make it easier to use:

alias datacontract='docker run --rm -v "${PWD}:/home/datacontract" datacontract/cli:latest'

Note: The output of Docker command line messages is limited to 80 columns and may include line breaks. Don't pipe docker output to files if you want to export code. Use the --output option instead.
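
When the Docker image is driven from a script, the note above suggests writing exports with `--output` rather than capturing stdout. The sketch below builds such a command. The image name and mount point come from the commands above; `docker_export_command` is a hypothetical helper, and it assumes that relative paths inside the container resolve against the mounted directory.

```python
from pathlib import Path
from typing import List

IMAGE = "datacontract/cli"          # image name from the commands above
MOUNT_POINT = "/home/datacontract"  # container path the examples mount into


def docker_export_command(workdir: Path, fmt: str, contract: str, output: str) -> List[str]:
    """Build a `docker run` argv that exports `contract` and writes to `output`.

    `contract` and `output` are paths relative to `workdir` on the host
    (assumed to resolve against the mount point inside the container).
    """
    return [
        "docker", "run", "--rm",
        "-v", f"{workdir}:{MOUNT_POINT}",
        IMAGE,
        "export", fmt, "--output", output, contract,
    ]
```

For example, `docker_export_command(Path.cwd(), "sql", "odcs.yaml", "odcs.sql")` yields a list that can be passed to `subprocess.run(...)`.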

Optional Dependencies (Extras)

The CLI tool defines several optional dependencies (also known as extras) that can be installed for use with specific server types. The all extra includes all server dependencies.

uv tool install --python python3.11 --upgrade 'datacontract-cli[all]'

A list of available extras:

Dependency	Installation Command
Amazon Athena	`pip install datacontract-cli[athena]`
Avro Support	`pip install datacontract-cli[avro]`
Azure Integration	`pip install datacontract-cli[azure]`
Google BigQuery	`pip install datacontract-cli[bigquery]`
CSV	`pip install datacontract-cli[csv]`
Databricks Integration	`pip install datacontract-cli[databricks]`
DBML	`pip install datacontract-cli[dbml]`
DuckDB (local/S3/GCS/Azure file testing)	`pip install datacontract-cli[duckdb]`
Excel	`pip install datacontract-cli[excel]`
GCS Integration	`pip install datacontract-cli[gcs]`
Iceberg	`pip install datacontract-cli[iceberg]`
Impala	`pip install datacontract-cli[impala]`
Kafka Integration	`pip install datacontract-cli[kafka]`
MySQL Integration	`pip install datacontract-cli[mysql]`
Oracle	`pip install datacontract-cli[oracle]`
Parquet	`pip install datacontract-cli[parquet]`
PostgreSQL Integration	`pip install datacontract-cli[postgres]`
protobuf	`pip install datacontract-cli[protobuf]`
RDF	`pip install datacontract-cli[rdf]`
Amazon Redshift	`pip install datacontract-cli[redshift]`
S3 Integration	`pip install datacontract-cli[s3]`
Snowflake Integration	`pip install datacontract-cli[snowflake]`
Microsoft SQL Server	`pip install datacontract-cli[sqlserver]`
Trino	`pip install datacontract-cli[trino]`
API (run as web server)	`pip install datacontract-cli[api]`
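
Several extras can be combined in one requirement by separating them with commas inside the brackets, which is standard pip syntax. The sketch below builds such a requirement string. `requirement_for` and `KNOWN_EXTRAS` are hypothetical names, and the set only mirrors the table above plus `all`.

```python
from typing import Iterable

# Extras named in the table above, plus "all".
KNOWN_EXTRAS = {
    "all", "api", "athena", "avro", "azure", "bigquery", "csv", "databricks",
    "dbml", "duckdb", "excel", "gcs", "iceberg", "impala", "kafka", "mysql",
    "oracle", "parquet", "postgres", "protobuf", "rdf", "redshift", "s3",
    "snowflake", "sqlserver", "trino",
}


def requirement_for(extras: Iterable[str]) -> str:
    """Return a pip requirement such as 'datacontract-cli[postgres,s3]'."""
    unique = sorted(set(extras))
    unknown = [name for name in unique if name not in KNOWN_EXTRAS]
    if unknown:
        raise ValueError(f"Unknown extras: {', '.join(unknown)}")
    if not unique:
        return "datacontract-cli"
    return f"datacontract-cli[{','.join(unique)}]"
```

Quote the result when passing it to a shell, as in the install commands above, because some shells interpret square brackets.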

Documentation

📖 The full documentation is at docs.datacontract.com.

It covers everything in depth, including the complete command reference:

Quickstart — install and run your first test
Open Data Contract Standard — the contract format
Test your contract and Connect your Data — schema & quality tests against 18+ data sources
Define your Quality Rules — SQL, library, text, and custom checks
Sync with dbt · Edit your contract
Imports and Exports — convert to/from 25+ formats
API and Python Library
Command reference — init, lint, test, export, import, dbt, ci, catalog, publish, api, and more

Development Setup

Install uv
Python base interpreter should be 3.11.x.
A JDK (17 or 21) must be installed for the Spark-based tests (e.g. test_test_kafka.py, test_test_delta.py, test_test_dataframe.py, test_import_spark.py). Java 25 is not yet supported. On macOS and Linux you can install one with SDKMAN: sdk install java 21.0.11-tem (or any 21.x build from sdk list java). Verify with java --version.
Docker engine must be running to execute the tests.

sdk use java 21.0.11-tem
uv python pin 3.11
uv venv
uv pip install -e '.[dev]'
uv run ruff check
uv run pytest

Contribution

We are happy to receive your contributions. Propose your change in an issue or directly create a pull request with your improvements.

Before creating a pull request, please make sure that all tests are passing (uv run pytest) and your code is properly formatted (ruff format). Create a changelog entry and reference fixed issues (if any).

Troubleshooting

Windows: Some tests fail

Run the tests in WSL. (The paths in the tests still need to be fixed so that they work on native Windows. Contributions are appreciated.)

PyCharm does not pick up the `.venv`

This uv issue might be relevant.

Try to sync all groups:

uv sync --all-groups --all-extras

Errors in tests that use PySpark (e.g. test_test_kafka.py)

Ensure you have JDK 17 or 21 installed. Java 25 causes issues.

java --version
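
The version check can also be automated before running the Spark-based tests. This is a sketch, not part of the project: `parse_java_major` and `installed_java_major` are hypothetical helpers, and the parser assumes the usual `java --version` layout in which the first line reads like `openjdk 21.0.2 2024-01-16`.

```python
import re
import subprocess
from typing import Optional

SUPPORTED_JDKS = (17, 21)  # versions named in this section


def parse_java_major(version_output: str) -> Optional[int]:
    """Extract the major version from the first line of `java --version` output."""
    lines = version_output.strip().splitlines()
    if not lines:
        return None
    # First token is the launcher name, the second starts with the major version.
    match = re.match(r"\S+\s+(\d+)", lines[0])
    return int(match.group(1)) if match else None


def installed_java_major() -> Optional[int]:
    """Run `java --version` and return the major version, or None if unavailable."""
    try:
        result = subprocess.run(
            ["java", "--version"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        return None  # no java executable on the PATH
    return parse_java_major(result.stdout)


if __name__ == "__main__":
    major = installed_java_major()
    if major not in SUPPORTED_JDKS:
        print(f"Found Java {major}; expected one of {SUPPORTED_JDKS}.")
```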

Docker Build

docker build -t datacontract/cli .
docker run --rm -v "${PWD}:/home/datacontract" datacontract/cli

Docker compose integration

We've included a docker-compose.yml configuration to simplify the build, test, and deployment of the image.

Building the Image with Docker Compose

To build the Docker image using Docker Compose, run the following command:

docker compose build

This command uses the build context and Dockerfile location predefined in docker-compose.yml, so you do not have to specify them manually for each build.

Testing the Image

After building the image, you can test it directly with Docker Compose:

docker compose run --rm datacontract --version

This command starts a short-lived container that prints the version of the datacontract CLI. The --rm flag removes the container automatically after the command exits, keeping your environment clean.

Related Tools

Entropy Data is a commercial tool to manage data contracts. It contains a web UI, access management, and data governance for a data product marketplace based on data contracts.
Data Contract Editor is an editor for data contracts, including a live HTML preview.

License

MIT License

Credits

Created by Stefan Negele, Jochen Christ, and Simon Harrer.

Legal Notice · Privacy Policy
