Python SDK and AI agent toolkit for the OpenDataProducts.org standards family, supporting ODPS, ODPC, ODPG, ODPV, MCP, CLI workflows, and LLM-assisted generation

Open Data Products Python SDK for AI Agents

An AI-agent-first Python SDK for the OpenDataProducts.org standards family. It gives agents, agent hosts, and automation systems one consistent surface for loading, detecting, validating, explaining, searching, traversing, and summarizing documents across the ODPS, ODPC, ODPG, and ODPV specifications.

The package still includes developer-facing Python helpers, but the primary contract is agent-ready: structured validation results, lightweight artifact summaries, reference discovery, Data Contract orchestration, bundled retrieval resources, a unified CLI, an MCP stdio server, and an ARWS agent manifest.

Installation

pip install open-data-products

# Optional Data Contract validation adapter:
pip install "open-data-products[contracts]"

# For development:
pip install "open-data-products[dev]"

AI Agent-First SDK

Why Agent First

  • One cross-spec entry point: Agents can call load_document, validate_document, explain_document, and resolve_references across ODPS, ODPC, ODPG, and ODPV files.
  • Structured outputs: Validation, references, resources, summaries, and graph reasoning helpers return predictable objects that are easy for agents to inspect.
  • Small-context workflows: load_summary returns metadata, size, hash, spec, kind, and id without returning full document bodies.
  • Retrieval-ready resources: Bundled schemas, prompt templates, vocabulary records, catalog object records, and graph object records are discoverable through list_resources and MCP tools.
  • Agent-ready ODPC and ODPV helpers: Catalog building, catalog artifact checks, vocabulary term resolution, canonical term packets, relationship compatibility checks, and term context packets are available through Python, CLI, and MCP surfaces where safe.
  • Graph reasoning for agents: ODPG helpers support graph summaries, traversal, strategic analysis, and trusted focus-node context extraction.
  • Data Contract orchestration: Optional datacontract-cli integration validates external contracts while the SDK resolves ODPS contract references, extracts schemas, checks static product-contract alignment, and returns agent-ready reports.
  • Host integration: MCP-capable tools can launch open-data-products serve, while ARWS-compatible systems can read the generated manifest.
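
To make the small-context idea concrete, here is a minimal sketch of a summary that describes a file by size and hash without returning its body. It uses only the standard library. The summarize_file name and the returned keys are illustrative assumptions, not the SDK's load_summary implementation.

```python
import hashlib
import tempfile
from pathlib import Path


def summarize_file(path):
    """Return size and SHA-256 for a file without exposing its contents.

    Hypothetical helper for illustration; not part of open_data_products.
    """
    data = Path(path).read_bytes()
    return {
        "path": str(path),
        "size_bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


if __name__ == "__main__":
    # Write a tiny sample document so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "product.yaml"
        sample.write_text("product:\n  id: demo\n", encoding="utf-8")
        print(summarize_file(sample))
```

An agent can keep a record like this in context and fetch the full document only when the hash changes or the body is actually needed.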

Unified Agent API

Use the top-level API when building AI agents, automation, validation pipelines, or tools that need to work across the Open Data Products standards family without knowing the spec namespace ahead of time:

from open_data_products import (
    explain_document,
    generate_local_artifact,
    generate_local_artifacts,
    load_generation_prompt,
    list_resources,
    load_document,
    resolve_references,
    validate_document,
)

document = load_document("examples/product.yaml")
result = validate_document(document)

print(result.valid, result.spec, result.kind)
print(explain_document(document))

for reference in resolve_references(document):
    print(reference.pointer, reference.ref)

for resource in list_resources():
    print(resource.id, resource.spec, resource.type)

prompt = load_generation_prompt("odps_data_product_fragment.md")
signal = generate_local_artifact(
    "signal",
    "open_data_products/generation/source_docs/turnaround-delay-signal.txt",
    "open_data_products/generation/fragments",
)
all_artifacts = generate_local_artifacts(
    "open_data_products/generation/source_docs",
    "open_data_products/generation/fragments",
)

The top-level CLI exposes the same workflow with machine-readable output:

open-data-products validate examples/product.yaml --json
open-data-products explain examples/product.yaml --json
open-data-products refs graph.yaml --json
open-data-products resources --json
open-data-products summary examples/product.yaml      # lightweight reference: size, hash, spec
open-data-products manifest --json           # ARWS agent manifest
open-data-products serve                     # MCP server over stdio
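
For scripts that consume the machine-readable output, one possible pattern is a thin subprocess wrapper. This is a sketch under assumptions: build_command and run_json are hypothetical helper names, and the code assumes that a command given --json writes a single JSON document to stdout. The payload shape is not specified here.

```python
import json
import subprocess


def build_command(subcommand, *args):
    """Build an argv list for the CLI with --json appended."""
    return ["open-data-products", subcommand, *args, "--json"]


def run_json(subcommand, *args):
    """Run the CLI and parse stdout as JSON.

    Assumes a single JSON document on stdout when --json is passed.
    Returns (exit_code, parsed_payload).
    """
    completed = subprocess.run(
        build_command(subcommand, *args),
        capture_output=True,
        text=True,
        check=False,  # let the caller decide how to treat non-zero exits
    )
    return completed.returncode, json.loads(completed.stdout)


if __name__ == "__main__":
    # Only prints the argv; call run_json(...) once the package is installed.
    print(build_command("validate", "examples/product.yaml"))
```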

Data Contract support is optional and product-oriented. The SDK recognizes native ODPS /product/contract references ($ref, contractURL, and inline spec) as well as practical extension-style references such as extensions.dataContract.href. External contract lint/export uses datacontract-cli when installed; inline ODPS contract specs are used for static summaries and alignment without running live source tests.

from open_data_products import (
    check_product_contract_alignment,
    extract_contract_schema,
    generate_product_contract_report,
    resolve_product_contracts,
    summarize_contract,
    validate_contract,
)

for reference in resolve_product_contracts("examples/product.yaml"):
    print(reference.pointer, reference.href)

print(validate_contract("examples/contract.yaml").passed)
print(extract_contract_schema("examples/contract.yaml").field_count)
print(check_product_contract_alignment("examples/product.yaml", "examples/contract.yaml").summary)
print(generate_product_contract_report("examples/product.yaml").summary)
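
The reference shapes described above can be pictured with a small dictionary walk. This is a sketch only: find_contract_references is a hypothetical function, the placement of extensions at the document root is an assumption, and the SDK's resolve_product_contracts may inspect more locations or return richer objects.

```python
def find_contract_references(document):
    """Collect (pointer, value) pairs for contract references in a product dict.

    The dictionary layout is assumed for illustration.
    """
    found = []
    contract = document.get("product", {}).get("contract", {})
    if isinstance(contract, dict):
        if "$ref" in contract:
            found.append(("/product/contract/$ref", contract["$ref"]))
        if "contractURL" in contract:
            found.append(("/product/contract/contractURL", contract["contractURL"]))
        if "spec" in contract:
            # Inline specs carry the contract body itself, so only flag them.
            found.append(("/product/contract/spec", "<inline>"))
    href = document.get("extensions", {}).get("dataContract", {}).get("href")
    if href:
        found.append(("/extensions/dataContract/href", href))
    return found


if __name__ == "__main__":
    example = {
        "product": {"contract": {"contractURL": "https://example.com/contract.yaml"}},
        "extensions": {"dataContract": {"href": "contracts/contract.yaml"}},
    }
    for pointer, value in find_contract_references(example):
        print(pointer, value)
```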

Agent Surface (MCP + ARWS)

Run open-data-products serve to expose the SDK as a local MCP server, or open-data-products manifest --json to render the ARWS manifest. See Agent surface for Codex/Claude Code setup, MCP tools, and bundled skills.
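
Many MCP hosts register a stdio server through a JSON entry that names the command to launch. The sketch below generates such an entry for open-data-products serve. The mcpServers layout is a common host convention rather than something this README specifies, the server name is a hypothetical label, and the file the entry belongs in depends on the host, so treat the Agent surface guide as the authoritative setup reference.

```python
import json


def mcp_server_entry(name="open-data-products"):
    """Return an mcpServers-style config that launches the stdio server.

    `name` is a free-form label chosen by the host user (hypothetical here).
    """
    return {
        "mcpServers": {
            name: {
                "command": "open-data-products",
                "args": ["serve"],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(mcp_server_entry(), indent=2))
```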

Package Structure

Use open_data_products.<spec> namespaces for every standard:

Namespace                  Standard                          Status
open_data_products.odps    Open Data Product Specification   Implemented
open_data_products.odpc    Open Data Product Catalog         Catalog helpers implemented
open_data_products.odpg    Open Data Product Graph           Graph helpers implemented
open_data_products.odpv    Open Data Product Vocabulary      Vocabulary tools implemented

Capabilities at a Glance

Area               What agents and developers can do
Cross-spec API     Detect, load, validate, explain, summarize, and resolve references across ODPS, ODPC, ODPG, and ODPV
MCP + ARWS         Run a local stdio MCP server, expose safe tools, and generate an ARWS agent manifest
ODPS               Create, load, validate, serialize, and inspect ODPS v4.1 data product documents
ODPC               Build catalogs from fragments, validate catalogs, explain catalog metadata, search bundled catalog object guidance, and generate/check derived catalog schema artifacts
ODPG               Validate graphs, summarize nodes and edges, traverse relationships, analyze governance/strategy signals, and extract agent context
ODPV               Load, validate, search, generate vocabulary artifacts, resolve terms and aliases, explain canonical term packets, check relationships, and produce agent context for shared ODP terminology
Data Contracts     Resolve ODPS contract references, validate external contracts through optional datacontract-cli, extract schemas, check static alignment, and generate product-level reports
Bundled resources  Discover schemas, examples, vocabulary records, catalog object records, and graph object records through the resource registry

ODPS support is scoped to the 4.x generation of the specification. The SDK primarily targets ODPS v4.1 and keeps backward-compatible support for ODPS v4.0 documents.

ODPS field validation includes ISO language, country, currency, date/time, phone, email, and URI formats where those standards apply.
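
As a rough illustration of what format-level checks look like, the sketch below tests only the shape of a few codes (two-letter language, two-letter country, three-letter currency). It is an assumption-laden stand-in: check_formats is a hypothetical function, and shape checks do not confirm that a code is actually registered in the corresponding ISO standard, which the SDK's own validators may do.

```python
import re

LANGUAGE_RE = re.compile(r"[a-z]{2}")  # ISO 639-1 shape: two lowercase letters
COUNTRY_RE = re.compile(r"[A-Z]{2}")   # ISO 3166-1 alpha-2 shape
CURRENCY_RE = re.compile(r"[A-Z]{3}")  # ISO 4217 shape


def check_formats(language, country, currency):
    """Return a list of error messages; an empty list means all shapes match."""
    errors = []
    if not LANGUAGE_RE.fullmatch(language):
        errors.append(f"Invalid ISO 639-1 language code: {language!r}")
    if not COUNTRY_RE.fullmatch(country):
        errors.append(f"Invalid ISO 3166-1 alpha-2 country code: {country!r}")
    if not CURRENCY_RE.fullmatch(currency):
        errors.append(f"Invalid ISO 4217 currency code: {currency!r}")
    return errors


if __name__ == "__main__":
    print(check_formats("en", "FI", "EUR"))
    print(check_formats("xyz", "FI", "EUR"))
```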

Usage Guide

This README is intentionally a short landing page. Use the focused references below for implementation details:

  • API reference: Agent API, spec helper namespaces, ODPS models, validators, serialization, and examples.
  • Agent surface: MCP server, ARWS manifest, and bundled skills for agent hosts.
  • Command guide: what each common CLI command does, what it reads, and what it writes.
  • LLM generation: Ollama or configured external LLM source-doc to ODPC fragment and ODPG graph workflow.
  • Data Contract workflows: ODPS contract resolution, optional datacontract-cli, alignment, and reports.
  • Capability drift reports: dated SDK alignment reports against upstream specification tooling.
  • Tooling development model: human-facing explanation of how spec-level scripts mature into consolidated SDK capabilities.
  • Functional test report: public API, CLI, and MCP functional coverage matrix.
  • Example scripts: runnable ODPS examples, including v4.1 strategy and MCP access examples.
  • Course-style guides: simple human SDK workflows and LLM generation lessons.
  • Sample apps: independent CLIs built on top of the SDK.
  • Agent handoff: compact machine-readable routing for AI agents.

Common Workflows

Most commands print human-readable output by default; add --json when agents, CI jobs, or scripts need a stable machine-readable response. See the command guide for what each command reads, checks, and produces.

# Cross-spec validation and summaries
open-data-products validate examples/product.yaml --json
open-data-products explain examples/odpc_catalog.yaml --json
open-data-products refs open_data_products/odpg/data/graph/graph.yaml --json
open-data-products summary examples/product.yaml

# Bundled agent resources
open-data-products resources --json
open-data-products resources --id generation.prompt.system --json
open-data-products resources --id odpc.objects --json
open-data-products resources --id odpv.terms --json
open-data-products resources --id odpg.objects --json

The LLM generation commands require Ollama or configured provider credentials.

Use the bundled default config and bundled prompts as-is:

# LLM generation
open-data-products generate \
  --input source_docs/ \
  --output generated/ \
  --json

open-data-products generate \
  --input source_docs/turnaround-delay-signal.txt \
  --kind signal \
  --output generated/ \
  --json

Customize provider, model, or paths with a project-owned config:

open-data-products config generation --copy-to my-generation.config.yaml
open-data-products config generation --config my-generation.config.yaml --print
open-data-products config generation --config my-generation.config.yaml --check

open-data-products generate \
  --config my-generation.config.yaml \
  --input source_docs/ \
  --output generated/ \
  --json

When installed from PyPI, the bundled generation config lives inside the package as a template. Copy it to a project-owned file before editing provider or model settings; do not edit files under site-packages. The my-generation.config.yaml name used in these commands is only an example for your copied file. You can also pass a folder path, such as --copy-to config/, and missing folders are created automatically.

Override the configured provider or model for a single run when testing a different LLM:

open-data-products generate \
  --config my-generation.config.yaml \
  --provider groq \
  --model openai/gpt-oss-120b \
  --input source_docs/ \
  --output generated/ \
  --json

open-data-products generate \
  --config my-generation.config.yaml \
  --provider claude \
  --model claude-sonnet-4-5 \
  --input source_docs/turnaround-delay-signal.txt \
  --kind signal \
  --output generated/ \
  --json

Generation uses bundled prompt templates by default. If you want to customize the prompts, copy them to a project-owned folder, edit the Markdown files, and pass that folder with --prompts:

open-data-products config generation --copy-prompts-to prompts/

open-data-products generate \
  --config my-generation.config.yaml \
  --prompts prompts/ \
  --input source_docs/ \
  --output generated/ \
  --json

# Generated fragment artifacts
open-data-products validate open_data_products/generation/fragments/odpg_graph.yaml --json
open-data-products odpg-generate open_data_products/generation/fragments/odpg_graph.yaml --output /tmp/odp-generation-graph.html --json

# ODPC catalog helpers
open-data-products odpc-build examples/odpc_catalog_fragments/ --output /tmp/odp-catalog.yaml --json
open-data-products odpc-build examples/odpc_catalog_fragments/ --output /tmp/odp-catalog.yaml --html /tmp/odp-catalog.html --json
open-data-products odpc-summary /tmp/odp-catalog.yaml --json
open-data-products odpc-search "catalog data" --limit 3 --json

# ODPV vocabulary helpers
open-data-products odpv-summary --json
open-data-products odpv-search "governance policy risk" --limit 3 --json
open-data-products odpv-resolve "reusable data asset" --json
open-data-products odpv-explain DataProduct --json
open-data-products odpv-relationship DataProduct supports UseCase --json
open-data-products odpv-context DataProduct --json

# ODPG graph reasoning
open-data-products odpg-summary open_data_products/odpg/data/graph/graph.yaml
open-data-products odpg-traverse open_data_products/odpg/data/graph/graph.yaml --start AGENT-AVIATION-001 --depth 2
open-data-products odpg-analyze open_data_products/odpg/data/graph/graph.yaml
open-data-products odpg-agent-context open_data_products/odpg/data/graph/graph.yaml --node AGENT-AVIATION-001 --depth 2
open-data-products odpg-convert --input examples/graph.graphml --output /tmp/odp-converted-graph.yaml --json
open-data-products odpg-generate open_data_products/odpg/data/graph/graph.yaml --output /tmp/odp-graph-explorer.html --json

# Product-level Data Contract inspection
open-data-products product resolve-contracts examples/product.yaml --json
open-data-products product contract-schema examples/contract.yaml --json

See Data Contract workflows for product contract resolution, optional datacontract-cli integration, alignment checks, reports, and supported ODPS contract reference shapes. Live LLM generation requires Ollama or a configured provider API key; see LLM generation for runnable provider examples.

Spec-Specific Entry Points

  • open_data_products.generation: editable prompt templates and provider-backed generation helpers for ODPS, ODPC, and ODPG YAML artifacts. Defaults to local Ollama/Qwen 2.5 and can use copied config templates for external providers such as OpenAI.
  • open_data_products.odps: ODPS v4.1 models, standards-aware validation, YAML/JSON I/O, compliance helpers, and pricing_to_402.
  • open_data_products.odpc: ODPC catalog building, loading, validation, explanation, and object guidance search.
  • open_data_products.odpg: ODPG graph validation, summary, traversal, analysis, agent context, object search, external graph conversion, and graph explorer generation.
  • open_data_products.odpv: ODPV vocabulary loading, validation, search, and generated vocabulary artifacts.
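
The depth-limited traversal idea behind commands such as odpg-traverse --start ... --depth 2 can be sketched as a breadth-first walk. This is a generic illustration, not the SDK's algorithm: traverse is a hypothetical function, edges are assumed to be directed (source, target) pairs, and every node ID other than AGENT-AVIATION-001 is invented for the example.

```python
from collections import deque


def traverse(edges, start, depth):
    """Breadth-first walk returning {node: hops_from_start} within `depth` hops."""
    neighbors = {}
    for source, target in edges:
        neighbors.setdefault(source, []).append(target)

    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == depth:
            continue  # do not expand past the depth limit
        for nxt in neighbors.get(node, []):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return seen


if __name__ == "__main__":
    # Hypothetical edges for illustration only.
    edges = [
        ("AGENT-AVIATION-001", "PRODUCT-A"),
        ("PRODUCT-A", "USECASE-B"),
        ("USECASE-B", "KPI-C"),
    ]
    print(traverse(edges, "AGENT-AVIATION-001", 2))
```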

Development

git clone https://github.com/Open-Data-Product-Initiative/odps-python
cd odps-python
pip install -e ".[dev]"
python examples/basic_usage.py

Dependencies

The library requires the following runtime packages:

  • PyYAML: YAML format support
  • jsonschema: ODPC and ODPG schema validation

Error Handling

The library provides detailed validation error messages that reference specific standards:

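# `odp` is an ODPS document object created or loaded earlier; import
# ODPSValidationError from the SDK before running this snippet.
try:
    odp.validate()
except ODPSValidationError as e:
    print(e)
    # Output: "Validation errors: Invalid ISO 639-1 language code: 'xyz';
    #          dataHolder email must be a valid RFC 5322 email address"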
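
The message format shown above, several findings joined into one exception, can be reproduced with a small self-contained sketch. ValidationFailed, run_checks, and the two check functions are hypothetical stand-ins, not SDK classes, and the email check is a deliberately rough placeholder rather than real RFC 5322 validation.

```python
class ValidationFailed(Exception):
    """Stand-in for a validation exception that aggregates messages."""

    def __init__(self, errors):
        self.errors = list(errors)
        super().__init__("Validation errors: " + "; ".join(self.errors))


def check_language(document):
    code = document.get("language", "")
    if not (len(code) == 2 and code.isascii() and code.isalpha() and code.islower()):
        return f"Invalid ISO 639-1 language code: {code!r}"
    return None


def check_email(document):
    # Rough placeholder only; not a real RFC 5322 parser.
    if "@" not in document.get("email", ""):
        return "dataHolder email must be a valid RFC 5322 email address"
    return None


def run_checks(document, checks):
    """Run every check, then raise once with all collected messages."""
    errors = [msg for msg in (check(document) for check in checks) if msg]
    if errors:
        raise ValidationFailed(errors)


if __name__ == "__main__":
    try:
        run_checks({"language": "xyz", "email": "nobody"}, [check_language, check_email])
    except ValidationFailed as exc:
        print(exc)
```

Collecting every finding before raising lets an agent repair a document in one pass instead of rerunning validation after each fix.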

Examples

ODPS v4.1 Example

See examples/odps_v41_example.py for a demonstration of key v4.1 features including:

  • ProductStrategy with business objectives
  • KPI definitions with targets and calculations
  • AI agent integration via MCP
  • Enhanced $ref support

Run the example:

python examples/odps_v41_example.py

Additional Examples

Generation Inputs And Outputs

See LLM generation for source documents, prompts, provider configuration, generated fragments, ODPG graph YAML, and graph explorer output.

Sample Apps

The examples/apps/ folder contains independent, runnable Python sample apps built on top of the SDK. Each app lives in its own folder with a cli.py entry point and can be run directly from the repository root.

  • ODP Document Inspector CLI: inspect any ODPS, ODPC, ODPG, or ODPV YAML/JSON document and print validation, explanation, references, and bundled resource metadata.
  • ODPV Vocabulary Finder CLI: search bundled ODPV terms by natural-language query and print definitions, scores, matched fields, and related terms.
  • ODPS Pricing 402 Builder CLI: build an HTTP 402 payment envelope from an ODPS product with pricing plans.

python examples/apps/document_inspector/cli.py examples/apps/pricing_402_builder/priced_product.yaml
python examples/apps/vocabulary_finder/cli.py "governance policy risk" --limit 5 --json
python examples/apps/pricing_402_builder/cli.py examples/apps/pricing_402_builder/priced_product.yaml --json

Acknowledgments

We extend our gratitude to the following:

Open Data Product Initiative Team - Special thanks to the team at opendataproducts.org for creating and maintaining the emerging Open Data Product standards family, including the Open Data Product Specification (ODPS), Open Data Product Catalog (ODPC), Open Data Product Graphs (ODPG), and Open Data Product Vocabulary (ODPV). Their vision of standardizing data product descriptions, catalogs, graphs, and shared vocabulary has made this SDK possible. These specifications represent years of collaborative effort from industry experts, data practitioners, and open source contributors who are driving the future of data standardization.

Chris Howard / Kitard - Special thanks to Chris Howard from Accenture for creating the original odps-python library. His foundational work made it possible to extend the project into the broader Open Data Products SDK and agent toolkit.

devlouie - Special thanks to devlouie for contributing the MCP layer and Agent Surface on top of the SDK, helping make the Open Data Products standards family easier to use from agentic tools and workflows.

Data Contract CLI - Special thanks to Stefan Negele, Jochen Christ, and Simon Harrer for creating Data Contract CLI, the open source execution engine this SDK can optionally use for external Data Contract validation, export, and ecosystem interoperability.

Python Community - For the exceptional ecosystem of libraries and tools that power this implementation, including PyYAML, jsonschema, and the countless other packages that make Python development a joy.

Data Community - For embracing open standards and driving the need for better data product specifications and tooling that benefits everyone in the data ecosystem.

Documentation Support - Documentation assistance provided by Claude (Anthropic).

Contributing

Contributions are welcome. Please read CONTRIBUTING.md for guidelines, browse the open issues, and consider helping with new features, bug fixes, examples, documentation, or agent-facing workflow improvements.

License

Apache License 2.0 - see LICENSE file for details.
