Local-first AI investigation CLI for OpenMetadata data pipelines.
OpenBlame
OpenBlame is a local-first CLI investigation agent for data pipelines running on OpenMetadata. Point it to a table, and it will trace lineage, inspect recent quality failures, parse schema-change events, surface governance gaps, and collect owner metadata to build a concrete incident narrative.
The reasoning layer runs on a local Ollama model, so investigations stay inside your environment. This makes OpenBlame useful for secure internal datasets and fast incident response workflows where you want reproducible, metadata-driven triage without external LLM APIs. The project behaves like "git blame for your data pipeline," but with lineage, observability, and governance context in one investigation loop.
Why It Stands Out
- Autonomous investigation loop: plan, gather metadata, reason, and draft an incident report
- Native OpenMetadata story: lineage, quality, schema history, owners, tags, domain, and tier
- Governance-aware triage: missing owner, missing tier, missing tags, and missing description are surfaced as explicit operational risks
- Local-first by default: Ollama only, no external LLM API required
- Demo-ready outputs: Rich terminal UX, markdown incident report, and MCP server wrapper
Architecture
                       +--------------------+
                       |   openblame CLI    |
                       |  Typer + Rich UX   |
                       +---------+----------+
                                 |
                                 v
                       +--------------------+
                       |  OpenBlame Agent   |
                       |  ReAct-style loop  |
                       +----+----------+----+
                            |          |
              +-------------+          +------------------+
              v                                           v
+---------------------------+               +---------------------------+
| OpenMetadata REST tools   |               | Local Ollama Reasoner     |
| lineage/quality/diff/owner|               | plan() + reason()         |
+-------------+-------------+               +-------------+-------------+
              |                                           |
              v                                           v
   +---------------------+                     +---------------------+
   | Structured evidence |-------------------->| Markdown report     |
   +---------------------+                     +---------------------+
Prerequisites
- Python 3.11+
- OpenMetadata instance reachable from your machine
- OpenMetadata JWT token with read access
- Ollama installed locally and running (ollama serve)
- An installed local model such as llama3
Installation
pip install openblame
For local development:
pip install -e ".[dev]"
Run tests with plugin autoload disabled so unrelated third-party pytest plugins from openmetadata-ingestion do not interfere:
PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 python -m pytest tests/ -v -p pytest_asyncio.plugin
Quick Start
- Copy .env.example to .env and set credentials.
- Run:
openblame investigate default.public.orders --depth 3 --days 7
Example output:
[CRITICAL] default.public.orders
Root Cause: `order_total` type changed from DECIMAL to STRING without downstream migration.
Impact: 6 downstream tables plus BI dashboard refresh failures.
Owner: Data Platform (data-platform@company.com)
Suggested Fix: Restore compatible type, backfill, rerun failed checks.
Demo Flow
The strongest demo is a single broken metric traced end to end:
- Pick a table with a recent schema drift or quality failure.
- Run openblame investigate <table_fqn>.
- Show the agent plan, anomaly panels, governance risk briefing, and final incident report.
- Highlight the downstream blast radius and owner handoff.
- End by showing the MCP server or generated GitHub issue payload.
CLI Commands
Investigate
openblame investigate <table_fqn> --depth 3 --days 7 --output report.md --model llama3
Runs the full investigation loop, prints a Rich report, optionally writes markdown, and can suggest a GitHub issue payload.
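To make the optional markdown output concrete, here is a minimal sketch of rendering an incident summary to markdown. The `Incident` dataclass and `render_markdown` function are hypothetical names for illustration; the fields mirror the example output above, but the real report layout may differ.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    # Field names are assumptions modeled on the example output.
    severity: str
    table_fqn: str
    root_cause: str
    impact: str
    owner: str
    suggested_fix: str


def render_markdown(incident: Incident) -> str:
    """Render one incident as a small markdown document."""
    lines = [
        f"# [{incident.severity}] {incident.table_fqn}",
        "",
        f"- **Root Cause:** {incident.root_cause}",
        f"- **Impact:** {incident.impact}",
        f"- **Owner:** {incident.owner}",
        f"- **Suggested Fix:** {incident.suggested_fix}",
    ]
    return "\n".join(lines) + "\n"


EXAMPLE = Incident(
    severity="CRITICAL",
    table_fqn="default.public.orders",
    root_cause="`order_total` type changed from DECIMAL to STRING without downstream migration.",
    impact="6 downstream tables plus BI dashboard refresh failures.",
    owner="Data Platform (data-platform@company.com)",
    suggested_fix="Restore compatible type, backfill, rerun failed checks.",
)

if __name__ == "__main__":
    print(render_markdown(EXAMPLE))
```

Writing the returned string to the path given by --output would produce a file along these lines.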
Schema Diff
openblame diff default.public.orders --days 7
Prints table schema changes over the lookback window.
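Conceptually, a schema diff compares two column snapshots. The sketch below assumes each snapshot is a plain `{column_name: data_type}` dict; `diff_schema` is a hypothetical helper, not OpenBlame's actual implementation, which works from OpenMetadata's schema-change events.

```python
def diff_schema(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """Describe the differences between two {column: type} snapshots."""
    changes = []
    # Walk the union of column names in a stable, sorted order.
    for name in sorted(old.keys() | new.keys()):
        if name not in new:
            changes.append(f"removed column {name} ({old[name]})")
        elif name not in old:
            changes.append(f"added column {name} ({new[name]})")
        elif old[name] != new[name]:
            changes.append(f"{name} type changed from {old[name]} to {new[name]}")
    return changes


if __name__ == "__main__":
    before = {"order_id": "BIGINT", "order_total": "DECIMAL"}
    after = {"order_id": "BIGINT", "order_total": "STRING", "currency": "STRING"}
    for change in diff_schema(before, after):
        print(change)
```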
Lineage
openblame lineage default.public.orders --depth 3 --direction both
Renders upstream and downstream lineage as a Rich tree.
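The --depth option bounds how far the traversal goes. One way to picture this is a breadth-first walk over lineage edges that stops expanding at the requested depth. This is a sketch under assumptions: the `edges` adjacency dict and the `downstream_within` function are illustrative, and the real lineage comes from the OpenMetadata API.

```python
from collections import deque


def downstream_within(edges: dict[str, list[str]], start: str, depth: int) -> dict[str, int]:
    """Return {table_fqn: hops_from_start} for tables within `depth` hops."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == depth:
            continue  # do not expand beyond the requested depth
        for child in edges.get(node, []):
            if child not in seen:  # guards against cycles and duplicates
                seen[child] = seen[node] + 1
                queue.append(child)
    del seen[start]  # the start table is not part of its own blast radius
    return seen


if __name__ == "__main__":
    # Hypothetical lineage edges, for illustration only.
    lineage = {
        "default.public.orders": ["default.public.orders_daily"],
        "default.public.orders_daily": ["default.public.revenue"],
    }
    print(downstream_within(lineage, "default.public.orders", depth=3))
```

Running the same walk over reversed edges would give the upstream side for --direction both.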
MCP Server
openblame mcp-server
Starts OpenBlame as an MCP stdio server exposing investigation tools.
Configuration
OpenBlame reads .env and environment variables:
OPENMETADATA_HOST=http://localhost:8585
OPENMETADATA_JWT_TOKEN=<token>
OLLAMA_HOST=http://localhost:11434
OLLAMA_MODEL=llama3
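As a rough sketch of how these variables might be consumed, the snippet below loads them into a typed settings object. The `Settings` class and `load_settings` function are hypothetical, and treating the values shown above as defaults is an assumption; parsing of the .env file itself is left out.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    openmetadata_host: str
    openmetadata_jwt_token: str
    ollama_host: str
    ollama_model: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, failing fast without a token."""
    token = env.get("OPENMETADATA_JWT_TOKEN", "")
    if not token:
        raise ValueError("OPENMETADATA_JWT_TOKEN is not set")
    return Settings(
        # Trailing slashes are stripped so URL paths can be appended safely.
        openmetadata_host=env.get("OPENMETADATA_HOST", "http://localhost:8585").rstrip("/"),
        openmetadata_jwt_token=token,
        ollama_host=env.get("OLLAMA_HOST", "http://localhost:11434").rstrip("/"),
        ollama_model=env.get("OLLAMA_MODEL", "llama3"),
    )


if __name__ == "__main__":
    print(load_settings({"OPENMETADATA_JWT_TOKEN": "example-token"}))
```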
MCP Server Setup
The server exposes:
- investigate_table({ table_fqn, depth, days })
- get_lineage({ table_fqn, depth, direction })
- get_schema_diff({ table_fqn, days })
Use stdio transport with:
openblame mcp-server
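MCP clients talk to a stdio server using JSON-RPC 2.0 messages, and tools are invoked with the `tools/call` method. The sketch below only builds such a request for one of the tools listed above; the `tool_call_request` helper is hypothetical, the argument values are illustrative, and a real client must complete the MCP `initialize` handshake before calling tools.

```python
import json
from typing import Any


def tool_call_request(request_id: int, name: str, arguments: dict[str, Any]) -> str:
    """Serialize an MCP tools/call request as a single line of JSON."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    # The stdio transport is newline-delimited, so the JSON must stay on one line.
    return json.dumps(message)


if __name__ == "__main__":
    print(
        tool_call_request(
            1,
            "investigate_table",
            {"table_fqn": "default.public.orders", "depth": 3, "days": 7},
        )
    )
```

In practice an MCP client library handles this framing; the example is only meant to show what travels over stdin.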
How It Works
- Fetch baseline metadata such as owners and schema snapshot.
- Ask Ollama to produce an investigation plan.
- Execute OpenMetadata tools in parallel across lineage, quality, schema history, and ownership.
- Convert raw metadata into evidence, anomalies, governance risks, and downstream blast radius summaries.
- Send gathered evidence back to Ollama for incident reasoning.
- Render and optionally persist a markdown report.
Tool failures are non-fatal. OpenBlame continues with partial data whenever possible.
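The parallel, failure-tolerant gathering step can be sketched with `asyncio.gather` and `return_exceptions=True`, which collects exceptions as results instead of aborting the batch. This is an illustration under assumptions, not OpenBlame's actual code: `gather_evidence` and the stub tools are hypothetical stand-ins for the OpenMetadata REST calls.

```python
import asyncio
from typing import Any, Awaitable, Callable

Tool = Callable[[str], Awaitable[Any]]


async def gather_evidence(
    table_fqn: str, tools: dict[str, Tool]
) -> tuple[dict[str, Any], dict[str, str]]:
    """Run all tools concurrently; return (evidence, failures) keyed by tool name."""
    names = list(tools)
    results = await asyncio.gather(
        *(tools[name](table_fqn) for name in names),
        return_exceptions=True,  # a failing tool must not cancel the others
    )
    evidence: dict[str, Any] = {}
    failures: dict[str, str] = {}
    for name, result in zip(names, results):
        if isinstance(result, BaseException):
            failures[name] = repr(result)
        else:
            evidence[name] = result
    return evidence, failures


# Stub tools standing in for real OpenMetadata calls (illustrative only).
async def stub_lineage(table_fqn: str) -> dict[str, list[str]]:
    return {"downstream": [f"{table_fqn}_daily"]}


async def stub_quality(table_fqn: str) -> dict[str, Any]:
    raise TimeoutError("quality API timed out")


if __name__ == "__main__":
    tools = {"lineage": stub_lineage, "quality": stub_quality}
    evidence, failures = asyncio.run(gather_evidence("default.public.orders", tools))
    print("evidence:", evidence)
    print("failures:", failures)
```

Keeping failures in a separate mapping lets the reasoning step see which evidence is missing rather than silently treating it as "no anomalies found".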
Publishing
This repo is set up for GitHub Actions CI and can be wired for PyPI Trusted Publishing. Once the PyPI package-name issue is resolved, publishing can be automated from GitHub releases.
Hackathon Context
Built for the WeMakeDevs x OpenMetadata hackathon as an AI-powered metadata investigator focused on local-first reasoning, blast-radius analysis, and governance-aware incident response workflows.