Extract Microsoft Fabric semantic model metadata. AI-ready exports for LangChain, OpenAI, Semantic Kernel, AutoGen, and custom plugins. Multi-provider LLM enrichment, MCP server, and cross-model governance.
fabric-ai-meta
Extract, classify, and export metadata from Microsoft Fabric semantic models for AI frameworks.
- Automates Prep for AI across 100+ models. No manual configuration.
- Exports to LangChain, OpenAI, Semantic Kernel, AutoGen. One extraction, every framework.
- Governs at workspace scale: naming inconsistencies, duplicate measures, readiness scores.
- Speaks MCP: six tools for any MCP-aware AI agent or IDE.
Install
pip install fabric-ai-meta
Optional extras: [llm] for multi-provider LLM enrichment, [mcp] for the MCP server, [xmla] for description writeback. Combine: pip install 'fabric-ai-meta[llm,mcp,xmla]'. For development, clone and pip install -e ".[dev]". Every release also attaches a wheel and a source distribution to its GitHub release page for airgapped installs.
Quickstart
fabric-ai-meta analyze "Adventure Works" --workspace "Production" --mock --output ./output
Produces ai-ready-schema.json, readiness-score.json, measure-dependency-graph.json, and four framework exports in ./output/adventure-works/.
New to the tool? The end-to-end user guide walks every capability from install to writeback in plain language with persona-mapped workflow paths. The notebooks/quickstart.ipynb notebook gives the same tour inside a Fabric runtime.
Typical Workflows · The Problem · Who This Helps · Architecture · Usage · Output Files · LLM Enrichment · Library API · Plugins · Development
Typical Workflows
Different goals need different command sequences. Pick the one that matches you, or read the full user guide for the long version.
Solo BI developer exploring the tool
pip install fabric-ai-meta
fabric-ai-meta analyze "Adventure Works" --mock # try with bundled fixture
fabric-ai-meta analyze "Your Model" --workspace "Production" # then point at real workspace
fabric-ai-meta export openai "Your Model" --workspace "Production" # export to your AI framework
Enterprise governance team
pip install fabric-ai-meta
# Write a .fabric-ai-meta.toml with [extraction] default_workspace and thresholds
fabric-ai-meta scan --workspace "Production" --output ./snapshot
fabric-ai-meta governance --workspace "Production" --report ./governance-report.json
# Wire scripts/ci-governance-check.py into PR pipeline (see docs/ci-cd-guide.md)
# Track over time by re-running scan with --baseline ./previous/workspace-summary.json
AI engineer building agents on Fabric data
pip install 'fabric-ai-meta[llm,mcp]'
export ANTHROPIC_API_KEY=sk-ant-... # or any other supported provider
fabric-ai-meta analyze "Your Model" --workspace "Production" --llm-enrich
fabric-ai-meta export langchain "Your Model" --workspace "Production"
# Or expose live tools to your IDE via MCP:
fabric-ai-meta serve
Fabric architect cleaning a semantic model
pip install 'fabric-ai-meta[llm,xmla]'
fabric-ai-meta analyze "Sales Model" --workspace "Production" --llm-enrich
fabric-ai-meta export prep-for-ai "Sales Model" --workspace "Production" --llm-enrich
fabric-ai-meta apply-descriptions ./output/sales-model/prep-for-ai-config.json --mock # preview
fabric-ai-meta apply-descriptions ./output/sales-model/prep-for-ai-config.json --no-dry-run # commit (in a Fabric notebook)
The Problem: why this exists
Microsoft Fabric has invested heavily in AI features for semantic models: Prep for AI, Copilot-generated descriptions, Data Agents, and the emerging Fabric IQ Ontology. These are powerful, but they share three limitations:
1. They don't scale. Prep for AI requires manual configuration (selecting tables, writing AI Instructions, defining Verified Answers), one model at a time. An enterprise with 50 semantic models faces hundreds of hours of repetitive work. There is no bulk API.
2. They don't leave Fabric. Building a LangChain agent or an OpenAI function-calling pipeline against Fabric data? Microsoft offers no export path. Your AI application starts blind: no table types, no measure semantics, no relationship graph.
3. They don't govern across models. Copilot can describe a single measure, but it can't tell you that Total Sales in Model A and Sum of Sales in Model B are the same calculation with different names.
| Gap | What fabric-ai-meta does |
|---|---|
| Manual Prep for AI | Auto-generates prep-for-ai-config.json: table selections, AI Instructions, Verified Answers, description backfill |
| Manual description writeback | apply-descriptions writes generated table and column descriptions back through XMLA / TOM |
| No agent-callable surface | fabric-ai-meta serve exposes six tools through MCP for any MCP-aware AI agent or IDE |
| No external AI export | Produces framework-native schemas for LangChain, OpenAI, Semantic Kernel, and AutoGen |
| No way to add custom exporters | Third parties ship exporters as installable Python plugins via the fabric_ai_meta.exporters entry point group |
| No cross-model governance | Detects naming inconsistencies, duplicate DAX, ranks models by readiness, outputs governance report |
This is not a replacement for Microsoft's tools. It is an automation layer on top of them and a bridge to the external AI ecosystem.
Who This Helps: three practitioner profiles
Fabric Architects / Senior BI Developers. You manage 10-100+ semantic models. You need Prep for AI configured, descriptions filled in, naming standards enforced, at scale, not one model at a time. fabric-ai-meta gives you bulk workspace scan, auto-generated Prep for AI configs, LLM-powered description backfill, and a governance report across your entire estate.
AI/ML Engineers Building on Fabric Data. You're building agents or RAG pipelines that query semantic models. You need structured metadata in your framework's native format. You don't have deep DAX expertise. fabric-ai-meta gives you one-command export to LangChain, OpenAI, Semantic Kernel, or AutoGen, plus an AI-ready schema with query guidance, pitfalls, and measure dependency graphs.
Data Governance Teams. You need visibility into documentation completeness, naming consistency, and model quality across the estate. fabric-ai-meta gives you a governance scorecard, automated naming violation detection, and AI readiness scores broken down by description coverage, naming consistency, and relationship completeness.
Philosophy: five principles
- Extract everything. Tables, columns, measures, DAX, relationships, hierarchies, descriptions, formatting rules, hidden-object flags.
- Classify automatically. Every table gets a type (fact, dimension, bridge). Every measure gets a category (additive, semi-additive, time intelligence). Heuristics first; LLM refines.
- Score honestly. AI Readiness Score (0.0-1.0) broken down by description coverage, naming consistency, relationship completeness, and business rule documentation. No vanity metrics.
- Export universally. One extraction produces LangChain, OpenAI, Semantic Kernel, AutoGen, and custom pipeline outputs.
- Govern at scale. Naming inconsistencies, duplicate measures, documentation gaps across an entire workspace in a single command.
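The "heuristics first" principle can be illustrated with a small sketch. This is not the package's own classify_table_heuristic; the rule of thumb, the function name, and the input shape (relationship pairs running from the many side to the one side, plus a count of numeric columns) are all assumptions made for illustration.

```python
def classify_table(name, relationships, numeric_columns):
    """Label a table as 'fact', 'dimension', or 'bridge'.

    relationships: list of (many_side_table, one_side_table) pairs.
    numeric_columns: number of numeric (measurable) columns in the table.
    Hypothetical rule of thumb, not the tool's actual heuristic.
    """
    outgoing = sum(1 for many, _ in relationships if many == name)
    incoming = sum(1 for _, one in relationships if one == name)
    # A table that only links two or more other tables and carries
    # nothing to aggregate looks like a bridge.
    if outgoing >= 2 and numeric_columns == 0:
        return "bridge"
    # A table that points at lookup tables but is never looked up itself
    # looks like a fact table.
    if outgoing >= 1 and incoming == 0:
        return "fact"
    return "dimension"


relationships = [
    ("Sales", "Customer"),
    ("Sales", "Product"),
    ("CustomerSegment", "Customer"),
    ("CustomerSegment", "Segment"),
]
print(classify_table("Sales", relationships, numeric_columns=3))
print(classify_table("CustomerSegment", relationships, numeric_columns=0))
print(classify_table("Customer", relationships, numeric_columns=0))
```

A rule this simple will misclassify some tables, which is where an LLM refinement pass would plausibly step in.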
Architecture
sempy.fabric requires the Microsoft Fabric notebook runtime and does not work locally.
The tool operates in two modes detected automatically at startup:
flowchart TD
A([CLI command]) --> B{Environment?}
B -->|FABRIC_NOTEBOOK_ID set\nor notebookutils importable| C[Fabric Mode]
B -->|Local machine| D{--mock flag?}
D -->|Yes| E[Local/CI Mode]
D -->|No| F[FabricEnvironmentError]
C --> G[SemanticLinkExtractor\nAmbient credential]
E --> H[MockExtractor\nFixture JSON files]
G --> I[Core Engine]
H --> I
I --> J[Analyzer\nClassify · Score · Governance]
J --> K[Generator\nSchemas · Exports · Reports]
| Mode | Where it runs | Extractor | Auth |
|---|---|---|---|
| Fabric mode | Fabric notebook | SemanticLinkExtractor | Ambient (automatic) |
| Local/CI mode | Any machine | MockExtractor + fixture JSON | None needed |
Every command supports --mock: analyze, scan, export, score, governance, and apply-descriptions all work locally without a Fabric connection.
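The decision in the flowchart can be sketched in a few lines of Python. This follows the diagram literally and is not the package's internal code: the detect_mode function, its parameters, and the locally defined FabricEnvironmentError class are stand-ins so the example runs on its own.

```python
import importlib.util
import os


class FabricEnvironmentError(RuntimeError):
    """Stand-in for the error named in the flowchart (defined here for the sketch)."""


def _module_available(name):
    # find_spec returns None when a top-level module cannot be imported.
    return importlib.util.find_spec(name) is not None


def detect_mode(mock, environ=None, module_available=_module_available):
    """Return 'fabric' or 'local', or raise when neither mode applies."""
    environ = os.environ if environ is None else environ
    # Fabric mode: the notebook runtime sets FABRIC_NOTEBOOK_ID,
    # or notebookutils is importable.
    if environ.get("FABRIC_NOTEBOOK_ID") or module_available("notebookutils"):
        return "fabric"
    # Local machine: only allowed when --mock was passed.
    if mock:
        return "local"
    raise FabricEnvironmentError(
        "Not running in a Fabric notebook; pass --mock to use fixture data."
    )


print(detect_mode(mock=True, environ={}, module_available=lambda name: False))
```

Passing the environment and the module check as parameters keeps the sketch testable without a Fabric runtime.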
Usage
analyze: extract, classify, score, and export a single model
# Local dev with mock fixtures
fabric-ai-meta analyze "Adventure Works" --workspace "Production" --mock
# With LLM enrichment (generates missing descriptions)
fabric-ai-meta analyze "Adventure Works" --workspace "Production" --mock --llm-enrich
# Specify output directory
fabric-ai-meta analyze "Adventure Works" --workspace "Production" --mock --output ./output
scan: bulk scan all models in a workspace
fabric-ai-meta scan --workspace "Production" --mock --output ./output
Produces per-model output directories and a workspace-summary.json with score ranking.
Next steps: run governance for a cross-model report, or diff against a previous summary to track readiness over time.
export prep-for-ai: generate Prep for AI config
# Rule-based (no LLM)
fabric-ai-meta export prep-for-ai "Adventure Works" --workspace "Production" --mock
# With LLM-generated AI instructions
fabric-ai-meta export prep-for-ai "Adventure Works" --workspace "Production" --mock --llm-enrich
Output is a prep-for-ai-config.json you apply manually in Power BI Desktop or Fabric Service.
Next step: run apply-descriptions to push the generated description backfill straight to the live model through XMLA / TOM instead of pasting each one in the UI.
apply-descriptions: write generated descriptions back to a model
# Preview the writeback locally without contacting Fabric
fabric-ai-meta apply-descriptions ./output/adventure-works/prep-for-ai-config.json \
--workspace "Production" --mock
# Inside a Fabric notebook, dry-run against the live model (default)
fabric-ai-meta apply-descriptions ./prep-for-ai-config.json --workspace "Production"
# Commit the changes through XMLA / TOM
fabric-ai-meta apply-descriptions ./prep-for-ai-config.json --workspace "Production" --no-dry-run
Reads the generated_descriptions section of a prep-for-ai-config.json and applies them to table and column descriptions through the Tabular Object Model. --mock runs locally without any service contact; without --mock, the command must run inside a Fabric notebook runtime.
Prerequisite: the input file is produced by export prep-for-ai --llm-enrich. Without --llm-enrich, the config will not contain a generated_descriptions section to apply.
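A quick pre-flight check can catch a config that was exported without --llm-enrich before it reaches apply-descriptions. The sketch below assumes generated_descriptions is a top-level key of the JSON file; the helper names are hypothetical and the inner structure of the section is deliberately not inspected.

```python
import json
from pathlib import Path


def has_generated_descriptions(config):
    """True when the parsed config carries a non-empty generated_descriptions section.

    Assumption: the section is a top-level key of prep-for-ai-config.json.
    """
    return bool(config.get("generated_descriptions"))


def check_config_file(path):
    """Load a prep-for-ai-config.json from disk and run the same check."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    return has_generated_descriptions(config)


# Placeholder content; the real layout of the section is not shown here.
print(has_generated_descriptions({"generated_descriptions": {"placeholder": "text"}}))
print(has_generated_descriptions({}))
```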
governance: cross-model analysis and scorecard
fabric-ai-meta governance --workspace "Production" --mock --report ./governance-report.json
Detects naming inconsistencies, duplicate DAX expressions, and ranks models by AI readiness.
Next step: wire the report into CI/CD using scripts/ci-governance-check.py and the CI/CD guide.
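To make "duplicate DAX expressions" concrete, here is a minimal sketch of one way such a check could work: normalize each expression, then group measures that share the normalized form. The normalization rule and function names are assumptions for illustration, not the tool's actual algorithm.

```python
import re
from collections import defaultdict


def normalize_dax(expression):
    """Strip whitespace and upper-case the text so formatting differences do not hide duplicates.

    Crude on purpose: it also changes quoted names and string literals,
    so rare false positives are possible.
    """
    return re.sub(r"\s+", "", expression).upper()


def find_duplicate_measures(measures):
    """Group measures by normalized DAX.

    measures: iterable of (model, measure_name, dax_expression) tuples.
    Returns a list of groups, each a list of (model, measure_name).
    """
    groups = defaultdict(list)
    for model, name, dax in measures:
        groups[normalize_dax(dax)].append((model, name))
    return [members for members in groups.values() if len(members) > 1]


measures = [
    ("Model A", "Total Sales", "SUM(Sales[Amount])"),
    ("Model B", "Sum of Sales", "sum( Sales[Amount] )"),
    ("Model B", "Order Count", "COUNTROWS(Sales)"),
]
print(find_duplicate_measures(measures))
```

The sample data mirrors the Total Sales and Sum of Sales case described under The Problem: same calculation, different names, different models.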
score: AI readiness score for a model
fabric-ai-meta score "Adventure Works" --workspace "Production" --mock
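The readiness score is described above as a 0.0-1.0 value broken down into description coverage, naming consistency, relationship completeness, and business rule documentation. One plausible way to combine such components is a weighted average. The equal weights, key names, and function below are assumptions for illustration; the package's score_model may weigh things differently.

```python
def readiness_score(components, weights=None):
    """Combine component scores (each in 0.0-1.0) into one weighted average.

    Returns (score, breakdown). Equal weights are an assumption of this sketch.
    """
    for name, value in components.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")
    weights = weights or {name: 1.0 for name in components}
    total_weight = sum(weights[name] for name in components)
    score = sum(components[name] * weights[name] for name in components) / total_weight
    return round(score, 3), dict(components)


score, breakdown = readiness_score(
    {
        "description_coverage": 0.5,
        "naming_consistency": 1.0,
        "relationship_completeness": 0.75,
        "business_rule_documentation": 0.25,
    }
)
print(score)  # 0.625
```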
export: framework-specific exports
fabric-ai-meta export langchain "Adventure Works" --workspace "Production" --mock
fabric-ai-meta export openai "Adventure Works" --workspace "Production" --mock
fabric-ai-meta export semantic-kernel "Adventure Works" --workspace "Production" --mock
fabric-ai-meta export autogen "Adventure Works" --workspace "Production" --mock
Add your own format: subclass BaseExporter and register a Python entry point under fabric_ai_meta.exporters. Your exporter then appears as fabric-ai-meta export <name> with the same flags. See the plugin development guide for a worked dbt example.
Tip: run with --llm-enrich on the upstream analyze step first to fill in missing descriptions before exporting.
serve: start the MCP server for AI agents
# Install the optional MCP extra
pip install 'fabric-ai-meta[mcp]'
# Start over stdio (default; what most MCP-aware AI agents and IDEs use)
fabric-ai-meta serve
# Start over streamable HTTP on a specific port
fabric-ai-meta serve --transport streamable-http --port 8000
Exposes six tools to AI agents: list_models, analyze_model, score_model, generate_schema, governance_report, diff_summaries. A ready-to-use .mcp.json lives at the project root, so any IDE that auto-discovers project-scoped MCP servers picks it up when the working directory is opened.
Connect from a desktop MCP client: add the contents of .mcp.json to your client's mcpServers configuration. Common locations include %APPDATA%/<client>/<client>_config.json on Windows and ~/Library/Application Support/<client>/<client>_config.json on macOS.
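Merging the project file into a client config can be scripted rather than pasted by hand. The sketch below assumes both files keep their servers under a top-level mcpServers key; the server entry shown (name, command, args) is a hypothetical example and may not match the shipped .mcp.json.

```python
import copy


def merge_mcp_servers(client_config, project_config):
    """Return a copy of client_config with the project's MCP servers added.

    Existing client entries win on a name clash, so nothing the user
    already configured is overwritten.
    """
    merged = copy.deepcopy(client_config)
    servers = merged.setdefault("mcpServers", {})
    for name, entry in project_config.get("mcpServers", {}).items():
        servers.setdefault(name, copy.deepcopy(entry))
    return merged


# Hypothetical contents, for illustration only.
project = {"mcpServers": {"fabric-ai-meta": {"command": "fabric-ai-meta", "args": ["serve"]}}}
client = {"mcpServers": {"other-server": {"command": "other", "args": []}}}
print(sorted(merge_mcp_servers(client, project)["mcpServers"]))
```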
What this unlocks: ask your IDE agent "Which tables in our Sales Model are missing descriptions?" and get a real answer. The agent calls analyze_model and governance_report over MCP and returns the analysis without you running the CLI by hand.
diff: compare two workspace scans
# JSON output (default)
fabric-ai-meta diff baseline.json current.json
# Human-readable text
fabric-ai-meta diff baseline.json current.json --format text
# Save to file
fabric-ai-meta diff baseline.json current.json --output delta-report.json
Compares two workspace-summary.json files and reports: models added/removed, score changes, table/measure count changes, and per-model improvement or regression status.
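The comparison logic can be pictured with a reduced example. The sketch below works on plain dictionaries mapping model name to score, which is an assumed simplification; the real workspace-summary.json layout is defined by its schema, and the function name is hypothetical.

```python
def compare_scores(baseline, current):
    """Compare two {model_name: score} mappings.

    Reports models added and removed, plus a per-model delta and
    improved / regressed / unchanged status for models in both scans.
    """
    added = sorted(set(current) - set(baseline))
    removed = sorted(set(baseline) - set(current))
    changes = {}
    for name in sorted(set(baseline) & set(current)):
        delta = round(current[name] - baseline[name], 3)
        if delta > 0:
            status = "improved"
        elif delta < 0:
            status = "regressed"
        else:
            status = "unchanged"
        changes[name] = {"delta": delta, "status": status}
    return {"added": added, "removed": removed, "changes": changes}


baseline = {"Sales": 0.5, "Finance": 0.75, "Legacy": 0.25}
current = {"Sales": 0.75, "Finance": 0.5, "HR": 1.0}
print(compare_scores(baseline, current))
```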
Output Files
Per-model (written to {output}/{model-slug}/)
| File | Description |
|---|---|
| ai-ready-schema.json | Full AI-ready schema: tables, measures, query guidance, scoring |
| langchain-tool.json | LangChain tool definition |
| openai-function.json | OpenAI function calling schema |
| semantic-kernel-plugin.json | Semantic Kernel plugin manifest |
| autogen-tool.json | AutoGen tool definition with full model context |
| prep-for-ai-config.json | Prep for AI settings with step-by-step application guide |
| readiness-score.json | AI readiness score and component breakdown |
| measure-dependency-graph.json | DAX measure dependency graph |
| extraction-raw.json | Raw extracted metadata |
Workspace-level
| File | Description |
|---|---|
| workspace-summary.json | Score ranking, recommendations, model inventory across all models |
| governance-report.json | Cross-model naming issues, duplicate measures, governance scorecard |
LLM Enrichment
Add --llm-enrich to any command to enable LLM-powered analysis:
export ANTHROPIC_API_KEY=sk-ant-... # default provider; swap for another below
fabric-ai-meta analyze "Adventure Works" --workspace "Production" --mock --llm-enrich
Multi-provider support (via LiteLLM). Install the optional extra and pick any of 10+ providers in .fabric-ai-meta.toml:
pip install 'fabric-ai-meta[llm]'
| Provider | provider value | model example | API key env var |
|---|---|---|---|
| Anthropic | anthropic | claude-sonnet-4-6 | ANTHROPIC_API_KEY |
| OpenAI | openai | gpt-4o | OPENAI_API_KEY |
| Google Gemini | google | gemini-2.5-pro | GEMINI_API_KEY |
| xAI Grok | xai | grok-4 | XAI_API_KEY |
| Mistral | mistral | mistral-large-latest | MISTRAL_API_KEY |
| Cohere | cohere | command-r-plus | COHERE_API_KEY |
| AWS Bedrock | bedrock | anthropic.claude-sonnet-4-v1:0 | AWS_* (SDK default chain) |
| Azure OpenAI | azure | <deployment-name> | AZURE_OPENAI_API_KEY |
| Google Vertex AI | vertex | gemini-2.5-pro | GOOGLE_APPLICATION_CREDENTIALS |
| OpenAI-compatible | openai-compatible | <any> (set base_url) | OPENAI_COMPATIBLE_API_KEY |
The openai-compatible provider routes through any OpenAI-API-compatible host: Groq, Together, Fireworks, Ollama, LM Studio, vLLM, or a custom endpoint. Set base_url in config.
Example: OpenAI
[llm]
provider = "openai"
model = "gpt-4o"
api_key_env = "OPENAI_API_KEY"
Example: Azure OpenAI
[llm]
provider = "azure"
model = "gpt-4o-deployment"
api_key_env = "AZURE_OPENAI_API_KEY"
azure_endpoint = "https://my-resource.openai.azure.com"
azure_api_version = "2024-02-15-preview"
Example: Local Ollama (openai-compatible)
[llm]
provider = "openai-compatible"
model = "llama3.1"
base_url = "http://localhost:11434"
api_key_env = "OPENAI_COMPATIBLE_API_KEY" # any non-empty value works
What LLM enrichment adds:
- Missing table/column/measure descriptions (batch-generated, cached)
- Natural-language grain detection for fact tables
- AI Instructions text for Prep for AI configs
Cost controls (configure in .fabric-ai-meta.toml):
[llm]
max_cost_per_run = 0.20 # raises CostLimitExceededError if exceeded
cache_enabled = true # SHA-256 keyed file cache, no TTL
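The two controls above can be pictured as a content-addressed file cache plus a running spend counter. The sketch below is an illustration under assumptions, not the package's implementation: the cache key inputs, the file layout, the cached_call and CostGuard names, and the locally defined CostLimitExceededError are all stand-ins.

```python
import hashlib
import json
from pathlib import Path


class CostLimitExceededError(RuntimeError):
    """Stand-in for the error named in the config comment (defined here for the sketch)."""


def cache_key(provider, model, prompt):
    """SHA-256 over the request inputs; identical requests map to the same key."""
    payload = json.dumps([provider, model, prompt], ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_call(cache_dir, provider, model, prompt, call):
    """Return a cached result if present, otherwise call and store it (no expiry)."""
    path = Path(cache_dir) / f"{cache_key(provider, model, prompt)}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    result = call(prompt)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(result), encoding="utf-8")
    return result


class CostGuard:
    """Track spend for one run and refuse charges that would pass the limit."""

    def __init__(self, max_cost_per_run):
        self.max_cost_per_run = max_cost_per_run
        self.spent = 0.0

    def charge(self, cost):
        if self.spent + cost > self.max_cost_per_run:
            raise CostLimitExceededError(
                f"run would cost {self.spent + cost:.2f}, limit is {self.max_cost_per_run:.2f}"
            )
        self.spent += cost
```

With a cache like this, a repeated request for the same prompt never reaches the provider, so it adds nothing to the run's spend.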
Library API
All public functions are importable directly from the top-level package:
from fabric_ai_meta import (
    MockExtractor,
    SemanticModelMeta,
    score_model,
    generate_ai_ready_schema,
    to_openai_function,
    to_langchain_tool_definition,
    classify_table_heuristic,
)
# Load a model from fixture
extractor = MockExtractor(fixture_path="tests/fixtures/adventure_works.json")
model = extractor.extract("Adventure Works")
# Score it
score, breakdown = score_model(model)
# Generate exports
schema = generate_ai_ready_schema(model)
openai_fn = to_openai_function(model)
See fabric_ai_meta.__all__ for the full list of 35 public exports.
Plugins
Custom exporters install as ordinary Python packages and appear as fabric-ai-meta export <name> with the same --workspace and --mock flags as built-ins. No fork of fabric-ai-meta is required.
from fabric_ai_meta import BaseExporter, SemanticModelMeta

class DbtExporter(BaseExporter):
    name = "dbt"
    output_filename = "dbt-sources.yml"
    description = "dbt sources definition"

    def generate(self, model: SemanticModelMeta) -> dict:
        return {"version": 2, "sources": [...]}
Register the class via the fabric_ai_meta.exporters entry-point group in the plugin's pyproject.toml:
[project.entry-points."fabric_ai_meta.exporters"]
dbt = "my_fabric_dbt_plugin:DbtExporter"
Full walk-through with a worked dbt example, local testing, and name-conflict rules: docs/plugin-development.md.
Development
# Install dev dependencies
pip install -e ".[dev]"
# Run full test suite (400 tests, no Fabric runtime or real LLM calls required)
pytest tests/ -x -q
# Run with coverage
pytest tests/ --cov=fabric_ai_meta
All tests run locally. Fabric-dependent code is mocked via MockExtractor and fixture JSON files in tests/fixtures/.
Fabric Notebooks
| Notebook | Purpose |
|---|---|
| notebooks/quickstart.ipynb | End-to-end walkthrough: authentication, model listing, analysis, export, governance |
| notebooks/tmdl-spike.ipynb | Research spike: inspect getDefinition TMDL output for AI Instructions and Verified Answers (companion to docs/research/tmdl-prep-for-ai-spike.md) |
CI/CD Integration
Wire fabric-ai-meta into GitHub Actions or Azure DevOps to enforce governance thresholds on every PR and track readiness trends on a schedule. The guide in docs/ci-cd-guide.md includes ready-to-paste workflow files and the standalone scripts/ci-governance-check.py threshold script.
Output Schemas
JSON Schema files for all output formats are in schemas/:
| Schema | Validates |
|---|---|
| schemas/v1.json | ai-ready-schema.json output |
| schemas/workspace-summary/v1.json | workspace-summary.json output |
| schemas/governance-report/v1.json | governance-report.json output |
| schemas/prep-for-ai/v1.json | prep-for-ai-config.json output |
Environment Variables
| Variable | Required | Description |
|---|---|---|
| ANTHROPIC_API_KEY | For --llm-enrich with default anthropic provider | LLM API key |
| OPENAI_API_KEY / GEMINI_API_KEY / XAI_API_KEY / MISTRAL_API_KEY / COHERE_API_KEY / AZURE_OPENAI_API_KEY / OPENAI_COMPATIBLE_API_KEY | For the matching provider in [llm] config | See provider matrix above |
| AWS_* (Bedrock), GOOGLE_APPLICATION_CREDENTIALS (Vertex) | When using Bedrock or Vertex | Standard SDK credential chains |
| FABRIC_NOTEBOOK_ID | Auto-set in Fabric | Signals Fabric notebook runtime |