Extract Microsoft Fabric semantic model metadata. AI-ready exports for LangChain, OpenAI, Semantic Kernel, AutoGen, and custom plugins. Multi-provider LLM enrichment, MCP server, and cross-model governance.

fabric-ai-meta

Extract, classify, and export metadata from Microsoft Fabric semantic models for AI frameworks.

Automates Prep for AI across 100+ models. No manual configuration. Exports to LangChain, OpenAI, Semantic Kernel, AutoGen. One extraction, every framework. Governs at workspace scale: naming inconsistencies, duplicate measures, readiness scores. Speaks MCP: six tools for any MCP-aware AI agent or IDE.

Install

pip install fabric-ai-meta

Optional extras: [llm] for multi-provider LLM enrichment, [mcp] for the MCP server, [xmla] for description writeback. Combine: pip install 'fabric-ai-meta[llm,mcp,xmla]'. For development, clone and pip install -e ".[dev]". Every release also attaches a wheel and a source distribution to its GitHub release page for airgapped installs.

Quickstart

fabric-ai-meta analyze "Adventure Works" --workspace "Production" --mock --output ./output

Produces ai-ready-schema.json, readiness-score.json, measure-dependency-graph.json, and four framework exports in ./output/adventure-works/.

New to the tool? The end-to-end user guide walks every capability from install to writeback in plain language with persona-mapped workflow paths. The notebooks/quickstart.ipynb notebook gives the same tour inside a Fabric runtime.

Typical Workflows · The Problem · Who This Helps · Architecture · Usage · Output Files · LLM Enrichment · Library API · Plugins · Development


Typical Workflows

Different goals need different command sequences. Pick the one that matches you, or read the full user guide for the long version.

Solo BI developer exploring the tool
pip install fabric-ai-meta
fabric-ai-meta analyze "Adventure Works" --mock                    # try with bundled fixture
fabric-ai-meta analyze "Your Model" --workspace "Production"       # then point at real workspace
fabric-ai-meta export openai "Your Model" --workspace "Production" # export to your AI framework

Enterprise governance team
pip install fabric-ai-meta
# Write a .fabric-ai-meta.toml with [extraction] default_workspace and thresholds
fabric-ai-meta scan --workspace "Production" --output ./snapshot
fabric-ai-meta governance --workspace "Production" --report ./governance-report.json
# Wire scripts/ci-governance-check.py into PR pipeline (see docs/ci-cd-guide.md)
# Track over time by re-running scan with --baseline ./previous/workspace-summary.json

AI engineer building agents on Fabric data
pip install 'fabric-ai-meta[llm,mcp]'
export ANTHROPIC_API_KEY=sk-ant-...   # or any other supported provider
fabric-ai-meta analyze "Your Model" --workspace "Production" --llm-enrich
fabric-ai-meta export langchain "Your Model" --workspace "Production"
# Or expose live tools to your IDE via MCP:
fabric-ai-meta serve

Fabric architect cleaning a semantic model
pip install 'fabric-ai-meta[llm,xmla]'
fabric-ai-meta analyze "Sales Model" --workspace "Production" --llm-enrich
fabric-ai-meta export prep-for-ai "Sales Model" --workspace "Production" --llm-enrich
fabric-ai-meta apply-descriptions ./output/sales-model/prep-for-ai-config.json --mock           # preview
fabric-ai-meta apply-descriptions ./output/sales-model/prep-for-ai-config.json --no-dry-run     # commit (in a Fabric notebook)

The Problem: why this exists

Microsoft Fabric has invested heavily in AI features for semantic models: Prep for AI, Copilot-generated descriptions, Data Agents, and the emerging Fabric IQ Ontology. These are powerful, but they share three limitations:

1. They don't scale. Prep for AI requires manual configuration (selecting tables, writing AI Instructions, defining Verified Answers), one model at a time. An enterprise with 50 semantic models faces hundreds of hours of repetitive work. There is no bulk API.

2. They don't leave Fabric. Building a LangChain agent or an OpenAI function-calling pipeline against Fabric data? Microsoft offers no export path. Your AI application starts blind: no table types, no measure semantics, no relationship graph.

3. They don't govern across models. Copilot can describe a single measure, but it can't tell you that Total Sales in Model A and Sum of Sales in Model B are the same calculation with different names.

| Gap | What fabric-ai-meta does |
| --- | --- |
| Manual Prep for AI | Auto-generates prep-for-ai-config.json: table selections, AI Instructions, Verified Answers, description backfill |
| Manual description writeback | apply-descriptions writes generated table and column descriptions back through XMLA / TOM |
| No agent-callable surface | fabric-ai-meta serve exposes six tools through MCP for any MCP-aware AI agent or IDE |
| No external AI export | Produces framework-native schemas for LangChain, OpenAI, Semantic Kernel, and AutoGen |
| No way to add custom exporters | Third parties ship exporters as installable Python plugins via the fabric_ai_meta.exporters entry point group |
| No cross-model governance | Detects naming inconsistencies and duplicate DAX, ranks models by readiness, and outputs a governance report |

This is not a replacement for Microsoft's tools. It is an automation layer on top of them and a bridge to the external AI ecosystem.

Who This Helps: three practitioner profiles

Fabric Architects / Senior BI Developers. You manage 10-100+ semantic models. You need Prep for AI configured, descriptions filled in, naming standards enforced, at scale, not one model at a time. fabric-ai-meta gives you bulk workspace scan, auto-generated Prep for AI configs, LLM-powered description backfill, and a governance report across your entire estate.

AI/ML Engineers Building on Fabric Data. You're building agents or RAG pipelines that query semantic models. You need structured metadata in your framework's native format. You don't have deep DAX expertise. fabric-ai-meta gives you one-command export to LangChain, OpenAI, Semantic Kernel, or AutoGen, plus an AI-ready schema with query guidance, pitfalls, and measure dependency graphs.

Data Governance Teams. You need visibility into documentation completeness, naming consistency, and model quality across the estate. fabric-ai-meta gives you a governance scorecard, automated naming violation detection, and AI readiness scores broken down by description coverage, naming consistency, relationship completeness, and business rule documentation.

Philosophy: five principles
  1. Extract everything. Tables, columns, measures, DAX, relationships, hierarchies, descriptions, formatting rules, hidden-object flags.
  2. Classify automatically. Every table gets a type (fact, dimension, bridge). Every measure gets a category (additive, semi-additive, time intelligence). Heuristics first; LLM refines.
  3. Score honestly. AI Readiness Score (0.0-1.0) broken down by description coverage, naming consistency, relationship completeness, and business rule documentation. No vanity metrics.
  4. Export universally. One extraction produces LangChain, OpenAI, Semantic Kernel, AutoGen, and custom pipeline outputs.
  5. Govern at scale. Naming inconsistencies, duplicate measures, documentation gaps across an entire workspace in a single command.
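To make the "heuristics first" idea in principle 2 concrete, here is a minimal sketch of a rule-based table classifier. The TableProfile fields and the three rules are hypothetical illustrations and are not taken from fabric-ai-meta; the library's own heuristics (exposed as classify_table_heuristic) may work quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableProfile:
    """Hypothetical summary of one table; not a fabric-ai-meta type."""
    name: str
    many_side_relationships: int  # relationships where this table is the "many" side
    one_side_relationships: int   # relationships where this table is the "one" side
    numeric_value_columns: int    # numeric columns that are not keys


def classify_table(profile: TableProfile) -> str:
    """Return 'fact', 'dimension', 'bridge', or 'unknown' using simplified rules."""
    # A table that only links two or more other tables and carries no values
    # looks like a bridge table.
    if profile.many_side_relationships >= 2 and profile.numeric_value_columns == 0:
        return "bridge"
    # A table on the many side that carries numeric values looks like a fact table.
    if profile.many_side_relationships >= 1 and profile.numeric_value_columns > 0:
        return "fact"
    # A table that other tables look up into looks like a dimension.
    if profile.one_side_relationships >= 1:
        return "dimension"
    return "unknown"


sales = TableProfile("Sales", many_side_relationships=3, one_side_relationships=0, numeric_value_columns=4)
customer = TableProfile("Customer", many_side_relationships=0, one_side_relationships=1, numeric_value_columns=0)
link = TableProfile("CustomerAccount", many_side_relationships=2, one_side_relationships=0, numeric_value_columns=0)

print(classify_table(sales), classify_table(customer), classify_table(link))
```

Rules like these are cheap and deterministic, which is why an LLM pass is described as a refinement step rather than the first line of classification.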

Architecture

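sempy.fabric requires the Microsoft Fabric notebook runtime and does not work locally, so the tool operates in one of two modes. At startup it detects the Fabric runtime automatically; on a local machine it requires the --mock flag and otherwise raises FabricEnvironmentError: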

flowchart TD
    A([CLI command]) --> B{Environment?}
    B -->|FABRIC_NOTEBOOK_ID set\nor notebookutils importable| C[Fabric Mode]
    B -->|Local machine| D{--mock flag?}
    D -->|Yes| E[Local/CI Mode]
    D -->|No| F[FabricEnvironmentError]

    C --> G[SemanticLinkExtractor\nAmbient credential]
    E --> H[MockExtractor\nFixture JSON files]

    G --> I[Core Engine]
    H --> I

    I --> J[Analyzer\nClassify · Score · Governance]
    J --> K[Generator\nSchemas · Exports · Reports]

| Mode | Where it runs | Extractor | Auth |
| --- | --- | --- | --- |
| Fabric mode | Fabric notebook | SemanticLinkExtractor | Ambient (automatic) |
| Local/CI mode | Any machine | MockExtractor + fixture JSON | None needed |

The analyze, scan, export, score, governance, and apply-descriptions commands all support --mock and work locally without a Fabric connection.
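The decision in the flowchart can be sketched in a few lines of Python. This is an illustration of the documented behavior, not the tool's actual code: the detect_mode function and its return values are hypothetical, and FabricEnvironmentError is redefined here as a local stand-in for the error named in the diagram.

```python
import importlib.util
import os


class FabricEnvironmentError(RuntimeError):
    """Local stand-in for the error named in the flowchart."""


def detect_mode(mock: bool, environ=None) -> str:
    """Return 'fabric' or 'local', following the flowchart's branches."""
    env = os.environ if environ is None else environ
    # Fabric is detected when FABRIC_NOTEBOOK_ID is set or notebookutils is importable.
    in_fabric = bool(env.get("FABRIC_NOTEBOOK_ID")) or (
        importlib.util.find_spec("notebookutils") is not None
    )
    if in_fabric:
        return "fabric"
    if mock:
        return "local"
    raise FabricEnvironmentError(
        "Not running in a Fabric notebook; pass --mock to use fixture data."
    )


print(detect_mode(mock=False, environ={"FABRIC_NOTEBOOK_ID": "abc"}))
```

How --mock behaves when it is passed inside a Fabric notebook is not covered by the diagram, so the sketch simply follows the Fabric branch in that case.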


Usage

analyze: extract, classify, score, and export a single model
# Local dev with mock fixtures
fabric-ai-meta analyze "Adventure Works" --workspace "Production" --mock

# With LLM enrichment (generates missing descriptions)
fabric-ai-meta analyze "Adventure Works" --workspace "Production" --mock --llm-enrich

# Specify output directory
fabric-ai-meta analyze "Adventure Works" --workspace "Production" --mock --output ./output

scan: bulk scan all models in a workspace
fabric-ai-meta scan --workspace "Production" --mock --output ./output

Produces per-model output directories and a workspace-summary.json with score ranking.

Next steps: run governance for a cross-model report, or diff against a previous summary to track readiness over time.

export prep-for-ai: generate Prep for AI config
# Rule-based (no LLM)
fabric-ai-meta export prep-for-ai "Adventure Works" --workspace "Production" --mock

# With LLM-generated AI instructions
fabric-ai-meta export prep-for-ai "Adventure Works" --workspace "Production" --mock --llm-enrich

Output is a prep-for-ai-config.json you apply manually in Power BI Desktop or Fabric Service.

Next step: run apply-descriptions to push the generated description backfill straight to the live model through XMLA / TOM instead of pasting each one in the UI.

apply-descriptions: write generated descriptions back to a model
# Preview the writeback locally without contacting Fabric
fabric-ai-meta apply-descriptions ./output/adventure-works/prep-for-ai-config.json \
  --workspace "Production" --mock

# Inside a Fabric notebook, dry-run against the live model (default)
fabric-ai-meta apply-descriptions ./prep-for-ai-config.json --workspace "Production"

# Commit the changes through XMLA / TOM
fabric-ai-meta apply-descriptions ./prep-for-ai-config.json --workspace "Production" --no-dry-run

Reads the generated_descriptions section of a prep-for-ai-config.json and applies them to table and column descriptions through the Tabular Object Model. --mock runs locally without any service contact; without --mock, the command must run inside a Fabric notebook runtime.

Prerequisite: the input file is produced by export prep-for-ai --llm-enrich. Without --llm-enrich, the config will not contain a generated_descriptions section to apply.
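A quick pre-flight check can catch a config that was exported without --llm-enrich before you attempt a writeback. The sketch below assumes generated_descriptions is a top-level key of the JSON file, which the text above suggests but does not spell out; the helper name and the placeholder contents written to the temporary files are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def has_generated_descriptions(config_path) -> bool:
    """True if the config has a non-empty generated_descriptions section.

    Assumes the section is a top-level key; its inner shape is not inspected.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    return bool(config.get("generated_descriptions"))


with tempfile.TemporaryDirectory() as tmp:
    enriched = Path(tmp) / "enriched.json"
    plain = Path(tmp) / "plain.json"
    # Placeholder payloads for illustration only.
    enriched.write_text(json.dumps({"generated_descriptions": {"placeholder": "text"}}), encoding="utf-8")
    plain.write_text(json.dumps({"tables": []}), encoding="utf-8")

    enriched_ok = has_generated_descriptions(enriched)
    plain_ok = has_generated_descriptions(plain)

print(enriched_ok, plain_ok)
```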

governance: cross-model analysis and scorecard
fabric-ai-meta governance --workspace "Production" --mock --report ./governance-report.json

Detects naming inconsistencies, duplicate DAX expressions, and ranks models by AI readiness.
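One common way to find duplicate DAX across models is to normalize each expression and group measures by a hash of the result. The sketch below illustrates that general technique under simple assumptions; the function names are hypothetical and the normalization used by fabric-ai-meta itself is not documented here.

```python
import hashlib
import re
from collections import defaultdict


def normalize_dax(expression: str) -> str:
    """Strip whitespace and lowercase, since DAX keywords and names are case-insensitive.

    This is deliberately crude: it also removes spaces inside object names,
    so [Sales Amount] and [SalesAmount] would be treated as the same.
    """
    return re.sub(r"\s+", "", expression).lower()


def find_duplicate_measures(measures):
    """Group (model, measure_name, dax) tuples that share a normalized expression."""
    groups = defaultdict(list)
    for model, name, dax in measures:
        key = hashlib.sha256(normalize_dax(dax).encode("utf-8")).hexdigest()
        groups[key].append((model, name))
    # Only groups with more than one member are duplicates.
    return [members for members in groups.values() if len(members) > 1]


measures = [
    ("Model A", "Total Sales", "SUM(Sales[Amount])"),
    ("Model B", "Sum of Sales", "sum( Sales[Amount] )"),
    ("Model B", "Order Count", "COUNTROWS(Sales)"),
]
duplicates = find_duplicate_measures(measures)
print(duplicates)
```

Text normalization only catches expressions that differ in formatting or case. Two measures that compute the same value through different DAX would need deeper analysis.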

Next step: wire the report into CI/CD using scripts/ci-governance-check.py and the CI/CD guide.

score: AI readiness score for a model
fabric-ai-meta score "Adventure Works" --workspace "Production" --mock

export: framework-specific exports
fabric-ai-meta export langchain "Adventure Works" --workspace "Production" --mock
fabric-ai-meta export openai "Adventure Works" --workspace "Production" --mock
fabric-ai-meta export semantic-kernel "Adventure Works" --workspace "Production" --mock
fabric-ai-meta export autogen "Adventure Works" --workspace "Production" --mock

Add your own format: subclass BaseExporter and register a Python entry point under fabric_ai_meta.exporters. Your exporter then appears as fabric-ai-meta export <name> with the same flags. See the plugin development guide for a worked dbt example.

Tip: run with --llm-enrich on the upstream analyze step first to fill in missing descriptions before exporting.
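For readers unfamiliar with function-calling schemas, the sketch below builds a tool definition in the shape the OpenAI Chat Completions API accepts ("type": "function" wrapping a JSON Schema for the parameters). It illustrates the target format only; the helper, the tool name, and the choice of a single measure parameter are assumptions, and the actual contents of openai-function.json may differ.

```python
import json


def build_openai_tool(model_name: str, measures: dict) -> dict:
    """Build a function-calling tool definition from {measure_name: description}."""
    names = sorted(measures)
    return {
        "type": "function",
        "function": {
            # Hypothetical naming scheme, chosen for this example.
            "name": "query_" + model_name.lower().replace(" ", "_"),
            "description": f"Evaluate a measure from the '{model_name}' semantic model.",
            "parameters": {
                "type": "object",
                "properties": {
                    "measure": {
                        "type": "string",
                        "enum": names,
                        "description": "; ".join(f"{n}: {measures[n]}" for n in names),
                    }
                },
                "required": ["measure"],
            },
        },
    }


tool = build_openai_tool(
    "Adventure Works",
    {"Total Sales": "Sum of sales amount", "Order Count": "Number of sales rows"},
)
print(json.dumps(tool, indent=2))
```

Measure descriptions end up inside the schema the model reads, which is why filling in missing descriptions before exporting matters.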

serve: start the MCP server for AI agents
# Install the optional MCP extra
pip install 'fabric-ai-meta[mcp]'

# Start over stdio (default; what most MCP-aware AI agents and IDEs use)
fabric-ai-meta serve

# Start over streamable HTTP on a specific port
fabric-ai-meta serve --transport streamable-http --port 8000

Exposes six tools to AI agents: list_models, analyze_model, score_model, generate_schema, governance_report, diff_summaries. A ready-to-use .mcp.json lives at the project root, so any IDE that auto-discovers project-scoped MCP servers picks it up when the working directory is opened.

Connect from a desktop MCP client: add the contents of .mcp.json to your client's mcpServers configuration. Common locations include %APPDATA%/<client>/<client>_config.json on Windows and ~/Library/Application Support/<client>/<client>_config.json on macOS.

What this unlocks: ask your IDE agent "Which tables in our Sales Model are missing descriptions?" and get a real answer. The agent calls analyze_model and governance_report over MCP and returns the analysis without you running the CLI by hand.
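Merging the project's server entry into a desktop client's configuration is a small dictionary update. The sketch below assumes both files keep their servers under a top-level mcpServers key, as the paragraph above describes; the server entry shown is a hypothetical example, since the real contents of .mcp.json are not reproduced in this README.

```python
import json


def merge_mcp_servers(client_config: dict, project_mcp: dict) -> dict:
    """Return a copy of client_config with the project's mcpServers merged in."""
    merged = dict(client_config)
    servers = dict(client_config.get("mcpServers", {}))
    # Project entries win on name clashes.
    servers.update(project_mcp.get("mcpServers", {}))
    merged["mcpServers"] = servers
    return merged


# Hypothetical contents, for illustration only.
client_config = {"mcpServers": {"existing": {"command": "other-tool"}}}
project_mcp = {"mcpServers": {"fabric-ai-meta": {"command": "fabric-ai-meta", "args": ["serve"]}}}

merged = merge_mcp_servers(client_config, project_mcp)
print(json.dumps(merged, indent=2))
```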

diff: compare two workspace scans
# JSON output (default)
fabric-ai-meta diff baseline.json current.json

# Human-readable text
fabric-ai-meta diff baseline.json current.json --format text

# Save to file
fabric-ai-meta diff baseline.json current.json --output delta-report.json

Compares two workspace-summary.json files and reports: models added/removed, score changes, table/measure count changes, and per-model improvement or regression status.
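The comparison logic can be pictured as a set difference plus a per-model delta. The sketch below works on plain {model_name: score} mappings, which is an assumption made for brevity; the real workspace-summary.json layout and the exact output of the diff command are not shown in this README.

```python
def diff_scores(baseline: dict, current: dict, tolerance: float = 1e-9) -> dict:
    """Compare two {model_name: readiness_score} mappings."""
    added = sorted(set(current) - set(baseline))
    removed = sorted(set(baseline) - set(current))
    changes = {}
    for name in sorted(set(baseline) & set(current)):
        delta = current[name] - baseline[name]
        if delta > tolerance:
            status = "improved"
        elif delta < -tolerance:
            status = "regressed"
        else:
            status = "unchanged"
        # Round to avoid floating-point noise in the report.
        changes[name] = {"delta": round(delta, 4), "status": status}
    return {"added": added, "removed": removed, "changes": changes}


report = diff_scores(
    baseline={"Sales": 0.60, "HR": 0.80, "Legacy": 0.50},
    current={"Sales": 0.75, "HR": 0.70, "Finance": 0.90},
)
print(report)
```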


Output Files

Per-model (written to {output}/{model-slug}/)

| File | Description |
| --- | --- |
| ai-ready-schema.json | Full AI-ready schema: tables, measures, query guidance, scoring |
| langchain-tool.json | LangChain tool definition |
| openai-function.json | OpenAI function calling schema |
| semantic-kernel-plugin.json | Semantic Kernel plugin manifest |
| autogen-tool.json | AutoGen tool definition with full model context |
| prep-for-ai-config.json | Prep for AI settings with step-by-step application guide |
| readiness-score.json | AI readiness score and component breakdown |
| measure-dependency-graph.json | DAX measure dependency graph |
| extraction-raw.json | Raw extracted metadata |
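If you script against the output directory, you need to predict the {model-slug} part of the path. The sketch below uses a slug rule that matches the two examples in this README ("Adventure Works" becomes adventure-works, "Sales Model" becomes sales-model); the rule itself is an assumption, and names with unusual characters may be handled differently by the tool.

```python
import re
from pathlib import Path


def model_slug(model_name: str) -> str:
    """Lowercase the name and collapse runs of non-alphanumerics into hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", model_name.lower())
    return slug.strip("-")


def model_output_dir(output_root, model_name: str) -> Path:
    """Directory where per-model files are expected under this assumed rule."""
    return Path(output_root) / model_slug(model_name)


schema_path = model_output_dir("./output", "Adventure Works") / "ai-ready-schema.json"
print(schema_path.as_posix())
```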

Workspace-level

| File | Description |
| --- | --- |
| workspace-summary.json | Score ranking, recommendations, model inventory across all models |
| governance-report.json | Cross-model naming issues, duplicate measures, governance scorecard |

LLM Enrichment

Add --llm-enrich to a command that supports it (analyze and export prep-for-ai in the examples above) to enable LLM-powered analysis:

export ANTHROPIC_API_KEY=sk-ant-...      # default provider; swap for another below

fabric-ai-meta analyze "Adventure Works" --workspace "Production" --mock --llm-enrich

Multi-provider support (via LiteLLM). Install the optional extra and pick any of 10+ providers in .fabric-ai-meta.toml:

pip install 'fabric-ai-meta[llm]'

| Provider | provider | model example | API key env var |
| --- | --- | --- | --- |
| Anthropic | anthropic | claude-sonnet-4-6 | ANTHROPIC_API_KEY |
| OpenAI | openai | gpt-4o | OPENAI_API_KEY |
| Google Gemini | google | gemini-2.5-pro | GEMINI_API_KEY |
| xAI Grok | xai | grok-4 | XAI_API_KEY |
| Mistral | mistral | mistral-large-latest | MISTRAL_API_KEY |
| Cohere | cohere | command-r-plus | COHERE_API_KEY |
| AWS Bedrock | bedrock | anthropic.claude-sonnet-4-v1:0 | AWS_* (SDK default chain) |
| Azure OpenAI | azure | <deployment-name> | AZURE_OPENAI_API_KEY |
| Google Vertex AI | vertex | gemini-2.5-pro | GOOGLE_APPLICATION_CREDENTIALS |
| OpenAI-compatible | openai-compatible | <any> (set base_url) | OPENAI_COMPATIBLE_API_KEY |

The openai-compatible provider routes through any OpenAI-API-compatible host: Groq, Together, Fireworks, Ollama, LM Studio, vLLM, or a custom endpoint. Set base_url in config.

Example: OpenAI
[llm]
provider = "openai"
model = "gpt-4o"
api_key_env = "OPENAI_API_KEY"

Example: Azure OpenAI
[llm]
provider = "azure"
model = "gpt-4o-deployment"
api_key_env = "AZURE_OPENAI_API_KEY"
azure_endpoint = "https://my-resource.openai.azure.com"
azure_api_version = "2024-02-15-preview"

Example: Local Ollama (openai-compatible)
[llm]
provider = "openai-compatible"
model = "llama3.1"
base_url = "http://localhost:11434"
api_key_env = "OPENAI_COMPATIBLE_API_KEY"  # any non-empty value works

What LLM enrichment adds:

  • Missing table/column/measure descriptions (batch-generated, cached)
  • Natural-language grain detection for fact tables
  • AI Instructions text for Prep for AI configs

Cost controls (configure in .fabric-ai-meta.toml):

[llm]
max_cost_per_run = 0.20   # raises CostLimitExceededError if exceeded
cache_enabled = true       # SHA-256 keyed file cache, no TTL
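
The two settings above describe a familiar pattern: hash the request to get a cache key, reuse the stored answer when one exists, and stop once accumulated cost passes a limit. The sketch below is a generic illustration of that pattern. The CachedLLM class, the cache file layout, and the choice to check the limit after each call are assumptions, and CostLimitExceededError is a local stand-in for the error named in the config comment.

```python
import hashlib
import json
import tempfile
from pathlib import Path


class CostLimitExceededError(RuntimeError):
    """Local stand-in for the error named in the config comment."""


class CachedLLM:
    """Wrap a call_llm(prompt) -> (text, cost) callable with a file cache and a cost cap."""

    def __init__(self, cache_dir, call_llm, max_cost_per_run: float):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.call_llm = call_llm
        self.max_cost_per_run = max_cost_per_run
        self.spent = 0.0

    def complete(self, model: str, prompt: str) -> str:
        # SHA-256 of model plus prompt is the cache key; entries never expire.
        key = hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()
        path = self.cache_dir / f"{key}.json"
        if path.exists():
            return json.loads(path.read_text(encoding="utf-8"))["text"]
        text, cost = self.call_llm(prompt)
        self.spent += cost
        # The limit is checked after the call, so the run stops on the call that crosses it.
        if self.spent > self.max_cost_per_run:
            raise CostLimitExceededError(f"spent {self.spent:.2f}, limit {self.max_cost_per_run:.2f}")
        path.write_text(json.dumps({"text": text}), encoding="utf-8")
        return text


calls = []


def fake_llm(prompt):
    """Stand-in provider that records calls and charges a fixed cost."""
    calls.append(prompt)
    return f"description for {prompt}", 0.15


with tempfile.TemporaryDirectory() as tmp:
    llm = CachedLLM(tmp, fake_llm, max_cost_per_run=0.20)
    first = llm.complete("demo-model", "Sales table")
    second = llm.complete("demo-model", "Sales table")  # served from cache, no cost
    try:
        llm.complete("demo-model", "Customer table")  # second paid call crosses the limit
        limit_hit = False
    except CostLimitExceededError:
        limit_hit = True

print(first == second, len(calls), limit_hit)
```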

Library API

All public functions are importable directly from the top-level package:

from fabric_ai_meta import (
    MockExtractor,
    SemanticModelMeta,
    score_model,
    generate_ai_ready_schema,
    to_openai_function,
    to_langchain_tool_definition,
    classify_table_heuristic,
)

# Load a model from fixture
extractor = MockExtractor(fixture_path="tests/fixtures/adventure_works.json")
model = extractor.extract("Adventure Works")

# Score it
score, breakdown = score_model(model)

# Generate exports
schema = generate_ai_ready_schema(model)
openai_fn = to_openai_function(model)

See fabric_ai_meta.__all__ for the full list of 35 public exports.


Plugins

Custom exporters install as ordinary Python packages and appear as fabric-ai-meta export <name> with the same --workspace and --mock flags as built-ins. No fork of fabric-ai-meta is required.

from fabric_ai_meta import BaseExporter, SemanticModelMeta


class DbtExporter(BaseExporter):
    name = "dbt"
    output_filename = "dbt-sources.yml"
    description = "dbt sources definition"

    def generate(self, model: SemanticModelMeta) -> dict:
        return {"version": 2, "sources": [...]}

Register the class via the fabric_ai_meta.exporters entry-point group in the plugin's pyproject.toml:

[project.entry-points."fabric_ai_meta.exporters"]
dbt = "my_fabric_dbt_plugin:DbtExporter"

Full walk-through with a worked dbt example, local testing, and name-conflict rules: docs/plugin-development.md.


Development

# Install dev dependencies
pip install -e ".[dev]"

# Run full test suite (400 tests, no Fabric runtime or real LLM calls required)
pytest tests/ -x -q

# Run with coverage
pytest tests/ --cov=fabric_ai_meta

All tests run locally. Fabric-dependent code is mocked via MockExtractor and fixture JSON files in tests/fixtures/.

Fabric Notebooks

| Notebook | Purpose |
| --- | --- |
| notebooks/quickstart.ipynb | End-to-end walkthrough: authentication, model listing, analysis, export, governance |
| notebooks/tmdl-spike.ipynb | Research spike: inspect getDefinition TMDL output for AI Instructions and Verified Answers (companion to docs/research/tmdl-prep-for-ai-spike.md) |

CI/CD Integration

Wire fabric-ai-meta into GitHub Actions or Azure DevOps to enforce governance thresholds on every PR and track readiness trends on a schedule. The guide in docs/ci-cd-guide.md includes ready-to-paste workflow files and the standalone scripts/ci-governance-check.py threshold script.

Output Schemas

JSON Schema files for all output formats are in schemas/:

| Schema | Validates |
| --- | --- |
| schemas/v1.json | ai-ready-schema.json output |
| schemas/workspace-summary/v1.json | workspace-summary.json output |
| schemas/governance-report/v1.json | governance-report.json output |
| schemas/prep-for-ai/v1.json | prep-for-ai-config.json output |

Environment Variables

| Variable | Required | Description |
| --- | --- | --- |
| ANTHROPIC_API_KEY | For --llm-enrich with default anthropic provider | LLM API key |
| OPENAI_API_KEY / GEMINI_API_KEY / XAI_API_KEY / MISTRAL_API_KEY / COHERE_API_KEY / AZURE_OPENAI_API_KEY / OPENAI_COMPATIBLE_API_KEY | For the matching provider in [llm] config | See provider matrix above |
| AWS_* (Bedrock), GOOGLE_APPLICATION_CREDENTIALS (Vertex) | When using Bedrock or Vertex | Standard SDK credential chains |
| FABRIC_NOTEBOOK_ID | Auto-set in Fabric | Signals Fabric notebook runtime |
