UDI Agent: LLM-powered data visualization orchestration library

UDIAgent

LLM-powered data visualization orchestration library for the Universal Discovery Interface (UDI).

UDIAgent orchestrates LLM calls to generate data visualization specs from natural language queries. It can be used as a standalone Python library or deployed as a FastAPI microservice.

Installation

# Core library only
pip install udiagent

# With the reference FastAPI server
pip install udiagent[server]

# With LangFuse observability
pip install udiagent[langfuse]

# With benchmarking tools
pip install udiagent[benchmark]

# Everything
pip install udiagent[all]

For local development with uv:

uv sync --extra server --extra langfuse --extra test   # server + dev

Library Usage

from udiagent import UDIAgent, Orchestrator

# Initialize the agent with explicit configuration (no environment variables)
agent = UDIAgent(
    gpt_model_name="gpt-5.4",
    openai_api_key="sk-...",
)

# Create an orchestrator
orch = Orchestrator(agent)

# Run a query
result = orch.run(
    messages=[{"role": "user", "content": "Show me a bar chart of donors by sex"}],
    data_schema='{"resources": [...]}',
    data_domains='[{"entity": "donors", "field": "sex", ...}]',
)

# result.tool_calls — list of tool call dicts (e.g. RenderVisualization, FilterData)
# result.orchestrator_choice — "render-visualization", "both", "explain", etc.
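
The two result fields noted above can be inspected along these lines. This is a minimal sketch that uses a stand-in dataclass (`FakeOrchestratorResult`, hypothetical) so it runs without the package or an API key. The `name` and `arguments` keys of each tool call dict are assumptions, not a documented shape.

```python
from dataclasses import dataclass, field


@dataclass
class FakeOrchestratorResult:
    """Hypothetical stand-in exposing the two documented result fields."""
    tool_calls: list = field(default_factory=list)
    orchestrator_choice: str = ""


def summarize(result) -> list[str]:
    """Return the routing choice followed by one readable line per tool call."""
    lines = [f"choice: {result.orchestrator_choice}"]
    for call in result.tool_calls:
        # ASSUMPTION: each tool call dict carries "name" and "arguments" keys.
        name = call.get("name", "<unknown>")
        arguments = call.get("arguments", {})
        lines.append(f"{name}: {sorted(arguments)}")
    return lines


demo = FakeOrchestratorResult(
    tool_calls=[{"name": "FilterData", "arguments": {"entity": "donors", "field": "sex"}}],
    orchestrator_choice="both",
)
print("\n".join(summarize(demo)))
```

With the real library, the object returned by `orch.run(...)` would take the place of `demo`, provided its tool call dicts match the assumed keys.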

Key Classes

Class               Description
UDIAgent            OpenAI client wrapper
Orchestrator        Routes user requests to visualization, filter, explanation, and clarification handlers
OrchestratorResult  Dataclass with tool_calls and orchestrator_choice

Utility Functions

Function                  Description
load_grammar()            Load the UDI Grammar JSON schema (bundled with the package)
load_skills()             Load skill prompt templates (bundled with the package)
render_template()         Substitute {{key}} placeholders in a skill instruction template
generate_vis_spec()       Generate a visualization spec using the skills pipeline
simplify_data_domains()   Simplify data domains JSON into compact LLM-friendly text
parse_schema_from_dict()  Parse a data schema dict into structured format
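
To illustrate the {{key}} substitution that render_template() is described as performing, here is a minimal sketch. It is not the library's implementation: the function name `render_template_sketch`, the tolerance for whitespace inside the braces, and the choice to leave unknown placeholders untouched are all assumptions.

```python
import re

# Matches {{key}} and, as an assumption, also {{ key }} with inner whitespace.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_template_sketch(template: str, values: dict) -> str:
    """Replace each {{key}} with str(values[key]); keep unknown keys as-is."""
    def substitute(match: re.Match) -> str:
        key = match.group(1)
        return str(values[key]) if key in values else match.group(0)

    return _PLACEHOLDER.sub(substitute, template)


prompt = render_template_sketch(
    "Schema: {{schema}}\nRequest: {{request}}\nNotes: {{notes}}",
    {"schema": "donors(sex, age)", "request": "bar chart of donors by sex"},
)
print(prompt)
```

Leaving unmatched placeholders visible makes a missing value easy to spot in the rendered prompt. Raising an error instead would be an equally reasonable design.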

Server Usage

The udiagent.server subpackage provides a reference FastAPI application that wraps the library as a configurable microservice. It reads configuration from environment variables.

Running the Server

# Development
uv run fastapi dev src/udiagent/server/app.py --port 8007

# Production
uv run fastapi run src/udiagent/server/app.py --port 8007

Server Environment Variables

Variable             Required  Default  Description
OPENAI_API_KEY       No        -        OpenAI API key. If not set, a key must be provided per request via the X-OpenAI-Key header.
GPT_MODEL_NAME       No        gpt-5.4  OpenAI model for orchestration
JWT_SECRET_KEY       Yes*      -        JWT signing key (*not required if INSECURE_DEV_MODE=1)
JWT_ALGORITHM        No        HS256    JWT algorithm
INSECURE_DEV_MODE    No        0        Set to 1 to skip JWT verification (development only)
LANGFUSE_SECRET_KEY  No        -        LangFuse observability secret key
LANGFUSE_PUBLIC_KEY  No        -        LangFuse observability public key
LANGFUSE_BASE_URL    No        -        LangFuse instance URL
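
The defaults and the JWT rule in the table can be sketched as a small settings loader. This is an illustration under stated assumptions, not the server's actual startup code: `ServerSettings` and `load_settings` are hypothetical names, and raising `RuntimeError` on a missing secret is an assumed behavior.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServerSettings:
    """Hypothetical container for the variables listed in the table."""
    openai_api_key: Optional[str]
    gpt_model_name: str
    jwt_secret_key: Optional[str]
    jwt_algorithm: str
    insecure_dev_mode: bool


def load_settings(env: Mapping[str, str]) -> ServerSettings:
    """Apply the documented defaults and the JWT_SECRET_KEY requirement."""
    insecure = env.get("INSECURE_DEV_MODE", "0") == "1"
    jwt_secret = env.get("JWT_SECRET_KEY") or None
    if not insecure and jwt_secret is None:
        # ASSUMPTION: failing fast at startup; the real server may differ.
        raise RuntimeError("JWT_SECRET_KEY is required unless INSECURE_DEV_MODE=1")
    return ServerSettings(
        openai_api_key=env.get("OPENAI_API_KEY") or None,
        gpt_model_name=env.get("GPT_MODEL_NAME", "gpt-5.4"),
        jwt_secret_key=jwt_secret,
        jwt_algorithm=env.get("JWT_ALGORITHM", "HS256"),
        insecure_dev_mode=insecure,
    )


# In a real process, os.environ would be passed instead of a literal dict.
settings = load_settings({"INSECURE_DEV_MODE": "1"})
print(settings.gpt_model_name, settings.jwt_algorithm)
```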

Server Endpoints

Endpoint                      Method  Description
/                             GET     API status and info
/v1/yac/completions           POST    Main orchestrator: routes user requests to tools
/v1/yac/benchmark             POST    Benchmark variant with optional orchestrator override
/v1/yac/examples              GET     Example prompts from data/example_prompts.json
/v1/yac/structured_functions  GET     Structured function registry
/v1/yac/benchmark_analysis    GET     Latest benchmark analysis results
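
A client call to the main endpoint might be prepared as follows. This sketch only builds the request and does not send it. The request body schema is not documented here, so the payload is passed in by the caller. The `Authorization: Bearer` header for the JWT is an assumption, and `build_completion_request` is a hypothetical helper.

```python
import json
import urllib.request
from typing import Optional


def build_completion_request(
    base_url: str,
    payload: dict,
    openai_key: Optional[str] = None,
    jwt_token: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request to /v1/yac/completions."""
    headers = {"Content-Type": "application/json"}
    if openai_key:
        # Per-request key, as described for OPENAI_API_KEY in the table above.
        headers["X-OpenAI-Key"] = openai_key
    if jwt_token:
        # ASSUMPTION: the server expects a standard bearer token.
        headers["Authorization"] = f"Bearer {jwt_token}"
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/yac/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# ASSUMPTION: a "messages" field mirroring the library's run() argument.
req = build_completion_request(
    "http://localhost:8007",
    {"messages": [{"role": "user", "content": "Show me a bar chart of donors by sex"}]},
    openai_key="sk-...",
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it to a running server.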

Docker

docker build -t udiagent .
docker run -p 80:80 --env-file .env udiagent

Architecture

Orchestration Flow

User query
  → Orchestrator.run()
    → GPT with ORCHESTRATOR_TOOLS (5 tools: CreateVisualization, FilterData,
      FreeTextExplain, ClarifyVariable, Rebuff)
    → Dispatch each tool call to its handler
    → Return OrchestratorResult(tool_calls, orchestrator_choice)
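
The dispatch step in the flow above can be sketched as a name-to-handler lookup. The tool names are taken from the flow, while the handler bodies, the `dispatch` function, and the `name`/`arguments` shape of a tool call are assumptions for illustration only.

```python
from typing import Any, Callable

ToolCall = dict[str, Any]


def make_stub_handler(label: str) -> Callable[[dict], dict]:
    """Build a placeholder handler that just echoes what it received."""
    def handler(arguments: dict) -> dict:
        return {"handled_by": label, "arguments": arguments}
    return handler


# Tool names come from the flow above; the handlers themselves are stand-ins.
HANDLERS: dict[str, Callable[[dict], dict]] = {
    name: make_stub_handler(name)
    for name in (
        "CreateVisualization",
        "FilterData",
        "FreeTextExplain",
        "ClarifyVariable",
        "Rebuff",
    )
}


def dispatch(tool_calls: list[ToolCall]) -> list[dict]:
    """Send each tool call to its handler, preserving call order."""
    results = []
    for call in tool_calls:
        # ASSUMPTION: each call is a dict with "name" and optional "arguments".
        handler = HANDLERS.get(call["name"])
        if handler is None:
            raise ValueError(f"no handler registered for tool {call['name']!r}")
        results.append(handler(call.get("arguments", {})))
    return results


outputs = dispatch([
    {"name": "FilterData", "arguments": {"field": "sex"}},
    {"name": "CreateVisualization"},
])
print([o["handled_by"] for o in outputs])
```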

Visualization Generation

Visualization specs are generated by generate_vis_spec (vis_generate.py), which executes a two-step plan defined as markdown skills:

  1. generate — LLM produces a UDI Grammar spec from the request, schema, and few-shot examples
  2. validate — JSON schema check with a bounded repair-retry loop
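
The two steps above amount to a generate-then-validate loop with a retry cap. The sketch below shows that pattern with a stubbed model and a toy validator. `generate_with_repair`, the feedback mechanism, and the required keys are assumptions, not the behavior of generate_vis_spec or the real UDI Grammar schema.

```python
import json
from typing import Callable, Optional


def generate_with_repair(
    generate: Callable[[Optional[str]], str],
    validate: Callable[[dict], Optional[str]],
    max_attempts: int = 3,
) -> tuple[dict, int]:
    """Call generate until validate accepts the spec, at most max_attempts times.

    generate receives the previous error message (or None on the first try);
    validate returns None when the spec is acceptable, else an error string.
    """
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        raw = generate(feedback)
        try:
            spec = json.loads(raw)
        except json.JSONDecodeError as exc:
            feedback = f"invalid JSON: {exc.msg}"
            continue
        if not isinstance(spec, dict):
            feedback = "top-level JSON value must be an object"
            continue
        feedback = validate(spec)
        if feedback is None:
            return spec, attempt
    raise RuntimeError(f"no valid spec after {max_attempts} attempts: {feedback}")


# Stubbed model: the first reply is incomplete, the second one is repaired.
_replies = iter(['{"source": {}}', '{"source": {}, "representation": {}}'])


def fake_llm(feedback: Optional[str]) -> str:
    return next(_replies)


def check_required_keys(spec: dict) -> Optional[str]:
    # ASSUMPTION: illustrative keys only, not the real grammar schema.
    missing = [key for key in ("source", "representation") if key not in spec]
    return f"missing keys: {missing}" if missing else None


spec, attempts = generate_with_repair(fake_llm, check_required_keys)
print(attempts, sorted(spec))
```

Bounding the loop keeps a persistently invalid response from consuming unlimited model calls.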

Skills live in src/udiagent/data/skills/*.md (YAML frontmatter + prompt body).
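
A skill file of this shape can be split into metadata and prompt body as sketched below. The parser handles only flat `key: value` lines and is not a full YAML parser. The frontmatter keys in the sample (`name`, `description`) and the function name `split_frontmatter` are hypothetical.

```python
def split_frontmatter(text: str) -> tuple[dict, str]:
    """Split '---' delimited frontmatter from the body of a markdown file."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    # Find the closing '---' delimiter after the opening one.
    closing = next(
        (i for i in range(1, len(lines)) if lines[i].strip() == "---"), None
    )
    if closing is None:
        return {}, text
    meta = {}
    for line in lines[1:closing]:
        # Flat "key: value" pairs only; nested YAML is out of scope here.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[closing + 1:]).strip()
    return meta, body


SAMPLE_SKILL = """---
name: generate
description: Produce a UDI Grammar spec
---
Request: {{request}}
"""

meta, body = split_frontmatter(SAMPLE_SKILL)
print(meta, body)
```

For anything beyond flat keys, a real YAML parser would be the safer choice.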

Design Principles

  • Stateless — All context travels in message history; no server-side session state
  • Skills as Markdown — Prompt templates live in .md files with YAML frontmatter
  • Per-request key override — Supports both default and per-request OpenAI API keys

Regenerating Template Visualizations and Tool Definitions

The visualization pipeline uses two generated artifacts:

  • src/udiagent/data/skills/template_visualizations.json — template visualization specs
  • src/udiagent/generated_vis_tools.py — typed OpenAI function-calling tool definitions

To regenerate both in one step:

uv pip install -e ".[codegen]"
uv run python scripts/regenerate_vis_tools.py

By default this uses data/data_domains/hubmap_data_schema.json as the schema. To use a different schema:

uv run python scripts/regenerate_vis_tools.py --schema data/data_domains/SenNet_domains.json

Benchmarking

Step 0: Start the API server

uv run fastapi dev src/udiagent/server/app.py --port 8007 &

Step 1: Run tiny benchmark (1 example)

uv run python -m udiagent.benchmark.runner --no-orchestrator --path ./data/benchmark_dqvis/tiny.jsonl

Step 2: Run small benchmark (100 examples)

uv run python -m udiagent.benchmark.runner --no-orchestrator --path ./data/benchmark_dqvis/small.jsonl --workers 5

Resume a failed run:

uv run python -m udiagent.benchmark.runner --path ./data/benchmark_dqvis/small.jsonl --workers 5 --resume ./out/<TIMESTAMP>/benchmark_results.json

License

MIT
