OpenTelemetry GenAI spans → queryable Neo4j graph. Mixed-vendor cost attribution, agent delegation, tool blast radius — one command, zero Cypher.


otel-genai-graph

Turn your OpenTelemetry GenAI traces into a graph you can actually ask questions of. Cost per model, agent delegation, tool blast radius, session structure — one command each, zero Cypher, mixed vendors, real tokens.

Demo — running the cost_by_model saved query

Multi-agent delegation, rendered from the multi_agent fixture

See it in one command

Once you've loaded the bundled fixtures plus any live captures you've made, this is cross-vendor cost attribution in a single command:

$ python tools/render_graph.py --from-neo4j --query cost_by_model --format table

provider   model                   calls  input_tokens  output_tokens
---------  ----------------------  -----  ------------  -------------
anthropic  claude-sonnet-4-5       7      1760          830
openai     qwen2.5:7b              7      541           75
anthropic  claude-opus-4-7         1      500           300
openai     text-embedding-3-small  1      20            0
openai     nomic-embed-text        1      7             0

A table that mixes Anthropic (synthetic fixtures), openai/qwen2.5:7b (real local Ollama), and openai/nomic-embed-text (real embeddings) is the point: every provider, classic or cutting-edge, ends up on the same graph coordinate system.

What it does

  • Ingests OTLP/JSON resourceSpans from any GenAI-instrumented app — OpenAI, Anthropic, Gemini, Azure OpenAI (classic / v1 / Foundry unified), Ollama, Groq, and anything OpenAI-SDK-compatible.
  • Maps to a typed graph: Session → Operation → Model / Tool / DataSource, Agent -[INVOKED]-> Operation, Agent -[DELEGATED_TO]-> Agent.
  • Writes idempotently to Neo4j via MERGE on natural keys. Re-ingesting the same trace is a no-op.
  • Streams live or loads files — ships a SpanExporter (plug into any TracerProvider) and a file loader (python -m otel_genai_graph.load trace.json …).
  • Explores without Cypher — a library of 10 saved queries (cost_by_model, session_tree, agent_delegation, failed_tools, …) exports to interactive HTML (cytoscape.js, single file, no install), node-link JSON, GraphML, or CSV/ASCII tables.
  • Speaks legacy and canonical v1.37: gen_ai.system → gen_ai.provider.name, generate_content → chat, plus a priority-ordered conversation.id fallback list (session.id, langsmith.trace.session_id, traceloop.association.properties.*). Canonical emitters always win; everything else gets pulled into the same graph.
  • Enforces 7 shape-independent invariants on every graph — edge endpoint types, Session uniqueness, DAG property of parent / delegation edges, no orphan Models / Tools / DataSources, token-count and time-ordering sanity.
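
The legacy-compat canonicalisation above can be sketched in a few lines. This is an illustration of the rules as the README states them, not the library's actual API: the function name `canonicalize` and the exact table contents are assumptions.

```python
# Sketch of the legacy-to-canonical attribute mapping described above.
LEGACY_ATTRS = {"gen_ai.system": "gen_ai.provider.name"}
LEGACY_OPS = {"generate_content": "chat"}
# priority-ordered keys for resolving the conversation id
CONVERSATION_ID_KEYS = [
    "gen_ai.conversation.id",        # canonical v1.37 key: always wins
    "session.id",
    "langsmith.trace.session_id",
    # the real mapper also checks the traceloop.association.properties.* family
]

def canonicalize(attrs: dict) -> dict:
    out = dict(attrs)
    # legacy attribute names, unless the canonical key is already present
    for old, new in LEGACY_ATTRS.items():
        if old in out and new not in out:
            out[new] = out.pop(old)
    # legacy operation names (e.g. the pre-v1.37 Gemini shape)
    op = out.get("gen_ai.operation.name")
    if op in LEGACY_OPS:
        out["gen_ai.operation.name"] = LEGACY_OPS[op]
    # first matching conversation-id key wins
    for key in CONVERSATION_ID_KEYS:
        if key in out:
            out["gen_ai.conversation.id"] = out[key]
            break
    return out
```

For example, the pre-v1.37 Google shape (`gen_ai.system="gemini"`, `operation.name="generate_content"`) comes out with the canonical provider name and `chat` operation.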

60-second quickstart

git clone https://github.com/kums1234/otel_genai_graph_exporter.git && cd otel_genai_graph_exporter
python3 -m venv .venv && . .venv/bin/activate
pip install -e ".[dev]"
pytest                                    # 833 unit/integration tests

# optional: live Neo4j + load the six sample fixtures
docker run -d --name otel-neo4j \
    -p 17474:7474 -p 17687:7687 \
    -e NEO4J_AUTH=neo4j/testtest neo4j:5

# configure connection + API keys once — shell env vars always win
cp .env.example .env          # then uncomment / fill in what you need

python -m otel_genai_graph.load tests/fixtures/*.json
# explore — no Cypher required:
python tools/render_graph.py --from-neo4j --query overview \
    --output /tmp/overview --format html && open /tmp/overview.html

Configuration

Every CLI (loader, tools/render_graph.py, capture scripts) reads its settings from environment variables. Copy .env.example to .env at the project root, uncomment the blocks you need, and they get auto-loaded.

  • Works from any working directory — the loader walks up to find .env.
  • Shell env always wins (override=False). CI / container env vars can't be shadowed by a committed or stale .env.
  • Soft dep: python-dotenv is in dependencies; if it's ever missing (minimal container), load_env() silently no-ops and you fall back to explicit exports.
  • .env is gitignored (.env, .env.local); only .env.example is committed.

Minimum viable .env for the quickstart:

NEO4J_URI=bolt://localhost:17687
NEO4J_USER=neo4j
NEO4J_PASSWORD=testtest

Explore without Cypher

Most users don't want to write Cypher. tools/render_graph.py bundles a curated library of the questions people actually ask, and exports the answer as an interactive HTML viewer, a structured JSON/GraphML file, or a plain ASCII table.

Discover

python tools/render_graph.py --list-queries
python tools/render_graph.py --list-queries --tag cost
python tools/render_graph.py --describe-query session_tree

The v0.1 library (10 queries, full list in docs/saved-queries.md):

| Question | Query name | Output type |
|----------|------------|-------------|
| Which agents delegated to which? | agent_delegation | graph |
| What did I load? | overview | graph |
| Full hierarchy of one conversation | session_tree | graph |
| Which tools failed, and what did they touch? | failed_tools | graph |
| Agent ↔ DataSource access map | data_source_usage | graph |
| Token spend per (provider, model) | cost_by_model | table |
| Token spend per session | cost_by_session | table |
| Token spend per agent (incl. delegated sub-agents) | cost_by_agent | table |
| Tool call counts, ranked | tool_usage | table |
| Calls by vendor | provider_distribution | table |

Sample: "where is my token spend going?"

python tools/render_graph.py --from-neo4j --query cost_by_model --format table

With the bundled fixtures + a bit of real Ollama traffic loaded:

provider   model                   calls  input_tokens  output_tokens
---------  ----------------------  -----  ------------  -------------
anthropic  claude-sonnet-4-5       7      1760          830
openai     qwen2.5:7b              7      541           75
anthropic  claude-opus-4-7         1      500           300
openai     text-embedding-3-small  1      20            0
openai     nomic-embed-text        1      7             0

Pipe it into spreadsheets, BI tools, Slack, or a Grafana panel:

python tools/render_graph.py --from-neo4j --query cost_by_model --format csv    > cost.csv
python tools/render_graph.py --from-neo4j --query cost_by_model --format jsonl  | jq '.'
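
The jsonl output is one row object per line with the same fields as the table, which makes downstream dollar math trivial. A minimal sketch, assuming those field names; the per-million-token prices here are placeholder numbers, not real pricing:

```python
import json

# (input, output) USD per 1M tokens -- illustrative placeholder prices only
PRICE_PER_MTOK = {
    ("anthropic", "claude-sonnet-4-5"): (3.00, 15.00),
    ("openai", "qwen2.5:7b"): (0.00, 0.00),  # local Ollama: free
}

def dollar_cost(jsonl_text: str) -> float:
    """Sum estimated spend across cost_by_model jsonl rows."""
    total = 0.0
    for line in jsonl_text.splitlines():
        row = json.loads(line)
        p_in, p_out = PRICE_PER_MTOK.get((row["provider"], row["model"]), (0.0, 0.0))
        total += row["input_tokens"] / 1e6 * p_in
        total += row["output_tokens"] / 1e6 * p_out
    return total
```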

Sample: "multi-turn session across three traces, collapsed"

Multi-turn session merge rendered from a real Ollama capture

python tools/render_graph.py --from-neo4j \
    --query session_tree --param session_id=ollama-multi-demo \
    --output /tmp/multi --format html
open /tmp/multi.html

Three chat turns, three distinct trace_ids, one shared gen_ai.conversation.id — collapsed to a single Session node by the mapper. This is what you can't do with a linear tracer: a turn-by-turn graph that persists identity across retries, async agents, and background batches. Real capture, not a synthetic fixture.
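
The collapse itself is just MERGE-on-natural-key semantics: the Session node is keyed on the conversation id, so spans from any number of traces land on the same node. A pure-Python sketch of that grouping, with illustrative span/field shapes:

```python
def collapse_sessions(spans: list[dict]) -> dict[str, list[str]]:
    """Map conversation id -> the trace ids that contributed to that Session."""
    sessions: dict[str, list[str]] = {}
    for span in spans:
        conv = span["attributes"].get("gen_ai.conversation.id")
        if conv is None:
            continue  # no conversation id -> no Session membership
        traces = sessions.setdefault(conv, [])
        if span["trace_id"] not in traces:
            traces.append(span["trace_id"])
    return sessions
```

Three turns with distinct trace_ids but one shared gen_ai.conversation.id produce a single entry, mirroring the single Session node in the rendered graph.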

Sample: "show me exactly what happened in that tool call"

Real tool_call flow rendered from tests/capture_real_traces.py --shape tool_call

python tools/render_graph.py --from-neo4j \
    --query session_tree --param session_id=real-1776842869-0 \
    --output /tmp/tool --format html
open /tmp/tool.html

A real qwen2.5:7b on local Ollama requested get_current_time, the tool was evaluated locally, and the model produced the final answer. The graph shows the whole flow: Agent -[INVOKED]-> invoke_agent -[PARENT_OF]-> two chat + one execute_tool, with EXECUTED and CALLED edges fanning out to the shared Model and Tool nodes. 8 nodes, 11 edges, 0 invariant violations.

Every HTML output is one self-contained file (cytoscape.js loaded from CDN, everything else inline). Email it to a non-technical stakeholder; they get pan / zoom / hover / click for node properties. No server, no Neo4j required on the receiving end.
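
"Self-contained" here means the only external fetch is the cytoscape.js script from a CDN; the graph data itself is inlined as JSON. A sketch of that shape, with an illustrative template and CDN URL rather than the tool's actual output:

```python
import json

# Illustrative single-file template: library from CDN, data inline.
TEMPLATE = """<!doctype html><html><head>
<script src="https://unpkg.com/cytoscape/dist/cytoscape.min.js"></script>
</head><body><div id="cy" style="width:100%;height:100vh"></div>
<script>
cytoscape({{ container: document.getElementById("cy"),
             elements: {elements} }});
</script></body></html>"""

def render_html(elements: list[dict]) -> str:
    """Return one self-contained HTML string for a node-link element list."""
    return TEMPLATE.format(elements=json.dumps(elements))
```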

Sample: any fixture, offline, no Neo4j

python tools/render_graph.py --fixture tests/fixtures/multi_agent.json \
    --output /tmp/multi-agent --format all
# → /tmp/multi-agent.{dot,html,json,graphml} in one shot

--format all covers the four dep-free formats. Add .svg / .png output with brew install graphviz (or apt-get install graphviz).

Output formats

| Format | Flag | Shape | When to use |
|--------|------|-------|-------------|
| html | --format html | graph | Share with humans. Interactive, single file. |
| json | --format json | graph | Feed into D3, Observable, another graph viz, or your tests. |
| graphml | --format graphml | graph | Gephi / yEd for force-directed layout + centrality metrics. |
| dot | --format dot | graph | Pipe into your own dot for PDFs / papers. |
| svg/png | --format svg | graph | Docs / READMEs. Requires graphviz installed. |
| table | --format table | table | Terminal output for aggregations. |
| csv | --format csv | table | Spreadsheets, BI tools. |
| jsonl | --format jsonl | table | Pipe into jq, stream-processors, dashboards. |

The CLI refuses to render a table query as HTML or a graph query as CSV, so you can't accidentally produce junk output.
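
That guard is a two-set lookup: each saved query declares a shape, each format accepts exactly one shape, and a mismatch fails fast. A sketch of the idea; the names here are illustrative, not the CLI's internals:

```python
GRAPH_FORMATS = {"html", "json", "graphml", "dot", "svg", "png"}
TABLE_FORMATS = {"table", "csv", "jsonl"}
# a slice of the saved-query registry, shape per query
QUERY_SHAPES = {"cost_by_model": "table", "session_tree": "graph"}

def check_format(query: str, fmt: str) -> None:
    """Exit with an error if `fmt` can't represent the query's shape."""
    shape = QUERY_SHAPES[query]
    allowed = GRAPH_FORMATS if shape == "graph" else TABLE_FORMATS
    if fmt not in allowed:
        raise SystemExit(
            f"{query} is a {shape} query; --format {fmt} is not valid for it"
        )
```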

Custom Cypher — for when the saved library isn't enough

python tools/render_graph.py --from-neo4j \
    --cypher "MATCH (a:Agent)-[:DELEGATED_TO*]->(b:Agent) RETURN a, b" \
    --output /tmp/delegations --format html

Graph schema (v0.1)

| Node | Natural key | Sourced from |
|------|-------------|--------------|
| Session | id | gen_ai.conversation.id (+ legacy fallbacks) |
| Agent | id | gen_ai.agent.id |
| Model | (provider, name) | gen_ai.provider.name + gen_ai.response.model |
| Tool | name | gen_ai.tool.name |
| DataSource | id | gen_ai.data_source.id |
| Operation | span_id | span id |

| Edge | Direction | Emitted when |
|------|-----------|--------------|
| CONTAINS | Session → Operation | Operation carries a conversation id |
| EXECUTED | Operation → Model | Op type ∈ {chat, text_completion, embeddings} |
| INVOKED | Agent → Operation | Op type ∈ {invoke_agent, create_agent} |
| CALLED | Operation → Tool | Op type = execute_tool |
| RETRIEVED_FROM | Operation → DataSource | Op carries gen_ai.data_source.id |
| PARENT_OF | Operation → Operation | span has parent_span_id |
| DELEGATED_TO | Agent → Agent | child invoke_agent under a different parent Agent |
| ACCESSED | Agent → DataSource | agent-owned Op retrieves from a data source |
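
The "emitted when" column can be read as a decision procedure per Operation. A sketch of those rules for the edges keyed off a single span; the function and span shape are illustrative, not the mapper's real code:

```python
CHAT_LIKE = {"chat", "text_completion", "embeddings"}
AGENT_OPS = {"invoke_agent", "create_agent"}

def edges_for(op: dict) -> list[str]:
    """List the edge types one Operation would trigger, per the schema table."""
    attrs, out = op["attributes"], []
    if attrs.get("gen_ai.conversation.id"):
        out.append("CONTAINS")          # Session -> Operation
    if op["type"] in CHAT_LIKE:
        out.append("EXECUTED")          # Operation -> Model
    if op["type"] in AGENT_OPS:
        out.append("INVOKED")           # Agent -> Operation
    if op["type"] == "execute_tool":
        out.append("CALLED")            # Operation -> Tool
    if attrs.get("gen_ai.data_source.id"):
        out.append("RETRIEVED_FROM")    # Operation -> DataSource
    if op.get("parent_span_id"):
        out.append("PARENT_OF")         # parent Operation -> this one
    return out
```

(DELEGATED_TO and ACCESSED need more than one span's worth of context, so they are omitted here.)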

See docs/schema.md for the full reference, extension seams, and open questions for v0.2.

Raw Cypher (power users)

If the saved-query library doesn't cover your question, the Neo4j browser (http://localhost:17474) is always available. A handful of useful queries for reference, each also available as a saved query:

// agent delegation graph across all loaded traces
// (saved query: agent_delegation)
MATCH p = (a:Agent)-[:DELEGATED_TO*]->(b:Agent) RETURN p

// cost / token attribution per (provider, model)
// (saved query: cost_by_model)
MATCH (o:Operation)-[:EXECUTED]->(m:Model)
RETURN m.provider, m.name,
       count(*)                  AS calls,
       sum(o.input_tokens)       AS in_tok,
       sum(o.output_tokens)      AS out_tok
ORDER BY calls DESC

// blast radius: which agents/sessions did a failing tool touch?
// (saved query: failed_tools — graph shape)
MATCH (t:Tool)<-[:CALLED]-(op:Operation {status:"ERROR"})
      <-[:PARENT_OF*]-(ancestor:Operation)
      <-[:CONTAINS]-(s:Session)
RETURN t, s, collect(DISTINCT ancestor) AS affected_ops

To run one-off ad-hoc Cypher through the export pipeline (HTML/JSON/etc.):

python tools/render_graph.py --from-neo4j \
    --cypher "MATCH (o:Operation {status:'ERROR'}) RETURN o" \
    --output /tmp/errors --format html

Validation pipeline

Four independent data sources catch different bug classes:

  1. Hand-written fixtures, 6 canonical cases in tests/fixtures/*.json. Each ships an expected_graph block with hand-counted node/edge totals; the mapper has to match exactly.
  2. Parametric synthesizer, tests/generate_traces.py emits thousands of deterministic OTLP traces across five shapes (simple, agent_tool, multi_agent, rag, multi_turn) with a chaos mode that introduces dropped attributes, orphaned children, reordered spans, and corrupted trace_ids. The mapper must stay correct on clean traces and must not crash on chaos.
  3. Real-SDK capture, tests/capture_real_traces.py calls actual LLMs (Anthropic, OpenAI, Gemini, Azure OpenAI classic / v1 / Foundry, Ollama) with a BudgetGuard that refuses to run past a cap. Five shapes: chat, agent_tool, embeddings, multi_turn, tool_call (with real tool execution).
  4. Upstream-instrumentor capture, tests/capture_with_instrumentor.py runs spans through Python Contrib's own OpenAIInstrumentor / GoogleGenAiSdkInstrumentor and dumps what they emit. This is how we found that the Google instrumentor emits the pre-v1.37 gen_ai.system="gemini" + operation.name="generate_content" shape, now handled by the mapper's legacy-compat table.

All four feed into a shape-independent invariant test suite that checks 7 structural properties across every captured graph. See docs/mapping.md for the full contract and tests/test_invariants.py for the enforcement.
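
To make the invariant idea concrete, here is a sketch of two of the seven checks named above, the DAG property of PARENT_OF edges and "no orphan Model nodes", run over simple edge lists. The real suite lives in tests/test_invariants.py; this version is illustrative:

```python
def parent_of_is_dag(edges: list[tuple[str, str]]) -> bool:
    """Depth-first cycle check over PARENT_OF (parent, child) pairs."""
    children: dict[str, list[str]] = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
    state: dict[str, int] = {}  # missing = unseen, 1 = in-stack, 2 = done

    def visit(node: str) -> bool:
        if state.get(node) == 1:
            return False  # back-edge: cycle found
        if state.get(node) == 2:
            return True
        state[node] = 1
        ok = all(visit(c) for c in children.get(node, []))
        state[node] = 2
        return ok

    return all(visit(n) for n in children)

def no_orphan_models(models: set[str], executed: list[tuple[str, str]]) -> bool:
    """Every Model node must be the target of at least one EXECUTED edge."""
    return models <= {model for _, model in executed}
```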

Supported providers

Out of the box, with the bundled capture scripts:

| Provider | CLI --provider value | Notes |
|----------|----------------------|-------|
| Anthropic | anthropic | ANTHROPIC_API_KEY |
| OpenAI | openai | OPENAI_API_KEY (+ OPENAI_BASE_URL for Ollama/Groq) |
| Azure OpenAI | azure_openai | classic /openai/deployments/… path |
| Azure OpenAI v1 | azure_openai_v1 | new /openai/v1/ path (Foundry portal) |
| Azure AI Inference | azure_inference | Foundry unified inference endpoint |
| Google AI Studio | google | GEMINI_API_KEY; free tier works |
| Ollama | openai (+ OPENAI_BASE_URL=http://localhost:11434/v1/) | local, unlimited, $0 |

Free paths: Google AI Studio (no credit card, 1,500 requests/day on Flash), Ollama (local), Groq free tier.

Streaming export

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from otel_genai_graph.exporter import Neo4jGenAIExporter
from otel_genai_graph.neo4j_sink import Neo4jSink

sink = Neo4jSink("bolt://localhost:17687", "neo4j", "testtest")
sink.connect()
sink.ensure_schema()

provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(Neo4jGenAIExporter(sink)))
trace.set_tracer_provider(provider)
# any instrumented code below this line streams into the graph

See tests/live_export_demo.py for a runnable walkthrough.

Status

v0.1, reference implementation, 833 tests passing. All components live: schema, mapper (with legacy-compat canonicalisation), Neo4j sink (MERGE-based, idempotent), SpanExporter, file loader, cost table, synthesizer, real-API capture, upstream-instrumentor capture, invariants, saved-query library, HTML / JSON / GraphML / CSV / ASCII-table export.

Known open questions, tracked in docs/schema.md:

  • Resource nodes aren't currently emitted (schema has the label; mapper skips it to keep v0.1 simple).
  • Streaming spans (partial output): we take the final span only.
  • Cost attribution lives as Operation properties, not a dedicated Budget node, deferred until a concrete UX asks for it.

Contributing

See CONTRIBUTING.md for the dev loop, adding fixtures, adding provider adapters, and the invariant contract.

Contributors

Maintained by @kums1234. Any non-trivial contribution — code, docs, bug report, test fixture, real-trace capture, provider adapter, design feedback — earns a row in CONTRIBUTORS.md. Instructions for getting listed are in that file.

License

Apache-2.0.
