A Model Context Protocol (MCP) server for Grafana Tempo distributed tracing and observability.
Tempo MCP Server
An MCP server that gives AI assistants the power to search, analyze, summarize, and correlate distributed traces from Grafana Tempo — with TraceQL query construction, RED metrics analysis, cross-pillar pivots, service topology mapping, and operational diagnostics.
Why Tempo MCP Server?
The problem: Grafana Tempo is a powerful distributed tracing backend, but effective trace analysis is complex. Constructing TraceQL queries requires knowledge of attribute scopes, intrinsic fields, and structural operators. Correlating traces with metrics (RED analysis) and logs requires multi-step pivots across different APIs. Diagnosing latency spikes means navigating critical paths, identifying root causes, and finding related incidents — each a specialized skill. When AI assistants try to help, they hallucinate TraceQL syntax, miss multi-tenant requirements, or generate unbounded queries that overwhelm backends.
The solution: The Tempo MCP Server gives AI assistants (like Claude, Cline, or Cursor) structured, safe tools to interact with Grafana Tempo natively:
- Smart Trace Search: Say "Find error traces from the API service in production" and the AI auto-translates K8s-friendly filters (namespace, service, deployment) into valid TraceQL, enforces query guardrails (time ranges, limits), and returns compact summaries.
- Intelligent Trace Analysis: The AI fetches a trace, extracts the critical path, identifies error spans, detects the suspected root cause, and recommends follow-up queries, all in a single tempo_summarize_trace call.
- Metrics-First Triage: Execute RED metrics queries (rate, errors, duration) using TraceQL metrics functions like rate(), quantile_over_time(), and count_over_time(), then pivot from aggregated metrics to concrete traces via exemplars.
- Cross-Pillar Correlation: Extract trace IDs from log lines and retrieve the full trace. Pivot from metrics spikes to exemplar traces. Correlate related traces using strategies like same-service errors, same-endpoint, or temporal neighbors.
- Backend Diagnostics: Aggregate health checks, build info, component service status, and ring member health into a single curated diagnostics report with severity-ranked findings and remediation steps.
- Service Topology: Map service dependencies from Tempo's metrics-generator service graph data, with request rates and error rates per edge.
Key Features
TraceQL Search with K8s-Friendly Filters
- Raw TraceQL queries or structured K8s filters (namespace, service, deployment, cluster)
- Auto-translation of K8s concepts to OTel attributes via canonical mapping
- Query guardrails: time range enforcement, limit clamping, empty-query rejection
- Basic TraceQL validation before sending to backend
- Non-determinism awareness in result metadata
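The filter translation described above can be sketched roughly as follows. This is a minimal illustration, not the server's implementation: the `K8S_TO_OTEL` table and the `build_traceql` helper are hypothetical names, and the attribute names are the usual OpenTelemetry resource attributes, which may differ from the server's canonical mapping.

```python
# Hypothetical mapping from K8s-friendly filter names to OTel resource
# attributes. The server's canonical map may use different entries.
K8S_TO_OTEL = {
    "namespace": "resource.k8s.namespace.name",
    "service": "resource.service.name",
    "deployment": "resource.k8s.deployment.name",
    "cluster": "resource.k8s.cluster.name",
}


def build_traceql(filters: dict[str, str], errors_only: bool = False) -> str:
    """Translate structured filters into a single TraceQL spanset selector."""
    conditions = []
    for key, value in filters.items():
        if key not in K8S_TO_OTEL:
            raise ValueError(f"unsupported filter: {key}")
        # Escape backslashes and double quotes so the value stays one string literal.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        conditions.append(f'{K8S_TO_OTEL[key]} = "{escaped}"')
    if errors_only:
        conditions.append("status = error")
    if not conditions:
        # Mirrors the empty-query rejection guardrail.
        raise ValueError("at least one filter is required")
    return "{ " + " && ".join(conditions) + " }"


print(build_traceql({"namespace": "production", "service": "api"}, errors_only=True))
# { resource.k8s.namespace.name = "production" && resource.service.name = "api" && status = error }
```

Rejecting unknown filter keys, rather than passing them through, keeps a typo from silently widening the query.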
Trace Retrieval & Analysis
- Single-trace fetch with LLM-optimized format support (Tempo 2.9+ application/vnd.grafana.llm)
- Automatic fallback to standard OTLP JSON when LLM format is unavailable
- Server-side trace summarization: critical path extraction, error detection, root cause analysis, and recommended next queries
- Time gap detection: disambiguates wall-clock duration from critical path duration when async/disjointed spans inflate the trace window
- Related trace correlation with three strategies: same-service errors, same-endpoint, temporal neighbors
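To illustrate why time gap detection matters, here is a small sketch that separates wall-clock duration from the time actually covered by spans. It merges overlapping span intervals as a simple stand-in for critical path duration; the summarizer's real algorithm is not shown in this README, and `covered_duration` and `time_gap_report` are hypothetical names.

```python
def covered_duration(spans: list[tuple[int, int]]) -> int:
    """Time covered by at least one span, counting overlapping spans once."""
    total = 0
    cursor = None  # end of the interval merged so far
    for start, end in sorted(spans):
        if cursor is None or start > cursor:
            # Span starts after everything seen so far: count it in full.
            total += end - start
            cursor = end
        elif end > cursor:
            # Span overlaps the merged interval: count only the new tail.
            total += end - cursor
            cursor = end
    return total


def time_gap_report(spans: list[tuple[int, int]]) -> dict[str, int]:
    """Compare the trace window with the time in which any span was active."""
    if not spans:
        raise ValueError("trace has no spans")
    wall_clock = max(end for _, end in spans) - min(start for start, _ in spans)
    active = covered_duration(spans)
    return {"wall_clock": wall_clock, "active": active, "gap": wall_clock - active}


# (start_ms, end_ms): a 120 ms request plus an async span five seconds later.
print(time_gap_report([(0, 120), (40, 90), (5000, 5030)]))
# {'wall_clock': 5030, 'active': 150, 'gap': 4880}
```

In this example the trace window is about five seconds, but work was only in flight for 150 ms, which is the kind of discrepancy the summary is meant to surface.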
Schema Discovery
- Attribute name discovery across scopes (resource, span, intrinsic, event, link, instrumentation)
- Attribute value enumeration with time-window scoping and TraceQL filtering
- Canonical K8s-to-Tempo attribute mapping with optional live validation against a backend
TraceQL Metrics
- Range queries returning Prometheus-compatible time series (matrix format)
- Instant queries returning point-in-time metrics (vector format)
- Support for rate(), count_over_time(), avg_over_time(), max_over_time(), min_over_time(), sum_over_time(), quantile_over_time(), histogram_over_time()
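As a rough illustration of a RED analysis built on these functions, the sketch below assembles three TraceQL metrics queries for one service. The `red_queries` helper is hypothetical, and it assumes the service is identified by the `resource.service.name` attribute.

```python
def red_queries(service: str, quantile: float = 0.99) -> dict[str, str]:
    """Build rate, error-rate, and duration queries for a single service."""
    selector = f'resource.service.name = "{service}"'
    return {
        # Requests per second across all spans of the service.
        "rate": f"{{ {selector} }} | rate()",
        # Same rate, restricted to spans whose status is error.
        "errors": f"{{ {selector} && status = error }} | rate()",
        # Latency quantile over the span duration intrinsic.
        "duration": f"{{ {selector} }} | quantile_over_time(duration, {quantile})",
    }


for name, query in red_queries("checkout").items():
    print(f"{name}: {query}")
```

Each string could then be passed as the query argument of a range or instant metrics call.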
Cross-Pillar Pivots
- Metrics-to-traces: extract exemplar trace IDs from TraceQL metrics queries
- Logs-to-traces: parse trace IDs from log lines (supports trace_id=, traceId:, TraceID=, standalone 32-char hex) and retrieve full traces
Backend Discovery & Diagnostics
- Multi-backend support with per-backend health probing
- Kubernetes service discovery (label-based + Tempo Operator CRDs: TempoStack, TempoMonolithic)
- Comprehensive diagnostics: readiness, build info, component services, ring status
- Severity-ranked findings with actionable remediation steps
Service Topology
- Service dependency mapping from traces_service_graph_request_total metrics
- Request rate and error rate per service edge
- Service-focused filtering for targeted topology views
Multi-Tenancy
- Per-backend tenant header injection (X-Scope-OrgID)
- Cross-tenant queries via pipe-separated tenant IDs
- Tenant ID validation (max 150 bytes, restricted charset)
- Graceful handling of single-tenant and multi-tenant backends
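A validation step along these lines might look like the following sketch. It assumes the 150-byte limit applies to each tenant ID rather than to the whole header value, and it uses the character set given in the FAQ (alphanumerics plus `!-_.*'()`); `parse_tenants` is a hypothetical name.

```python
import re

MAX_TENANT_BYTES = 150  # assumed to apply per tenant ID
_TENANT_RE = re.compile(r"[A-Za-z0-9!\-_.*'()]+")


def parse_tenants(value: str) -> list[str]:
    """Split a pipe-separated tenant string and validate every ID."""
    tenants = value.split("|")
    for tenant in tenants:
        # fullmatch also rejects empty IDs, e.g. from "a||b".
        if not _TENANT_RE.fullmatch(tenant):
            raise ValueError(f"invalid tenant ID: {tenant!r}")
        if len(tenant.encode("utf-8")) > MAX_TENANT_BYTES:
            raise ValueError(f"tenant ID longer than {MAX_TENANT_BYTES} bytes")
    return tenants


tenants = parse_tenants("team-a|team-b")
headers = {"X-Scope-OrgID": "|".join(tenants)}
print(headers)
# {'X-Scope-OrgID': 'team-a|team-b'}
```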
Production-Ready Middleware
- Response limiting (100KB max), rate limiting (10 req/s, burst 20)
- Response caching (10s for tools, 30s for resources, 5min for listings)
- Structured logging, error handling, timing
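One common way to get a limit of 10 req/s with a burst of 20 is a token bucket, sketched below. The README does not say which algorithm the middleware uses, so treat this as an illustration of the stated numbers only; `TokenBucket` is a hypothetical class, and the injectable clock exists purely to make the behaviour easy to test.

```python
import time


class TokenBucket:
    """Allow `burst` requests at once, refilled at `rate` tokens per second."""

    def __init__(self, rate: float = 10.0, burst: int = 20, clock=time.monotonic):
        self.rate = rate
        self.capacity = float(burst)
        self.tokens = float(burst)  # start full so an initial burst is allowed
        self.clock = clock
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill for the elapsed time, never exceeding the burst capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


fake_now = [0.0]
bucket = TokenBucket(clock=lambda: fake_now[0])
print(sum(bucket.allow() for _ in range(25)))  # 20: the burst is spent, 5 calls rejected
fake_now[0] = 0.5                              # half a second later, 5 tokens are back
print(sum(bucket.allow() for _ in range(10)))  # 5
```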
Architecture
                  ┌─────────────────────────┐
                  │       MCP Client        │
                  │ (Claude, Cline, Cursor) │
                  └──────────┬──────────────┘
                             │
                  ┌──────────▼──────────────┐
                  │   FastMCP Server Core   │
                  │  (HTTP / SSE / stdio)   │
                  │   + Middleware Stack    │
                  └──────────┬──────────────┘
                             │
     ┌────────────┬──────────┼──────────┬────────────┐
     │            │          │          │            │
┌────▼────┐  ┌────▼────┐ ┌───▼───┐ ┌────▼────┐  ┌────▼────┐
│  Tools  │  │Resources│ │Prompts│ │  Utils  │  │ Models  │
│  (23)   │  │  (11)   │ │  (5)  │ │         │  │         │
└────┬────┘  └────┬────┘ └───────┘ └─────────┘  └─────────┘
     │            │
     └──────┬─────┘
            │
 ┌──────────▼──────────┐
 │    Service Layer    │
 │                     │
 │    tempo_service    │
 │ kubernetes_service  │
 └──────────┬──────────┘
            │
 ┌──────────▼──────────┐
 │   Tempo HTTP API    │
 │   + K8s Discovery   │
 └─────────────────────┘
How it works:
- An AI assistant connects via HTTP, SSE, or stdio.
- The AI loads the tempo://system/backends resource to discover available Tempo backends and their health.
- Tools interact with Tempo's HTTP API to search traces, compute metrics, and run diagnostics.
- The service layer (tempo_service) handles HTTP calls with connection pooling, tenant injection, and LLM format negotiation.
- Optional Kubernetes discovery (kubernetes_service) finds Tempo services via labels and Tempo Operator CRDs.
- Middleware enforces rate limiting, response size caps, caching, and structured logging.
Table of Contents
- Why Tempo MCP Server?
- Key Features
- Architecture
- Tech Stack
- Getting Started
- Configuration
- Available Tools
- Available Resources
- Available Prompts
- Usage
- Project Structure
- Roadmap
- Contributing
- FAQ
- Troubleshooting
- Security Considerations
- License
- Contact
- Acknowledgments
Tech Stack
| Category | Technologies |
|---|---|
| Language | Python 3.12+ |
| MCP Framework | FastMCP ≥2.13.3 |
| Protocol | Model Context Protocol (MCP) |
| Tracing Backend | Grafana Tempo HTTP API |
| HTTP Client | httpx — async, connection pooling |
| Kubernetes | Python K8s Client · Tempo Operator CRDs |
| Transport | HTTP · SSE · Streamable-HTTP · stdio |
| Infrastructure | Docker · uv |
Getting Started
Prerequisites
- Docker (recommended) or Python 3.12+ (for local dev)
- Grafana Tempo backend accessible via HTTP (monolithic or microservices mode)
- Optionally: Kubernetes with the Tempo Operator for auto-discovery
Quick Start with Docker (recommended)
docker run --rm -it \
-p 8768:8768 \
-e MCP_TRANSPORT=http \
-e TEMPO_BASE_URL=http://host.docker.internal:3200 \
talkopsai/tempo-mcp-server:latest
The server is now listening on http://localhost:8768/mcp.
Point your MCP client at it:
{
"mcpServers": {
"tempo": {
"url": "http://localhost:8768/mcp",
"description": "MCP Server for Grafana Tempo distributed tracing"
}
}
}
From Source (Python)
- Install uv for dependency management.
- Clone and set up:
git clone https://github.com/talkops-ai/talkops-mcp.git
cd talkops-mcp/src/tempo-mcp-server
uv venv
source .venv/bin/activate
uv pip install -e ".[dev]"
- Configure your .env:
TEMPO_BASE_URL=http://localhost:3200
MCP_TRANSPORT=http
MCP_LOG_LEVEL=INFO
- Run the server:
uv run tempo-mcp-server
Or, with the venv activated: tempo-mcp-server.
- Run tests:
source .venv/bin/activate
pytest tests/
Configuration
All configuration is via environment variables (loaded from .env via python-dotenv).
Server Configuration
| Variable | Default | Description |
|---|---|---|
| MCP_SERVER_NAME | tempo-mcp-server | Server name identifier |
| MCP_SERVER_VERSION | 0.1.0 | Server version string |
| MCP_TRANSPORT | stdio | Transport mode: http, sse, streamable-http, or stdio |
| MCP_HOST | 0.0.0.0 | Host address for HTTP server |
| MCP_PORT | 8768 | Port for HTTP server |
| MCP_PATH | /mcp | MCP endpoint path |
| MCP_LOG_LEVEL | INFO | Log level: DEBUG, INFO, WARNING, ERROR |
| MCP_LOG_FORMAT | json | Log format: json or text |
| MCP_HTTP_TIMEOUT | 300 | HTTP server timeout (seconds) |
| MCP_HTTP_KEEPALIVE_TIMEOUT | 5 | HTTP keepalive timeout (seconds) |
| MCP_HTTP_CONNECT_TIMEOUT | 60 | HTTP connect timeout (seconds) |
Tempo Backend (Single Backend Mode)
| Variable | Default | Description |
|---|---|---|
| TEMPO_BASE_URL | http://localhost:3200 | Tempo HTTP API base URL |
| TEMPO_BACKEND_ID | default | Backend identifier |
| TEMPO_DISPLAY_NAME | (empty) | Human-readable backend name |
| TEMPO_TYPE | tempo | Backend type: tempo, tempo-gateway, unknown |
| TEMPO_DEPLOYMENT_MODE | unknown | Deployment mode: monolithic, microservices, unknown |
| TEMPO_AUTH_HEADER | (empty) | Authorization header value (e.g., Bearer <token>) |
| TEMPO_VERIFY_SSL | true | Verify SSL certificates |
| TEMPO_TIMEOUT | 30 | HTTP timeout per request (seconds) |
Tempo Backend (Multi-Backend Mode)
| Variable | Default | Description |
|---|---|---|
| TEMPO_BACKENDS | (empty) | JSON array of backend configs (overrides single backend). See .env.example. |
Multi-Tenancy
| Variable | Default | Description |
|---|---|---|
| TEMPO_MULTI_TENANT | false | Enable multi-tenant mode for the backend |
| TEMPO_DEFAULT_TENANT | (empty) | Default tenant ID (required if TEMPO_MULTI_TENANT=true) |
| TEMPO_TENANT_HEADER | X-Scope-OrgID | HTTP header name for tenant ID injection |
Query Policies / Guardrails
| Variable | Default | Description |
|---|---|---|
| TEMPO_MAX_LOOKBACK | 168h | Maximum query lookback (7 days) |
| TEMPO_DEFAULT_SEARCH_LIMIT | 20 | Default max traces per search |
| TEMPO_MAX_SEARCH_LIMIT | 100 | Absolute max traces per search |
| TEMPO_DEFAULT_SPSS | 3 | Default spans per span-set |
| TEMPO_MAX_SPSS | 10 | Maximum spans per span-set |
| TEMPO_REQUIRE_TIME_RANGE | true | Require time range on searches |
| TEMPO_REQUIRE_FILTER_OR_QUERY | true | Require at least one filter or TraceQL query |
| TEMPO_DEFAULT_METRICS_SAMPLING | (empty) | Default metrics sampling rate (e.g., fixed-span:0.1) |
| TEMPO_MAX_METRICS_DURATION | 3h | Maximum allowed metrics query time range. Should match Tempo's query_frontend.metrics.max_duration. |
LLM Format
| Variable | Default | Description |
|---|---|---|
| TEMPO_LLM_FORMAT | true | Enable LLM-optimized trace format (Tempo 2.9+ application/vnd.grafana.llm) |
Kubernetes Discovery
| Variable | Default | Description |
|---|---|---|
| K8S_ENABLED | false | Enable Kubernetes-based Tempo backend discovery |
| K8S_CONTEXT | (empty) | Specific kubeconfig context to use |
| K8S_IN_CLUSTER | false | Set true when running inside a Kubernetes pod |
Tempo Operator CRD
| Variable | Default | Description |
|---|---|---|
| TEMPO_CRD_GROUP | tempo.grafana.com | Tempo Operator CRD API group |
| TEMPO_CRD_API_VERSION | v1alpha1 | CRD API version (change when Operator graduates to v1) |
Available Tools
Discovery
| Tool | Description |
|---|---|
| tempo_list_backends | List all configured Tempo backends with health status (ready/not_ready). Use this first to discover available backends. |
| tempo_get_backend | Get detailed profile for a specific backend: health, version, build info, capabilities, deployment mode, tenant requirements, and component service statuses. |
| tempo_get_query_policies | Get query guardrails and default search parameters: max lookback, search limits, SPSS limits, and time range requirements. |
Schema Discovery
| Tool | Description |
|---|---|
| tempo_get_attribute_names | Discover available trace attribute names from a Tempo backend, grouped by scope (resource, span, intrinsic, event, link, instrumentation). Supports time-window scoping. |
| tempo_get_attribute_values | Get distinct values for a specific trace attribute. Useful for understanding data distribution and building dynamic filters. Supports TraceQL scoping. |
| tempo_get_k8s_attribute_map | Get the canonical mapping between Kubernetes concepts (namespace, pod, deployment) and their OTel/Tempo attribute names. Optionally validates against a live backend's tag list. |
Search & Retrieval
| Tool | Description |
|---|---|
| tempo_traceql_search | HIGH-INTENT: Search for traces using raw TraceQL or K8s-friendly filters (namespace, service, deployment, cluster, status, duration). Auto-translates filters to TraceQL, enforces query guardrails, and returns compact summaries. |
| tempo_get_trace | Retrieve a single trace by ID with LLM-optimized format. Attempts application/vnd.grafana.llm first, falls back to standard OTLP JSON. |
| tempo_query_a2ui | Retrieve a trace heavily optimized and structured for A2UI rendering. DAG-aware pruning enforces payload limits while preserving critical paths and parent-child linkages. |
| tempo_summarize_trace | HIGH-INTENT: Generate an intelligent summary of a trace: critical path extraction, error detection, suspected root cause, K8s context, time gap detection (wall-clock vs. critical path disambiguation), and recommended next queries. Primary analysis primitive. |
| tempo_find_related_traces | HIGH-INTENT: Find traces related to a seed trace using correlation strategies: same_service_errors, same_endpoint, or temporal_neighbors. One call replaces manual multi-step correlation. |
Metrics
| Tool | Description |
|---|---|
| tempo_traceql_metrics_range | Execute a TraceQL metrics range query. Returns Prometheus-compatible time series (matrix). Use for RED metrics, trend analysis, and SLO calculations. Supports rate(), count_over_time(), quantile_over_time(), etc. |
| tempo_traceql_metrics_instant | Execute a TraceQL metrics instant query. Returns point-in-time metrics (vector). |
Cross-Pillar Pivots
| Tool | Description |
|---|---|
| tempo_get_exemplar_traces | Pivot from aggregated metrics to concrete traces. Extracts exemplar trace IDs from a TraceQL metrics query result. |
| tempo_get_trace_from_log | Extract a trace ID from a log line (supports multiple formats) and retrieve + summarize the associated trace. One call replaces parse → fetch → analyze. |
Diagnostics
| Tool | Description |
|---|---|
| tempo_get_diagnostics | HIGH-INTENT: Comprehensive backend diagnostics. Aggregates health check, build info, component service status, and ring member health into a curated report with severity-ranked findings and suggested actions. |
Topology
| Tool | Description |
|---|---|
| tempo_get_service_dependencies | Map service dependencies from Tempo's metrics-generator service graph data. Returns nodes and edges with request rates. Supports service-focused filtering. |
Operator CRD Management
| Tool | Description |
|---|---|
| tempo_list_operator_crs | List Tempo Operator custom resources (TempoStack, TempoMonolithic) across namespaces with status. Read-only. |
| tempo_get_operator_cr | Get a Tempo Operator CR with full spec, status, conditions, and storage configuration. Read-only. |
| tempo_create_operator_cr | Create a TempoStack or TempoMonolithic CR. Generates complete CRD manifest with storage, retention, resources. dry_run=True by default. |
| tempo_patch_operator_cr | Patch specific fields of an existing Tempo Operator CR (retention, resources, search). dry_run=True by default. |
Trace Comparison
| Tool | Description |
|---|---|
| tempo_compare_traces | HIGH-INTENT: Compare two traces and report structural + timing + error + attribute differences. 5-dimensional diff: services, span counts, durations, errors, attributes. |
Alerting Expression Generator
| Tool | Description |
|---|---|
| tempo_generate_alerting_expression | Generate PromQL alerting expressions from trace patterns using spanmetrics. Returns ready-to-paste PrometheusRule YAML. Cross-MCP workflow: pass output to prom_upsert_rule_group. |
Available Resources
Dynamic Resources
| Resource URI | Description |
|---|---|
| tempo://system/backends | All configured Tempo backends with health status |
| tempo://system/backends/{backend_id} | Detailed profile for a specific Tempo backend |
| tempo://deployment/overview | Deployment topology: backends, modes, tenants, K8s integration status |
Reference Resources (Static)
| Resource URI | Description |
|---|---|
| tempo://reference/traceql | TraceQL syntax reference: selectors, operators, intrinsics, scoped attributes, structural queries, examples |
| tempo://reference/traceql-metrics | TraceQL metrics functions: rate, count_over_time, quantile, histogram, grouping, aggregations, sampling |
| tempo://reference/k8s-attributes | Canonical K8s-to-Tempo attribute mapping for Kubernetes observability |
| tempo://reference/query-policies | Query guardrails, limits, continuation strategy, and safety guidelines (dynamically populated from config) |
Runbook Resources
| Resource URI | Description |
|---|---|
| tempo://runbooks/latency-spike | Step-by-step runbook for investigating latency spikes: detect → locate → analyze → correlate → root cause |
| tempo://runbooks/error-burst | Step-by-step runbook for investigating error bursts: quantify → search → triage → correlate |
| tempo://runbooks/no-traces-found | Diagnostic runbook for "no traces found" scenarios: backend health → data existence → scope checks → ingestion |
| tempo://runbooks/cross-tenant-access | Runbook for cross-tenant query configuration, usage, and constraints |
Example Resources
| Resource URI | Description |
|---|---|
| tempo://examples/common-queries | Common TraceQL and metrics query examples for quick reference: service exploration, error investigation, performance analysis, structural queries, metrics queries |
Available Prompts
Guided workflow prompts that orchestrate multiple tools into step-by-step journeys:
| Prompt Name | Description | Parameters |
|---|---|---|
| tempo-error-triage | Guided 4-phase error triage: quantify impact (error rate vs. baseline), find error traces, analyze root cause via summarization + correlation, contextualize with diagnostics | backend_id, service, namespace |
| tempo-latency-investigation | Guided 4-phase latency investigation: confirm spike (P99 trend), find slow traces above threshold, critical path analysis via summarization, compare with normal traces | backend_id, service, threshold_ms |
| tempo-missing-traces | Guided 4-phase diagnostic for "no traces found": verify backend health, verify data exists (attribute names, broadest search), check scope (tenant, namespace, service), consult runbook | backend_id, service |
| tempo-traceql-builder | Interactive TraceQL query construction: parse user intent, discover available attributes, construct query using reference, execute, and refine | backend_id, intent |
| tempo-metrics-first-triage | RED metrics-first triage for a service: rate, error rate, P99 duration, investigate anomalies, deep dive into individual traces | backend_id, service |
Usage
Supported workflows with prompt examples and links to detailed guides:
| Workflow | Prompt Example | Documentation |
|---|---|---|
| Error Triage | "Triage errors for the 'checkout-service' in the 'production' namespace using backend 'prod'." | TEMPO_ERROR_TRIAGE_TEST_GUIDE.md |
| Latency Investigation | "Investigate latency spikes above 500ms for 'api-gateway' using backend 'prod'." | TEMPO_LATENCY_INVESTIGATION_TEST_GUIDE.md |
| Missing Traces | "No traces found for 'payment-service': diagnose the issue on backend 'prod'." | TEMPO_MISSING_TRACES_TEST_GUIDE.md |
| TraceQL Builder | "Build a TraceQL query to find slow database calls over 100ms in the frontend." | TEMPO_TRACEQL_BUILDER_TEST_GUIDE.md |
| Metrics-First Triage | "Run a RED analysis for 'order-service' over the last 6 hours." | TEMPO_METRICS_FIRST_TRIAGE_TEST_GUIDE.md |
Project Structure
tempo-mcp-server/
├── tempo_mcp_server/ # Main package
│ ├── tools/ # MCP Tools (10 tool groups, 23 tools)
│ │ ├── discovery/ # Backend listing, inspection, query policies
│ │ │ └── discovery_tools.py # 3 tools: list_backends, get_backend, get_query_policies
│ │ ├── schema/ # Attribute/tag discovery
│ │ │ └── schema_tools.py # 3 tools: get_attribute_names, get_attribute_values, get_k8s_attribute_map
│ │ ├── search/ # Trace search & retrieval
│ │ │ └── search_tools.py # 5 tools: traceql_search, get_trace, query_a2ui, summarize_trace, find_related_traces
│ │ ├── metrics/ # TraceQL metrics queries
│ │ │ └── metrics_tools.py # 2 tools: metrics_range, metrics_instant
│ │ ├── pivot/ # Cross-pillar correlation
│ │ │ └── pivot_tools.py # 2 tools: get_exemplar_traces, get_trace_from_log
│ │ ├── diagnostics/ # Backend health & diagnostics
│ │ │ └── diagnostics_tools.py # 1 tool: get_diagnostics
│ │ ├── topology/ # Service dependency mapping
│ │ │ └── topology_tools.py # 1 tool: get_service_dependencies
│ │ ├── operator/ # Tempo Operator CRD lifecycle
│ │ │ └── operator_tools.py # 4 tools: list_operator_crs, get_operator_cr, create_operator_cr, patch_operator_cr
│ │ ├── comparison/ # Trace comparison
│ │ │ └── comparison_tools.py # 1 tool: compare_traces
│ │ └── alerting/ # Alerting expression generation
│ │ └── alerting_tools.py # 1 tool: generate_alerting_expression
│ ├── resources/ # MCP Resources (11 URIs)
│ │ ├── backend_resources.py # Dynamic: backends listing, backend detail
│ │ ├── deployment_resources.py # Dynamic: deployment overview
│ │ ├── reference_resources.py # Static: TraceQL, metrics, K8s attributes, query policies
│ │ ├── runbook_resources.py # Static: latency spike, error burst, no traces, cross-tenant
│ │ └── examples_resources.py # Static: common TraceQL query examples
│ ├── prompts/ # MCP Prompts (5 guided workflows)
│ │ ├── query_prompts.py # TraceQL builder, metrics-first triage
│ │ └── troubleshooting_prompts.py # Error triage, latency investigation, missing traces
│ ├── services/ # Business logic
│ │ ├── tempo_service.py # Async HTTP client: all Tempo API calls, tenant injection,
│ │ │ # LLM format negotiation, connection pooling
│ │ └── kubernetes_service.py # K8s discovery & CRD management: service labels, Tempo Operator CRDs,
│ │ # create/patch TempoStack/TempoMonolithic
│ ├── server/ # FastMCP server setup
│ │ ├── core.py # Server creation & instructions loading
│ │ ├── bootstrap.py # Component initialization & DI
│ │ └── middleware.py # 7-layer middleware stack
│ ├── models/ # Pydantic data models
│ │ ├── search.py # SearchFilters, trace response models
│ │ ├── schema.py # Attribute scope definitions
│ │ ├── backend.py # Backend config models
│ │ ├── trace.py # Trace summary models
│ │ ├── metrics.py # Metrics response models
│ │ ├── pivot.py # Pivot response models
│ │ ├── topology.py # Topology models
│ │ ├── diagnostics.py # Diagnostics models
│ │ ├── operator.py # Tempo Operator CRD models
│ │ └── comparison.py # Trace comparison models
│ ├── utils/ # Helpers
│ │ ├── traceql_helpers.py # TraceQL construction, validation, K8s attribute mapping
│ │ ├── trace_summarizer.py # Critical path extraction, error detection, headline generation
│ │ ├── trace_differ.py # 5-dimensional trace diff engine
│ │ ├── trace_id_extractor.py # Regex-based trace ID parsing from log lines
│ │ └── time_helpers.py # Relative time parsing (1h, 24h, 7d → Unix epoch)
│ ├── static/ # Static data files
│ │ └── TEMPO_MCP_INSTRUCTIONS.md # MCP system instructions for AI agents
│ ├── exceptions/ # Custom exception hierarchy
│ │ └── custom.py # TempoOperationError, TempoQueryError, TempoTenantError, etc.
│ ├── config.py # Environment parsing & config dataclasses
│ └── main.py # Entry point & CLI
├── tests/ # Test suites
│ ├── unit/ # Unit tests (deterministic, mocked)
│ ├── integration/ # In-memory MCP integration tests
│ ├── fixtures/ # Test fixtures (JSON responses)
│ └── conftest.py # Shared test configuration
├── docs/ # Documentation & test guides
├── pyproject.toml # Package definition (Python 3.12)
├── Dockerfile # Docker build
└── README.md # This documentation
Roadmap
Shipped in this release:
- TraceQL search with K8s-friendly filters and query guardrails
- Intelligent trace summarization (critical path, error detection, root cause)
- Related trace discovery via correlation strategies
- Attribute name/value discovery with scope filtering and time-window scoping
- K8s-to-Tempo canonical attribute mapping with live validation
- TraceQL metrics: range and instant queries with Prometheus-compatible output
- Metrics-to-traces exemplar pivot
- Logs-to-traces pivot (multi-format trace ID extraction)
- Comprehensive backend diagnostics (readiness, build info, services, rings)
- Service topology mapping from metrics-generator data
- Multi-tenancy with tenant validation and cross-tenant support
- 5 guided workflow prompts (error triage, latency, missing traces, TraceQL builder, RED triage)
- 11 MCP resources (dynamic backends, static references, runbooks, examples)
- 7-layer middleware stack (error handling, response limiting, rate limiting, caching, logging, timing)
- Tempo Operator CRD management (list/get/create/patch TempoStack & TempoMonolithic)
- Trace comparison (diff two traces by ID — 5-dimensional structural analysis)
- Alerting expression generator (PromQL from trace patterns → cross-MCP workflow with Prometheus server)
Coming next:
- Multi-cluster support
- Trace diff visualization (HTML/Mermaid output for trace comparison)
- Batch trace analysis (compare N traces, detect outliers)
- Custom TraceQL metrics function library
See open issues for the full list of proposed features.
Contributing
Contributions are welcome. The process is straightforward:
- Fork the repo
- Create a branch (git checkout -b feature/TraceComparison)
- Make your changes and commit
- Push and open a PR
If you're considering something bigger, open an issue first so we can align on the approach.
FAQ
Which MCP clients work with this?
Any MCP-compatible client, including Claude Desktop, Cline, Cursor, and custom clients. Connect via http://localhost:8768/mcp for HTTP transport, or configure stdio for direct process communication.
Does this require Grafana Tempo?
Yes. The server communicates with Tempo's HTTP API (/api/search, /api/v2/traces/{traceID}, /api/v2/search/tags, /api/metrics/query_range, etc.). Any Grafana Tempo deployment (monolithic, microservices, or via the Tempo Operator) will work. The LLM-optimized trace format requires Tempo 2.9+.
Does this modify my cluster or Tempo backend?
Not by default. The trace, metrics, schema, and diagnostics tools are read-only: they only perform HTTP GET requests against Tempo's query APIs, so no traces, metrics, or Tempo configurations are created, modified, or deleted. The only tools that can change cluster resources are tempo_create_operator_cr and tempo_patch_operator_cr, and both run with dry_run=True by default.
Can I use multiple Tempo backends?
Yes. Set the TEMPO_BACKENDS environment variable to a JSON array of backend configurations. Each backend gets its own ID, base URL, tenant settings, and auth header. All tools accept a backend_id parameter to target a specific backend. See .env.example for the format.
How does multi-tenancy work?
For multi-tenant Tempo deployments, set TEMPO_MULTI_TENANT=true and TEMPO_DEFAULT_TENANT. The server injects the X-Scope-OrgID header (configurable via TEMPO_TENANT_HEADER) on every request. Tools accept an optional tenant parameter to override the default. For cross-tenant queries, use pipe-separated values (e.g., tenant="team-a|team-b"). Tenant IDs are validated: max 150 bytes, alphanumeric + !-_.*'().
What is the LLM trace format?
Tempo 2.9+ supports an experimental application/vnd.grafana.llm Accept header that returns traces in a compact, LLM-friendly format, optimized for token efficiency when used with AI assistants. The server attempts this format first and automatically falls back to standard OTLP JSON if the backend doesn't support it. Disable with TEMPO_LLM_FORMAT=false.
Can I use this without Kubernetes?
Yes. Set K8S_ENABLED=false (the default). All tools work against Tempo's HTTP API directly; Kubernetes is only needed for auto-discovery of Tempo backends via service labels or Tempo Operator CRDs. Configure your backend URL(s) via TEMPO_BASE_URL or TEMPO_BACKENDS.
What are query guardrails?
The server enforces configurable safety limits to prevent unbounded queries: time range is required by default (TEMPO_REQUIRE_TIME_RANGE=true), search results are capped (TEMPO_MAX_SEARCH_LIMIT=100), SPSS is bounded (TEMPO_MAX_SPSS=10), and at least one filter or query is required (TEMPO_REQUIRE_FILTER_OR_QUERY=true). These protect both the AI agent's context window and the Tempo backend.
Troubleshooting
Tempo Connection Issues
- Verify TEMPO_BASE_URL points to an accessible Tempo HTTP endpoint (default port: 3200).
- Load the tempo://system/backends resource to check backend health.
- Run tempo_get_diagnostics(backend_id="default") for detailed health analysis.
- For Tempo behind a load balancer or gateway, verify the base URL routes to the query-frontend.
- For authenticated backends, set TEMPO_AUTH_HEADER (e.g., Bearer <token>).
No Traces Found
- Run tempo_get_attribute_names(backend_id="default", since="1h") to verify data exists.
- Broaden the time range: try since="24h" or since="7d".
- Start with the broadest possible query: tempo_traceql_search(backend_id="default", since="24h", limit=5).
- For multi-tenant backends, verify the correct tenant parameter is being passed.
- Load the tempo://runbooks/no-traces-found resource for a full diagnostic walkthrough.
- Check that data is flowing through your ingestion pipeline (OTel Collector → Tempo).
TraceQL Metrics Not Working
- TraceQL metrics require Tempo's metrics-generator with the local-blocks processor enabled.
- Run tempo_get_diagnostics(backend_id="default") to check backend capabilities.
- Verify the metrics-generator is configured in your Tempo deployment.
Kubernetes Discovery Not Finding Backends
- Ensure K8S_ENABLED=true in your .env.
- Verify your kubeconfig is accessible and the correct context is set.
- Tempo services must have the label app.kubernetes.io/name=tempo for label-based discovery.
- For Tempo Operator discovery, ensure TempoStack or TempoMonolithic CRDs exist in the cluster.
- For in-cluster deployment, set K8S_IN_CLUSTER=true.
Diagnostics Reporting False-Positive Ring Errors (404)
- If tempo_get_diagnostics reports 404 Not Found for ring endpoints (e.g., /distributor/ring, /ingester/ring), your TEMPO_BASE_URL likely points to a Tempo Gateway or Query-Frontend in a distributed/microservices deployment.
- Gateways generally do not proxy internal diagnostic ring endpoints, which only exist on the specific backend pods.
- Fix: Ensure TEMPO_DEPLOYMENT_MODE=unknown (the default) is set in your .env. This explicitly instructs the MCP server to gracefully skip ring checks and rely only on /status/services, preventing false-positive degraded health states while still validating core component availability.
Security Considerations
- Never expose the MCP server to the public internet without proper authentication.
- All tools are read-only: the server only performs HTTP GET requests against Tempo's query APIs. No data is created, modified, or deleted.
- Tenant isolation: in multi-tenant deployments, the server injects tenant headers on every request. Verify that tenant IDs are correctly scoped to prevent cross-tenant data leakage.
- Auth headers: if TEMPO_AUTH_HEADER is set, it is included in every request to the backend. Protect this value as a secret.
- Query guardrails: the server enforces time range, limit, and filter requirements to prevent unbounded queries. Review and adjust the policy settings for your environment.
- Kubernetes credentials: when K8S_ENABLED=true, the server reads Kubernetes service/CRD metadata (read-only). Ensure the service account has minimal RBAC (only get, list on Services and Tempo CRDs).
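To make the query guardrail idea concrete, here is a minimal sketch of a bounds check on time range and result limit. The QueryPolicy class, its field names, and its default values are hypothetical; the server's real policy settings and their names may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryPolicy:
    """Illustrative limits; not the server's actual defaults."""
    max_range_s: int = 6 * 3600  # longest allowed search window
    max_limit: int = 100         # most traces returned per search


def check_query(start_s: int, end_s: int, limit: int,
                policy: QueryPolicy = QueryPolicy()) -> list:
    """Return a list of guardrail violations; empty means the query is allowed."""
    violations = []
    if end_s <= start_s:
        violations.append("end must be after start")
    elif end_s - start_s > policy.max_range_s:
        violations.append(f"time range exceeds {policy.max_range_s}s")
    if not 0 < limit <= policy.max_limit:
        violations.append(f"limit must be between 1 and {policy.max_limit}")
    return violations
```

Returning every violation at once, rather than failing on the first, lets a caller correct the range and the limit in a single retry.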
License
Apache 2.0 — see LICENSE.
Contact
TalkOps AI — github.com/talkops-ai
Project: github.com/talkops-ai/talkops-mcp
Discord: Join the community
Acknowledgments
- Model Context Protocol for enabling AI-native tool interfaces.
- FastMCP for the Python MCP server framework.
- Grafana Tempo for the scalable distributed tracing backend.
- Tempo Operator for Kubernetes-native Tempo lifecycle management.
- OpenTelemetry for the industry-standard observability framework.