talkops-tempo-mcp-server

A Model Context Protocol (MCP) server for Grafana Tempo distributed tracing and observability.

Maintainers

talkops

Tempo MCP Server

An MCP server that gives AI assistants the power to search, analyze, summarize, and correlate distributed traces from Grafana Tempo — with TraceQL query construction, RED metrics analysis, cross-pillar pivots, service topology mapping, and operational diagnostics.

Why Tempo MCP Server?

The problem: Grafana Tempo is a powerful distributed tracing backend, but effective trace analysis is complex. Constructing TraceQL queries requires knowledge of attribute scopes, intrinsic fields, and structural operators. Correlating traces with metrics (RED analysis) and logs requires multi-step pivots across different APIs. Diagnosing latency spikes means navigating critical paths, identifying root causes, and finding related incidents — each a specialized skill. When AI assistants try to help, they hallucinate TraceQL syntax, miss multi-tenant requirements, or generate unbounded queries that overwhelm backends.

The solution: The Tempo MCP Server gives AI assistants (like Claude, Cline, or Cursor) structured, safe tools to interact with Grafana Tempo natively:

Smart Trace Search: Say "Find error traces from the API service in production" and the AI auto-translates K8s-friendly filters (namespace, service, deployment) into valid TraceQL, enforces query guardrails (time ranges, limits), and returns compact summaries.
Intelligent Trace Analysis: The AI fetches a trace, extracts the critical path, identifies error spans, detects the suspected root cause, and recommends follow-up queries — all in a single tempo_summarize_trace call.
Metrics-First Triage: Execute RED metrics queries (rate, errors, duration) using TraceQL metrics functions like rate(), quantile_over_time(), and count_over_time() — then pivot from aggregated metrics to concrete traces via exemplars.
Cross-Pillar Correlation: Extract trace IDs from log lines and retrieve the full trace. Pivot from metrics spikes to exemplar traces. Correlate related traces using strategies like same-service errors, same-endpoint, or temporal neighbors.
Backend Diagnostics: Aggregate health checks, build info, component service status, and ring member health into a single curated diagnostics report with severity-ranked findings and remediation steps.
Service Topology: Map service dependencies from Tempo's metrics-generator service graph data, with request rates and error rates per edge.

Key Features

TraceQL Search with K8s-Friendly Filters

Raw TraceQL queries or structured K8s filters (namespace, service, deployment, cluster)
Auto-translation of K8s concepts to OTel attributes via canonical mapping
Query guardrails: time range enforcement, limit clamping, empty-query rejection
Basic TraceQL validation before sending to backend
Non-determinism awareness in result metadata
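
The filter translation described above can be sketched roughly as follows. This is a minimal illustration, not the server's implementation: the `K8S_TO_OTEL` table and the `build_traceql` helper are hypothetical names, and the attribute names are taken from OpenTelemetry semantic conventions, so the server's canonical mapping may differ.

```python
# Hypothetical mapping (assumption): attribute names follow OpenTelemetry
# semantic conventions; the server's canonical map may differ.
K8S_TO_OTEL = {
    "namespace": "resource.k8s.namespace.name",
    "service": "resource.service.name",
    "deployment": "resource.k8s.deployment.name",
    "cluster": "resource.k8s.cluster.name",
}


def build_traceql(filters: dict[str, str], errors_only: bool = False) -> str:
    """Translate K8s-style filters into a single TraceQL spanset selector."""
    conditions = []
    for key, value in filters.items():
        if key not in K8S_TO_OTEL:
            raise ValueError(f"unsupported filter: {key}")
        # Escape backslashes and quotes so the value stays one string literal.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        conditions.append(f'{K8S_TO_OTEL[key]} = "{escaped}"')
    if errors_only:
        conditions.append("status = error")
    if not conditions:
        # Mirrors the empty-query rejection guardrail listed above.
        raise ValueError("at least one filter is required")
    return "{ " + " && ".join(conditions) + " }"


query = build_traceql({"service": "api", "namespace": "production"}, errors_only=True)
print(query)
```

Rejecting unknown filter keys, rather than silently dropping them, keeps a mistyped filter from widening the query.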

Trace Retrieval & Analysis

Single-trace fetch with LLM-optimized format support (Tempo 2.9+ application/vnd.grafana.llm)
Automatic fallback to standard OTLP JSON when LLM format is unavailable
Server-side trace summarization: critical path extraction, error detection, root cause analysis, and recommended next queries
Time gap detection: disambiguates wall-clock duration from critical path duration when async/disjointed spans inflate the trace window
Related trace correlation with three strategies: same-service errors, same-endpoint, temporal neighbors
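
To illustrate why wall-clock duration and active duration can diverge, here is a small sketch that merges span intervals and reports the idle gaps between them. It is only an approximation of the idea: `busy_time_and_gaps` is a hypothetical helper, the "busy time" it computes is the union of span intervals rather than a true critical path, and the server's own algorithm may work differently.

```python
def busy_time_and_gaps(
    spans: list[tuple[int, int]],
) -> tuple[int, int, list[tuple[int, int]]]:
    """Return (wall_clock, busy_time, gaps) for (start, end) span intervals."""
    if not spans:
        return 0, 0, []
    ordered = sorted(spans)
    wall_clock = max(end for _, end in ordered) - ordered[0][0]
    busy = 0
    gaps: list[tuple[int, int]] = []
    cur_start, cur_end = ordered[0]
    for start, end in ordered[1:]:
        if start > cur_end:
            # No span was active between cur_end and start: record the gap.
            gaps.append((cur_end, start))
            busy += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent span: extend the current busy window.
            cur_end = max(cur_end, end)
    busy += cur_end - cur_start
    return wall_clock, busy, gaps


# Two overlapping spans, then an async span that starts much later (ms).
wall, busy, gaps = busy_time_and_gaps([(0, 40), (10, 30), (500, 520)])
print(wall, busy, gaps)
```

In this example the trace window is 520 ms wide, but spans were only active for 60 ms, which is the kind of discrepancy that time gap detection is meant to surface.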

Schema Discovery

Attribute name discovery across scopes (resource, span, intrinsic, event, link, instrumentation)
Attribute value enumeration with time-window scoping and TraceQL filtering
Canonical K8s-to-Tempo attribute mapping with optional live validation against a backend

TraceQL Metrics

Range queries returning Prometheus-compatible time series (matrix format)
Instant queries returning point-in-time metrics (vector format)
Support for rate(), count_over_time(), avg_over_time(), max_over_time(), min_over_time(), sum_over_time(), quantile_over_time(), histogram_over_time()
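
As a rough sketch of what a RED query set looks like in TraceQL metrics syntax, the helper below builds rate, error-rate, and P99-duration queries for one service. The function name `red_queries` and the choice of `resource.service.name` as the selector are assumptions for illustration; the service name is interpolated as-is, so it should not contain quotes.

```python
def red_queries(service: str) -> dict[str, str]:
    """Build TraceQL metrics queries for rate, errors, and duration."""
    selector = f'resource.service.name = "{service}"'
    return {
        # Requests per second for the service.
        "rate": f"{{ {selector} }} | rate()",
        # Failed requests per second (spans with error status).
        "errors": f"{{ {selector} && status = error }} | rate()",
        # 99th percentile of span duration.
        "duration_p99": f"{{ {selector} }} | quantile_over_time(duration, .99)",
    }


queries = red_queries("checkout-service")
for name, q in queries.items():
    print(f"{name}: {q}")
```

Each string could then be passed as the query to a range or instant metrics call.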

Cross-Pillar Pivots

Metrics-to-traces: extract exemplar trace IDs from TraceQL metrics queries
Logs-to-traces: parse trace IDs from log lines (supports trace_id=, traceId:, TraceID=, standalone 32-char hex) and retrieve full traces

Backend Discovery & Diagnostics

Multi-backend support with per-backend health probing
Kubernetes service discovery (label-based + Tempo Operator CRDs: TempoStack, TempoMonolithic)
Comprehensive diagnostics: readiness, build info, component services, ring status
Severity-ranked findings with actionable remediation steps

Service Topology

Service dependency mapping from traces_service_graph_request_total metrics
Request rate and error rate per service edge
Service-focused filtering for targeted topology views
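
A minimal sketch of how per-edge request and error rates might be derived from service graph series. It assumes the per-second rates have already been computed and that each sample carries `client` and `server` labels, as Tempo's service graph metrics do; the input shape and the `edge_error_rates` helper are hypothetical, and the failed-request series is assumed to come from `traces_service_graph_request_failed_total`.

```python
from collections import defaultdict


def edge_error_rates(
    total_samples: list[dict], failed_samples: list[dict]
) -> dict[tuple[str, str], dict[str, float]]:
    """Aggregate per-edge request rate and error ratio from labelled samples."""
    totals: dict[tuple[str, str], float] = defaultdict(float)
    failed: dict[tuple[str, str], float] = defaultdict(float)
    for sample in total_samples:
        totals[(sample["client"], sample["server"])] += sample["rate"]
    for sample in failed_samples:
        failed[(sample["client"], sample["server"])] += sample["rate"]
    return {
        edge: {
            "request_rate": rate,
            # Guard against division by zero on idle edges.
            "error_rate": failed[edge] / rate if rate else 0.0,
        }
        for edge, rate in totals.items()
    }


edges = edge_error_rates(
    total_samples=[
        {"client": "frontend", "server": "cart", "rate": 8.0},
        {"client": "cart", "server": "redis", "rate": 4.0},
    ],
    failed_samples=[{"client": "frontend", "server": "cart", "rate": 2.0}],
)
print(edges)
```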

Multi-Tenancy

Per-backend tenant header injection (X-Scope-OrgID)
Cross-tenant queries via pipe-separated tenant IDs
Tenant ID validation (max 150 bytes, restricted charset)
Graceful handling of single-tenant and multi-tenant backends
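
The tenant rules above could be checked along these lines. This is a sketch under assumptions: `validate_tenant` is a hypothetical helper, the allowed character set is alphanumerics plus `!-_.*'()`, and the 150-byte limit is applied to each pipe-separated ID rather than to the whole string, which may not match the server's behaviour.

```python
import re

MAX_TENANT_BYTES = 150
_TENANT_RE = re.compile(r"[A-Za-z0-9!\-_.*'()]+")


def validate_tenant(tenant: str) -> list[str]:
    """Split a pipe-separated tenant string and validate each tenant ID."""
    ids = tenant.split("|")
    for tenant_id in ids:
        if not tenant_id:
            raise ValueError("empty tenant ID")
        # Assumption: the length limit applies per tenant ID.
        if len(tenant_id.encode("utf-8")) > MAX_TENANT_BYTES:
            raise ValueError(f"tenant ID longer than {MAX_TENANT_BYTES} bytes")
        if not _TENANT_RE.fullmatch(tenant_id):
            raise ValueError(f"tenant ID has unsupported characters: {tenant_id!r}")
    return ids


print(validate_tenant("team-a|team-b"))
```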

Production-Ready Middleware

Response limiting (100KB max), rate limiting (10 req/s, burst 20)
Response caching (10s for tools, 30s for resources, 5min for listings)
Structured logging, error handling, timing
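
A rate limit of 10 requests per second with a burst of 20 is commonly implemented as a token bucket. The sketch below shows that general technique; it is not taken from the server's middleware, and the `TokenBucket` class is a hypothetical name.

```python
import time
from typing import Callable


class TokenBucket:
    """Token bucket: refills at `rate` tokens/s, holds at most `burst` tokens."""

    def __init__(
        self,
        rate: float = 10.0,
        burst: int = 20,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate
        self.capacity = float(burst)
        self.tokens = float(burst)  # Start full so an initial burst is allowed.
        self.clock = clock
        self.updated = clock()

    def allow(self) -> bool:
        """Consume one token if available; return whether the call may proceed."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket()
accepted = sum(bucket.allow() for _ in range(25))
print(f"accepted {accepted} of 25 back-to-back requests")
```

The injectable `clock` parameter is there so the limiter can be tested deterministically without sleeping.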

Architecture

                    ┌─────────────────────────┐
                    │     MCP Client          │
                    │ (Claude, Cline, Cursor) │
                    └──────────┬──────────────┘
                               │
                    ┌──────────▼──────────────┐
                    │   FastMCP Server Core   │
                    │  (HTTP / SSE / stdio)   │
                    │  + Middleware Stack     │
                    └──────────┬──────────────┘
                               │
       ┌────────────┬──────────┼──────────┬────────────┐
       │            │          │          │            │
  ┌────▼────┐ ┌────▼────┐ ┌───▼───┐ ┌────▼────┐ ┌────▼────┐
  │  Tools  │ │Resources│ │Prompts│ │  Utils  │ │ Models  │
  │  (23)   │ │  (11)   │ │  (5)  │ │         │ │         │
  └────┬────┘ └────┬────┘ └───────┘ └─────────┘ └─────────┘
       │            │
       └──────┬─────┘
              │
   ┌──────────▼───────────┐
   │    Service Layer     │
   │                      │
   │   tempo_service      │
   │   kubernetes_service │
   └──────────┬───────────┘
              │
   ┌──────────▼───────────┐
   │  Tempo HTTP API      │
   │  + K8s Discovery     │
   └──────────────────────┘

How it works:

An AI assistant connects via HTTP, SSE, or stdio.
The AI loads tempo://system/backends resource to discover available Tempo backends and their health.
Tools interact with Tempo's HTTP API to search traces, compute metrics, and run diagnostics.
The service layer (tempo_service) handles HTTP calls with connection pooling, tenant injection, and LLM format negotiation.
Optional Kubernetes discovery (kubernetes_service) finds Tempo services via labels and Tempo Operator CRDs.
Middleware enforces rate limiting, response size caps, caching, and structured logging.

Table of Contents

Why Tempo MCP Server?
Key Features
Architecture
Tech Stack
Getting Started
Configuration
Available Tools
Available Resources
Available Prompts
Usage
Project Structure
Roadmap
Contributing
FAQ
Troubleshooting
Security Considerations
License
Contact
Acknowledgments

Tech Stack

Category	Technologies
Language	Python 3.12+
MCP Framework	FastMCP ≥2.13.3
Protocol	Model Context Protocol (MCP)
Tracing Backend	Grafana Tempo HTTP API
HTTP Client	httpx — async, connection pooling
Kubernetes	Python K8s Client · Tempo Operator CRDs
Transport	HTTP · SSE · Streamable-HTTP · stdio
Infrastructure	Docker · uv

Getting Started

Prerequisites

Docker (recommended) or Python 3.12+ (for local dev)
Grafana Tempo backend accessible via HTTP (monolithic or microservices mode)
Optionally: Kubernetes with the Tempo Operator for auto-discovery

Quick Start with Docker (recommended)

docker run --rm -it \
  -p 8768:8768 \
  -e MCP_TRANSPORT=http \
  -e TEMPO_BASE_URL=http://host.docker.internal:3200 \
  talkopsai/tempo-mcp-server:latest

The server is now listening on http://localhost:8768/mcp.

Point your MCP client at it:

{
  "mcpServers": {
    "tempo": {
      "url": "http://localhost:8768/mcp",
      "description": "MCP Server for Grafana Tempo distributed tracing"
    }
  }
}

From Source (Python)

Install uv for dependency management.
Clone and set up:

git clone https://github.com/talkops-ai/talkops-mcp.git
cd talkops-mcp/src/tempo-mcp-server
uv venv
source .venv/bin/activate
uv pip install -e ".[dev]"

Configure your .env:

TEMPO_BASE_URL=http://localhost:3200
MCP_TRANSPORT=http
MCP_LOG_LEVEL=INFO

Run the server:

uv run tempo-mcp-server

Or, with the venv activated: tempo-mcp-server.

Run tests:

source .venv/bin/activate
pytest tests/

Configuration

All configuration is via environment variables (loaded from .env via python-dotenv).

Server Configuration

Variable	Default	Description
`MCP_SERVER_NAME`	`tempo-mcp-server`	Server name identifier
`MCP_SERVER_VERSION`	`0.1.0`	Server version string
`MCP_TRANSPORT`	`stdio`	Transport mode: `http`, `sse`, `streamable-http`, or `stdio`
`MCP_HOST`	`0.0.0.0`	Host address for HTTP server
`MCP_PORT`	`8768`	Port for HTTP server
`MCP_PATH`	`/mcp`	MCP endpoint path
`MCP_LOG_LEVEL`	`INFO`	Log level: `DEBUG`, `INFO`, `WARNING`, `ERROR`
`MCP_LOG_FORMAT`	`json`	Log format: `json` or `text`
`MCP_HTTP_TIMEOUT`	`300`	HTTP server timeout (seconds)
`MCP_HTTP_KEEPALIVE_TIMEOUT`	`5`	HTTP keepalive timeout (seconds)
`MCP_HTTP_CONNECT_TIMEOUT`	`60`	HTTP connect timeout (seconds)

Tempo Backend (Single Backend Mode)

Variable	Default	Description
`TEMPO_BASE_URL`	`http://localhost:3200`	Tempo HTTP API base URL
`TEMPO_BACKEND_ID`	`default`	Backend identifier
`TEMPO_DISPLAY_NAME`	(empty)	Human-readable backend name
`TEMPO_TYPE`	`tempo`	Backend type: `tempo`, `tempo-gateway`, `unknown`
`TEMPO_DEPLOYMENT_MODE`	`unknown`	Deployment mode: `monolithic`, `microservices`, `unknown`
`TEMPO_AUTH_HEADER`	(empty)	Authorization header value (e.g., `Bearer <token>`)
`TEMPO_VERIFY_SSL`	`true`	Verify SSL certificates
`TEMPO_TIMEOUT`	`30`	HTTP timeout per request (seconds)

Tempo Backend (Multi-Backend Mode)

Variable	Default	Description
`TEMPO_BACKENDS`	(empty)	JSON array of backend configs (overrides single backend). See `.env.example`.

Multi-Tenancy

Variable	Default	Description
`TEMPO_MULTI_TENANT`	`false`	Enable multi-tenant mode for the backend
`TEMPO_DEFAULT_TENANT`	(empty)	Default tenant ID (required if `TEMPO_MULTI_TENANT=true`)
`TEMPO_TENANT_HEADER`	`X-Scope-OrgID`	HTTP header name for tenant ID injection

Query Policies / Guardrails

Variable	Default	Description
`TEMPO_MAX_LOOKBACK`	`168h`	Maximum query lookback (7 days)
`TEMPO_DEFAULT_SEARCH_LIMIT`	`20`	Default max traces per search
`TEMPO_MAX_SEARCH_LIMIT`	`100`	Absolute max traces per search
`TEMPO_DEFAULT_SPSS`	`3`	Default spans per span-set
`TEMPO_MAX_SPSS`	`10`	Maximum spans per span-set
`TEMPO_REQUIRE_TIME_RANGE`	`true`	Require time range on searches
`TEMPO_REQUIRE_FILTER_OR_QUERY`	`true`	Require at least one filter or TraceQL query
`TEMPO_DEFAULT_METRICS_SAMPLING`	(empty)	Default metrics sampling rate (e.g., `fixed-span:0.1`)
`TEMPO_MAX_METRICS_DURATION`	`3h`	Maximum allowed metrics query time range. Should match Tempo's `query_frontend.metrics.max_duration`.

LLM Format

Variable	Default	Description
`TEMPO_LLM_FORMAT`	`true`	Enable LLM-optimized trace format (Tempo 2.9+ `application/vnd.grafana.llm`)

Kubernetes Discovery

Variable	Default	Description
`K8S_ENABLED`	`false`	Enable Kubernetes-based Tempo backend discovery
`K8S_CONTEXT`	(empty)	Specific kubeconfig context to use
`K8S_IN_CLUSTER`	`false`	Set `true` when running inside a Kubernetes pod

Tempo Operator CRD

Variable	Default	Description
`TEMPO_CRD_GROUP`	`tempo.grafana.com`	Tempo Operator CRD API group
`TEMPO_CRD_API_VERSION`	`v1alpha1`	CRD API version (change when Operator graduates to v1)

Available Tools

Discovery

Tool	Description
`tempo_list_backends`	List all configured Tempo backends with health status (ready/not_ready). Use this first to discover available backends.
`tempo_get_backend`	Get detailed profile for a specific backend: health, version, build info, capabilities, deployment mode, tenant requirements, and component service statuses.
`tempo_get_query_policies`	Get query guardrails and default search parameters: max lookback, search limits, SPSS limits, and time range requirements.

Schema Discovery

Tool	Description
`tempo_get_attribute_names`	Discover available trace attribute names from a Tempo backend, grouped by scope (resource, span, intrinsic, event, link, instrumentation). Supports time-window scoping.
`tempo_get_attribute_values`	Get distinct values for a specific trace attribute. Useful for understanding data distribution and building dynamic filters. Supports TraceQL scoping.
`tempo_get_k8s_attribute_map`	Get the canonical mapping between Kubernetes concepts (namespace, pod, deployment) and their OTel/Tempo attribute names. Optionally validates against a live backend's tag list.

Search & Retrieval

Tool	Description
`tempo_traceql_search`	HIGH-INTENT: Search for traces using raw TraceQL or K8s-friendly filters (namespace, service, deployment, cluster, status, duration). Auto-translates filters to TraceQL, enforces query guardrails, and returns compact summaries.
`tempo_get_trace`	Retrieve a single trace by ID with LLM-optimized format. Attempts `application/vnd.grafana.llm` first, falls back to standard OTLP JSON.
`tempo_query_a2ui`	Retrieve a trace heavily optimized and structured for A2UI rendering. DAG-aware pruning enforces payload limits while preserving critical paths and parent-child linkages.
`tempo_summarize_trace`	HIGH-INTENT: Generate an intelligent summary of a trace — critical path extraction, error detection, suspected root cause, K8s context, time gap detection (wall-clock vs. critical path disambiguation), and recommended next queries. Primary analysis primitive.
`tempo_find_related_traces`	HIGH-INTENT: Find traces related to a seed trace using correlation strategies: `same_service_errors`, `same_endpoint`, or `temporal_neighbors`. One call replaces manual multi-step correlation.

Metrics

Tool	Description
`tempo_traceql_metrics_range`	Execute a TraceQL metrics range query. Returns Prometheus-compatible time series (matrix). Use for RED metrics, trend analysis, and SLO calculations. Supports `rate()`, `count_over_time()`, `quantile_over_time()`, etc.
`tempo_traceql_metrics_instant`	Execute a TraceQL metrics instant query. Returns point-in-time metrics (vector).

Cross-Pillar Pivots

Tool	Description
`tempo_get_exemplar_traces`	Pivot from aggregated metrics to concrete traces. Extracts exemplar trace IDs from a TraceQL metrics query result.
`tempo_get_trace_from_log`	Extract a trace ID from a log line (supports multiple formats) and retrieve + summarize the associated trace. One call replaces parse → fetch → analyze.

Diagnostics

Tool	Description
`tempo_get_diagnostics`	HIGH-INTENT: Comprehensive backend diagnostics. Aggregates health check, build info, component service status, and ring member health into a curated report with severity-ranked findings and suggested actions.

Topology

Tool	Description
`tempo_get_service_dependencies`	Map service dependencies from Tempo's metrics-generator service graph data. Returns nodes and edges with request rates. Supports service-focused filtering.

Operator CRD Management

Tool	Description
`tempo_list_operator_crs`	List Tempo Operator custom resources (TempoStack, TempoMonolithic) across namespaces with status. Read-only.
`tempo_get_operator_cr`	Get a Tempo Operator CR with full spec, status, conditions, and storage configuration. Read-only.
`tempo_create_operator_cr`	Create a TempoStack or TempoMonolithic CR. Generates complete CRD manifest with storage, retention, resources. dry_run=True by default.
`tempo_patch_operator_cr`	Patch specific fields of an existing Tempo Operator CR (retention, resources, search). dry_run=True by default.

Trace Comparison

Tool	Description
`tempo_compare_traces`	HIGH-INTENT: Compare two traces and report structural + timing + error + attribute differences. 5-dimensional diff: services, span counts, durations, errors, attributes.

Alerting Expression Generator

Tool	Description
`tempo_generate_alerting_expression`	Generate PromQL alerting expressions from trace patterns using spanmetrics. Returns ready-to-paste PrometheusRule YAML. Cross-MCP workflow: pass output to prom_upsert_rule_group.

Available Resources

Dynamic Resources

Resource URI	Description
`tempo://system/backends`	All configured Tempo backends with health status
`tempo://system/backends/{backend_id}`	Detailed profile for a specific Tempo backend
`tempo://deployment/overview`	Deployment topology: backends, modes, tenants, K8s integration status

Reference Resources (Static)

Resource URI	Description
`tempo://reference/traceql`	TraceQL syntax reference: selectors, operators, intrinsics, scoped attributes, structural queries, examples
`tempo://reference/traceql-metrics`	TraceQL metrics functions: rate, count_over_time, quantile, histogram, grouping, aggregations, sampling
`tempo://reference/k8s-attributes`	Canonical K8s-to-Tempo attribute mapping for Kubernetes observability
`tempo://reference/query-policies`	Query guardrails, limits, continuation strategy, and safety guidelines (dynamically populated from config)

Runbook Resources

Resource URI	Description
`tempo://runbooks/latency-spike`	Step-by-step runbook for investigating latency spikes: detect → locate → analyze → correlate → root cause
`tempo://runbooks/error-burst`	Step-by-step runbook for investigating error bursts: quantify → search → triage → correlate
`tempo://runbooks/no-traces-found`	Diagnostic runbook for "no traces found" scenarios: backend health → data existence → scope checks → ingestion
`tempo://runbooks/cross-tenant-access`	Runbook for cross-tenant query configuration, usage, and constraints

Example Resources

Resource URI	Description
`tempo://examples/common-queries`	Common TraceQL and metrics query examples for quick reference: service exploration, error investigation, performance analysis, structural queries, metrics queries

Available Prompts

Guided workflow prompts that orchestrate multiple tools into step-by-step journeys:

Prompt Name	Description	Parameters
`tempo-error-triage`	Guided 4-phase error triage: quantify impact (error rate vs. baseline), find error traces, analyze root cause via summarization + correlation, contextualize with diagnostics	`backend_id`, `service`, `namespace`
`tempo-latency-investigation`	Guided 4-phase latency investigation: confirm spike (P99 trend), find slow traces above threshold, critical path analysis via summarization, compare with normal traces	`backend_id`, `service`, `threshold_ms`
`tempo-missing-traces`	Guided 4-phase diagnostic for "no traces found": verify backend health, verify data exists (attribute names, broadest search), check scope (tenant, namespace, service), consult runbook	`backend_id`, `service`
`tempo-traceql-builder`	Interactive TraceQL query construction: parse user intent, discover available attributes, construct query using reference, execute, and refine	`backend_id`, `intent`
`tempo-metrics-first-triage`	RED metrics-first triage for a service: rate, error rate, P99 duration, investigate anomalies, deep dive into individual traces	`backend_id`, `service`

Usage

Supported workflows with prompt examples and links to detailed guides:

Workflow	Prompt Example	Documentation
Error Triage	`"Triage errors for the 'checkout-service' in the 'production' namespace using backend 'prod'."`	TEMPO_ERROR_TRIAGE_TEST_GUIDE.md
Latency Investigation	`"Investigate latency spikes above 500ms for 'api-gateway' using backend 'prod'."`	TEMPO_LATENCY_INVESTIGATION_TEST_GUIDE.md
Missing Traces	`"No traces found for 'payment-service' — diagnose the issue on backend 'prod'."`	TEMPO_MISSING_TRACES_TEST_GUIDE.md
TraceQL Builder	`"Build a TraceQL query to find slow database calls over 100ms in the frontend."`	TEMPO_TRACEQL_BUILDER_TEST_GUIDE.md
Metrics-First Triage	`"Run a RED analysis for 'order-service' over the last 6 hours."`	TEMPO_METRICS_FIRST_TRIAGE_TEST_GUIDE.md

Project Structure

tempo-mcp-server/
├── tempo_mcp_server/              # Main package
│   ├── tools/                     # MCP Tools (10 tool groups, 23 tools)
│   │   ├── discovery/             # Backend listing, inspection, query policies
│   │   │   └── discovery_tools.py # 3 tools: list_backends, get_backend, get_query_policies
│   │   ├── schema/                # Attribute/tag discovery
│   │   │   └── schema_tools.py    # 3 tools: get_attribute_names, get_attribute_values, get_k8s_attribute_map
│   │   ├── search/                # Trace search & retrieval
│   │   │   └── search_tools.py    # 5 tools: traceql_search, get_trace, query_a2ui, summarize_trace, find_related_traces
│   │   ├── metrics/               # TraceQL metrics queries
│   │   │   └── metrics_tools.py   # 2 tools: metrics_range, metrics_instant
│   │   ├── pivot/                 # Cross-pillar correlation
│   │   │   └── pivot_tools.py     # 2 tools: get_exemplar_traces, get_trace_from_log
│   │   ├── diagnostics/           # Backend health & diagnostics
│   │   │   └── diagnostics_tools.py # 1 tool: get_diagnostics
│   │   ├── topology/              # Service dependency mapping
│   │   │   └── topology_tools.py  # 1 tool: get_service_dependencies
│   │   ├── operator/              # Tempo Operator CRD lifecycle
│   │   │   └── operator_tools.py  # 4 tools: list_operator_crs, get_operator_cr, create_operator_cr, patch_operator_cr
│   │   ├── comparison/            # Trace comparison
│   │   │   └── comparison_tools.py # 1 tool: compare_traces
│   │   └── alerting/              # Alerting expression generation
│   │       └── alerting_tools.py  # 1 tool: generate_alerting_expression
│   ├── resources/                 # MCP Resources (11 URIs)
│   │   ├── backend_resources.py   # Dynamic: backends listing, backend detail
│   │   ├── deployment_resources.py # Dynamic: deployment overview
│   │   ├── reference_resources.py # Static: TraceQL, metrics, K8s attributes, query policies
│   │   ├── runbook_resources.py   # Static: latency spike, error burst, no traces, cross-tenant
│   │   └── examples_resources.py  # Static: common TraceQL query examples
│   ├── prompts/                   # MCP Prompts (5 guided workflows)
│   │   ├── query_prompts.py       # TraceQL builder, metrics-first triage
│   │   └── troubleshooting_prompts.py # Error triage, latency investigation, missing traces
│   ├── services/                  # Business logic
│   │   ├── tempo_service.py       # Async HTTP client: all Tempo API calls, tenant injection,
│   │   │                          # LLM format negotiation, connection pooling
│   │   └── kubernetes_service.py  # K8s discovery & CRD management: service labels, Tempo Operator CRDs,
│   │                              # create/patch TempoStack/TempoMonolithic
│   ├── server/                    # FastMCP server setup
│   │   ├── core.py                # Server creation & instructions loading
│   │   ├── bootstrap.py           # Component initialization & DI
│   │   └── middleware.py          # 7-layer middleware stack
│   ├── models/                    # Pydantic data models
│   │   ├── search.py              # SearchFilters, trace response models
│   │   ├── schema.py              # Attribute scope definitions
│   │   ├── backend.py             # Backend config models
│   │   ├── trace.py               # Trace summary models
│   │   ├── metrics.py             # Metrics response models
│   │   ├── pivot.py               # Pivot response models
│   │   ├── topology.py            # Topology models
│   │   ├── diagnostics.py         # Diagnostics models
│   │   ├── operator.py            # Tempo Operator CRD models
│   │   └── comparison.py          # Trace comparison models
│   ├── utils/                     # Helpers
│   │   ├── traceql_helpers.py     # TraceQL construction, validation, K8s attribute mapping
│   │   ├── trace_summarizer.py    # Critical path extraction, error detection, headline generation
│   │   ├── trace_differ.py        # 5-dimensional trace diff engine
│   │   ├── trace_id_extractor.py  # Regex-based trace ID parsing from log lines
│   │   └── time_helpers.py        # Relative time parsing (1h, 24h, 7d → Unix epoch)
│   ├── static/                    # Static data files
│   │   └── TEMPO_MCP_INSTRUCTIONS.md  # MCP system instructions for AI agents
│   ├── exceptions/                # Custom exception hierarchy
│   │   └── custom.py              # TempoOperationError, TempoQueryError, TempoTenantError, etc.
│   ├── config.py                  # Environment parsing & config dataclasses
│   └── main.py                    # Entry point & CLI
├── tests/                         # Test suites
│   ├── unit/                      # Unit tests (deterministic, mocked)
│   ├── integration/               # In-memory MCP integration tests
│   ├── fixtures/                  # Test fixtures (JSON responses)
│   └── conftest.py                # Shared test configuration
├── docs/                          # Documentation & test guides
├── pyproject.toml                 # Package definition (Python 3.12)
├── Dockerfile                     # Docker build
└── README.md                      # This documentation

Roadmap

Shipped in this release:

TraceQL search with K8s-friendly filters and query guardrails
Intelligent trace summarization (critical path, error detection, root cause)
Related trace discovery via correlation strategies
Attribute name/value discovery with scope filtering and time-window scoping
K8s-to-Tempo canonical attribute mapping with live validation
TraceQL metrics: range and instant queries with Prometheus-compatible output
Metrics-to-traces exemplar pivot
Logs-to-traces pivot (multi-format trace ID extraction)
Comprehensive backend diagnostics (readiness, build info, services, rings)
Service topology mapping from metrics-generator data
Multi-tenancy with tenant validation and cross-tenant support
5 guided workflow prompts (error triage, latency, missing traces, TraceQL builder, RED triage)
11 MCP resources (dynamic backends, static references, runbooks, examples)
7-layer middleware stack (including error handling, response limiting, rate limiting, caching, logging, and timing)
Tempo Operator CRD management (list/get/create/patch TempoStack & TempoMonolithic)
Trace comparison (diff two traces by ID — 5-dimensional structural analysis)
Alerting expression generator (PromQL from trace patterns → cross-MCP workflow with Prometheus server)

Coming next:

Multi-cluster support
Trace diff visualization (HTML/Mermaid output for trace comparison)
Batch trace analysis (compare N traces, detect outliers)
Custom TraceQL metrics function library

See open issues for the full list of proposed features.

Contributing

Contributions are welcome. The process is straightforward:

Fork the repo
Create a branch (git checkout -b feature/TraceComparison)
Make your changes and commit
Push and open a PR

If you're considering something bigger, open an issue first so we can align on the approach.

FAQ

Which MCP clients work with this?

Any MCP-compatible client works, including Claude Desktop, Cline, Cursor, and custom clients. Connect via http://localhost:8768/mcp for HTTP transport, or configure stdio for direct process communication.

Does this require Grafana Tempo?

Yes. The server communicates with Tempo's HTTP API (/api/search, /api/v2/traces/{traceID}, /api/v2/search/tags, /api/metrics/query_range, etc.). Any Grafana Tempo deployment (monolithic, microservices, or via the Tempo Operator) will work. The LLM-optimized trace format requires Tempo 2.9+.

Does this modify my cluster or Tempo backend?

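Not by default. The query, metrics, diagnostics, and topology tools only perform HTTP GET requests against Tempo's query APIs, so no traces, metrics, or Tempo configurations are created, modified, or deleted. The two exceptions are tempo_create_operator_cr and tempo_patch_operator_cr, which can create or patch Tempo Operator custom resources in the cluster; both run with dry_run=True by default.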

Can I use multiple Tempo backends?

Yes. Set the TEMPO_BACKENDS environment variable to a JSON array of backend configurations. Each backend gets its own ID, base URL, tenant settings, and auth header. All tools accept a backend_id parameter to target a specific backend. See .env.example for the format.

How does multi-tenancy work?

For multi-tenant Tempo deployments, set TEMPO_MULTI_TENANT=true and TEMPO_DEFAULT_TENANT. The server injects the X-Scope-OrgID header (configurable via TEMPO_TENANT_HEADER) on every request. Tools accept an optional tenant parameter to override the default. For cross-tenant queries, use pipe-separated values (e.g., tenant="team-a|team-b"). Tenant IDs are validated: max 150 bytes, alphanumeric + !-_.*'().

What is the LLM trace format?

Tempo 2.9+ supports an experimental application/vnd.grafana.llm Accept header that returns traces in a compact, LLM-friendly format — optimized for token efficiency when used with AI assistants. The server attempts this format first and automatically falls back to standard OTLP JSON if the backend doesn't support it. Disable with TEMPO_LLM_FORMAT=false.
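
The try-then-fall-back behaviour can be sketched as below. The transport is injected as a plain callable so the example runs without a Tempo backend. Two assumptions are worth flagging: `get_trace` and the `Fetch` signature are hypothetical, and the sketch treats any non-200 response as "LLM format unavailable", which may not be how a real backend signals lack of support.

```python
from typing import Callable

LLM_ACCEPT = "application/vnd.grafana.llm"
JSON_ACCEPT = "application/json"

# Hypothetical transport: takes (path, headers), returns (status_code, body).
Fetch = Callable[[str, dict[str, str]], tuple[int, str]]


def get_trace(trace_id: str, fetch: Fetch, llm_format: bool = True) -> dict[str, str]:
    """Fetch a trace, preferring the LLM format and falling back to JSON."""
    path = f"/api/v2/traces/{trace_id}"
    if llm_format:
        status, body = fetch(path, {"Accept": LLM_ACCEPT})
        if status == 200:
            return {"format": "llm", "body": body}
        # Assumption: a non-200 status means the format is not supported.
    status, body = fetch(path, {"Accept": JSON_ACCEPT})
    if status != 200:
        raise RuntimeError(f"trace fetch failed with status {status}")
    return {"format": "json", "body": body}


def older_backend(path: str, headers: dict[str, str]) -> tuple[int, str]:
    """Fake backend that only understands the standard JSON format."""
    if headers["Accept"] == LLM_ACCEPT:
        return 406, ""
    return 200, '{"batches": []}'


print(get_trace("4bf92f3577b34da6a3ce929d0e0e4736", older_backend))
```

Passing `llm_format=False` skips the first request entirely, which corresponds to setting TEMPO_LLM_FORMAT=false.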

Can I use this without Kubernetes?

Yes. Set K8S_ENABLED=false (the default). All tools work against Tempo's HTTP API directly — Kubernetes is only needed for auto-discovery of Tempo backends via service labels or Tempo Operator CRDs. Configure your backend URL(s) via TEMPO_BASE_URL or TEMPO_BACKENDS.

What are query guardrails?

The server enforces configurable safety limits to prevent unbounded queries: time range is required by default (TEMPO_REQUIRE_TIME_RANGE=true), search results are capped (TEMPO_MAX_SEARCH_LIMIT=100), SPSS is bounded (TEMPO_MAX_SPSS=10), and at least one filter or query is required (TEMPO_REQUIRE_FILTER_OR_QUERY=true). These protect both the AI agent's context window and the Tempo backend.

Troubleshooting

Tempo Connection Issues

Verify TEMPO_BASE_URL points to an accessible Tempo HTTP endpoint (default port: 3200).
Load the tempo://system/backends resource to check backend health.
Run tempo_get_diagnostics(backend_id="default") for detailed health analysis.
For Tempo behind a load balancer or gateway, verify the base URL routes to the query-frontend.
For authenticated backends, set TEMPO_AUTH_HEADER (e.g., Bearer <token>).

No Traces Found

Run tempo_get_attribute_names(backend_id="default", since="1h") to verify data exists.
Broaden the time range: try since="24h" or since="7d".
Start with the broadest possible query: tempo_traceql_search(backend_id="default", since="24h", limit=5).
For multi-tenant backends, verify the correct tenant parameter is being passed.
Load the tempo://runbooks/no-traces-found resource for a full diagnostic walkthrough.
Check that data is flowing through your ingestion pipeline (OTel Collector → Tempo).

TraceQL Metrics Not Working

TraceQL metrics require Tempo's metrics-generator with the local-blocks processor enabled.
Run tempo_get_diagnostics(backend_id="default") to check backend capabilities.
Verify the metrics-generator is configured in your Tempo deployment.

Kubernetes Discovery Not Finding Backends

Ensure K8S_ENABLED=true in your .env.
Verify your kubeconfig is accessible and the correct context is set.
Tempo services must have the label app.kubernetes.io/name=tempo for label-based discovery.
For Tempo Operator discovery, ensure TempoStack or TempoMonolithic CRDs exist in the cluster.
For in-cluster deployment, set K8S_IN_CLUSTER=true.

Diagnostics Reporting False-Positive Ring Errors (404)

If tempo_get_diagnostics reports 404 Not Found for ring endpoints (e.g., /distributor/ring, /ingester/ring), your TEMPO_BASE_URL likely points to a Tempo Gateway or Query-Frontend in a distributed/microservices deployment.
Gateways generally do not proxy internal diagnostic ring endpoints, which only exist on the specific backend pods.
Fix: Set TEMPO_DEPLOYMENT_MODE=unknown (the default) in your .env, or make sure it is not overridden with another value. In this mode the MCP server skips ring checks and relies only on /status/services, which prevents false-positive degraded health states while still validating core component availability.
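The behavior described above can be pictured with a small sketch. This is an illustration of the idea only, not the server's implementation: endpoints_to_check is a hypothetical function, and "distributed" in the usage note is a made-up example value, since the other accepted values of TEMPO_DEPLOYMENT_MODE are not listed in this section.

```python
from typing import List

RING_ENDPOINTS = ("/distributor/ring", "/ingester/ring")
SERVICES_ENDPOINT = "/status/services"


def endpoints_to_check(deployment_mode: str = "unknown") -> List[str]:
    """Pick health endpoints to probe; skip ring endpoints in 'unknown' mode."""
    if deployment_mode.strip().lower() == "unknown":
        # Behind a gateway the ring endpoints may return 404, so avoid them.
        return [SERVICES_ENDPOINT]
    return [SERVICES_ENDPOINT, *RING_ENDPOINTS]
```

With the default mode only /status/services is probed; a call such as endpoints_to_check("distributed") would also include the ring endpoints.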

Security Considerations

Never expose the MCP server to the public internet without proper authentication.
All tools are read-only — the server only performs HTTP GET requests against Tempo's query APIs. No data is created, modified, or deleted.
Tenant isolation — in multi-tenant deployments, the server injects tenant headers on every request. Verify that tenant IDs are correctly scoped to prevent cross-tenant data leakage.
Auth headers — if TEMPO_AUTH_HEADER is set, it is included in every request to the backend. Protect this value as a secret.
Query guardrails — the server enforces time range, limit, and filter requirements to prevent unbounded queries. Review and adjust the policy settings for your environment.
Kubernetes credentials — when K8S_ENABLED=true, the server reads Kubernetes service/CRD metadata (read-only). Ensure the service account has minimal RBAC (only get, list on Services and Tempo CRDs).

License

Apache 2.0 — see LICENSE.

Contact

TalkOps AI — github.com/talkops-ai

Project: github.com/talkops-ai/talkops-mcp

Discord: Join the community

Acknowledgments

Model Context Protocol for enabling AI-native tool interfaces.
FastMCP for the Python MCP server framework.
Grafana Tempo for the scalable distributed tracing backend.
Tempo Operator for Kubernetes-native Tempo lifecycle management.
OpenTelemetry for the industry-standard observability framework.

Release history

This version

0.1.5

Jun 23, 2026

Download files

Source Distribution

talkops_tempo_mcp_server-0.1.5.tar.gz (257.7 kB)

Uploaded Jun 23, 2026 Source

Built Distribution

talkops_tempo_mcp_server-0.1.5-py3-none-any.whl (109.4 kB)

Uploaded Jun 23, 2026 Python 3

File details

Details for the file talkops_tempo_mcp_server-0.1.5.tar.gz.

File metadata

Download URL: talkops_tempo_mcp_server-0.1.5.tar.gz
Upload date: Jun 23, 2026
Size: 257.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for talkops_tempo_mcp_server-0.1.5.tar.gz
Algorithm	Hash digest
SHA256	`cc69a1d6f35c640950f5533ec597919a6622817a42145b415da29326026a37a2`
MD5	`9912fe62bbdb740f180f950e087cf311`
BLAKE2b-256	`f498a46db4968f8d9a729533ceff3b1473041bd3db6c08f70110da70279812c3`

Provenance

The following attestation bundles were made for talkops_tempo_mcp_server-0.1.5.tar.gz:

Publisher: release-pypi.yml on talkops-ai/talkops-mcp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: talkops_tempo_mcp_server-0.1.5.tar.gz
- Subject digest: cc69a1d6f35c640950f5533ec597919a6622817a42145b415da29326026a37a2
- Sigstore transparency entry: 1921218657
- Sigstore integration time: Jun 23, 2026
Source repository:
- Permalink: talkops-ai/talkops-mcp@c7e249bccb61b37ec3c2746325c14fe1f0a3cf07
- Branch / Tag: refs/heads/main
- Owner: https://github.com/talkops-ai
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-pypi.yml@c7e249bccb61b37ec3c2746325c14fe1f0a3cf07
- Trigger Event: workflow_dispatch

File details

Details for the file talkops_tempo_mcp_server-0.1.5-py3-none-any.whl.

File metadata

Download URL: talkops_tempo_mcp_server-0.1.5-py3-none-any.whl
Upload date: Jun 23, 2026
Size: 109.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for talkops_tempo_mcp_server-0.1.5-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a06610b3d58db1786015f97548d921ee55a98e5056df40cb8e844d9b71ab0a5e`
MD5	`26440e5f575012ee6874474d1afc9644`
BLAKE2b-256	`74ad4971a10379f5f66b79f67551e660e38a55ac0dd5aa8237accac1daf9ec32`

Provenance

The following attestation bundles were made for talkops_tempo_mcp_server-0.1.5-py3-none-any.whl:

Publisher: release-pypi.yml on talkops-ai/talkops-mcp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: talkops_tempo_mcp_server-0.1.5-py3-none-any.whl
- Subject digest: a06610b3d58db1786015f97548d921ee55a98e5056df40cb8e844d9b71ab0a5e
- Sigstore transparency entry: 1921218763
- Sigstore integration time: Jun 23, 2026
Source repository:
- Permalink: talkops-ai/talkops-mcp@c7e249bccb61b37ec3c2746325c14fe1f0a3cf07
- Branch / Tag: refs/heads/main
- Owner: https://github.com/talkops-ai
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-pypi.yml@c7e249bccb61b37ec3c2746325c14fe1f0a3cf07
- Trigger Event: workflow_dispatch
