talkops-opentelemetry-mcp-server

A Model Context Protocol (MCP) server for OpenTelemetry Kubernetes observability.

Maintainers

talkops


OpenTelemetry MCP Server

An MCP server that gives AI assistants the power to discover, provision, instrument, validate, and govern OpenTelemetry pipelines on Kubernetes — from intent-driven collector provisioning to cardinality control, sampling optimization, and security auditing.


Why OpenTelemetry MCP Server?

The problem: OpenTelemetry is the industry standard for observability, but operating it on Kubernetes is complex. Setting up auto-instrumentation across languages, configuring collector pipelines with the correct processor ordering, tuning sampling strategies, and managing cardinality from SpanMetrics connectors — each is a specialized task. When AI assistants try to help, they hallucinate CRD schemas, mis-order processors, or generate unsafe configs that cause data loss.

The solution: The OpenTelemetry MCP Server gives AI assistants (like Claude, Cline, or Cursor) structured, safe tools to manage the entire OTel lifecycle natively:

Intent-Driven Collector Provisioning: Say "I want traces and metrics in my namespace" and the AI auto-discovers backends (Jaeger, Tempo, Prometheus, Loki, OpenSearch), generates best-practice configs with correct processor ordering, selects the right deployment mode, sizes resources for your cluster — and deploys with dry-run-first safety.
Zero-to-Instrumented Onboarding: The AI looks up language support, creates an Instrumentation CRD, annotates Deployments for auto-instrumentation injection, and verifies the rollout — all with dry-run-first safety.
Pipeline Investigation & Validation: Deep-inspect any OTel Collector's config, validate processor ordering against best practices, audit filelog receiver safety, and check k8sattributes enrichment profiles.
Metric Cardinality Governance: Detect high-cardinality dimensions from SpanMetrics connectors, generate transform processor YAML to drop attributes, and estimate series counts before they explode.
Sampling Strategy Optimization: Cross-reference head sampling (Instrumentation CRDs) with tail sampling (collector config), detect conflicts, and generate config patches to switch strategies.
Security Posture Auditing: Scan eBPF instrumentation pods for privileged mode, SYS_ADMIN capabilities, and hostPID access. Risk-assess the entire observability footprint.

Key Features

Intent-Driven Collector Provisioning

Express what you want (signals, namespace) — the tool auto-discovers everything else
Three-strategy backend discovery: existing collector configs → K8s service name matching → graceful debug fallback
10 built-in backend patterns: Jaeger, Tempo, Zipkin, Prometheus, Thanos, Mimir, VictoriaMetrics, OpenSearch, Elasticsearch, Loki
Best-practice processor chain always enforced: memory_limiter → k8sattributes → resourcedetection → resource → batch
Smart mode selection: DaemonSet for filelog, StatefulSet for Prometheus scraping, Deployment for OTLP
Auto-sizing from cluster scale (node count → small/medium/large resource tiers)
Filelog safety built-in: self-exclusion patterns, namespace scoping, checkpoint storage, start_at=end
SpanMetrics connector wiring with correct traces→connector→metrics pipeline topology
Proactive recommendations ("Consider spanmetrics for RED metrics", "Filelog needs DaemonSet")
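
The mode-selection and auto-sizing rules listed above can be sketched as two small pure functions. This is a minimal illustration under stated assumptions, not the server's implementation: the function names, the precedence applied when several inputs are set, and the node-count thresholds are all hypothetical.

```python
def select_mode(enable_filelog: bool = False, prometheus_scrape: bool = False) -> str:
    """Pick a collector deployment mode from the requested inputs.

    Assumed precedence: filelog wins over Prometheus scraping, because
    node-local log files can only be read by a per-node DaemonSet.
    """
    if enable_filelog:
        return "daemonset"
    if prometheus_scrape:
        return "statefulset"
    return "deployment"  # plain OTLP receiving


def size_tier(node_count: int, small_max: int = 10, medium_max: int = 50) -> str:
    """Map a cluster's node count to a resource tier.

    The thresholds are placeholders chosen for illustration only.
    """
    if node_count < 0:
        raise ValueError("node_count must be non-negative")
    if node_count <= small_max:
        return "small"
    if node_count <= medium_max:
        return "medium"
    return "large"
```

Keeping both decisions as pure functions makes them easy to preview in a dry run before anything is created in the cluster.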

Collector Discovery & Inspection

List and inspect OpenTelemetryCollector CRDs across namespaces with pagination
Full pipeline topology: receivers, processors, exporters, and connectors
Summary and full (raw YAML) detail levels

Service Instrumentation

Language support matrix with framework-specific guidance (Java, Python, Node.js, .NET, Go, Rust)
Create or patch Instrumentation CRDs with sampler, propagators, and per-language images
Annotate Deployments for auto-instrumentation with dry-run-first safety
List instrumented services with annotation status, init container injection, signal detection, and 4-tier language detection (annotations → image patterns → container/deployment names → runtime env vars)

Pipeline Validation

Processor ordering validation (memory_limiter → k8sattributes → batch)
Filelog receiver safety checks (checkpoint storage, self-collection loops, resource detection)
Target Allocator state inspection (allocation strategy, selectors, prometheusCR)
Collector topology recommendations based on signals, workload count, and cluster size
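
As an illustration of what processor ordering validation involves, the sketch below compares a pipeline's processor list against the recommended order documented for otel_validate_k8sattributes_order. The function names and the choice to ignore unknown processors are assumptions; the actual tool may report results differently.

```python
RECOMMENDED_ORDER = [
    "memory_limiter",
    "k8sattributes",
    "resourcedetection",
    "transform",
    "filter",
    "tail_sampling",
    "batch",
]


def base_type(processor_id: str) -> str:
    """Strip the optional '/name' suffix, e.g. 'transform/drop' -> 'transform'."""
    return processor_id.split("/", 1)[0]


def find_order_violations(processors: list[str]) -> list[tuple[str, str]]:
    """Return (earlier, later) pairs that appear in the wrong relative order.

    Processors that are not in RECOMMENDED_ORDER are ignored (an assumption).
    """
    rank = {name: i for i, name in enumerate(RECOMMENDED_ORDER)}
    known = [p for p in processors if base_type(p) in rank]
    violations = []
    for i, earlier in enumerate(known):
        for later in known[i + 1:]:
            if rank[base_type(earlier)] > rank[base_type(later)]:
                violations.append((earlier, later))
    return violations
```

An empty result means the pipeline respects the recommended order; each returned pair names two processors that should be swapped.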

Metric Cardinality Governance

SpanMetrics dimension analysis with series count estimation
Histogram bucket count auditing
Transform processor YAML generation for dropping attributes
Existing remediation detection (transform processors already in config)
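
To make series count estimation concrete, here is a rough worst-case calculation for one SpanMetrics duration histogram exposed in classic Prometheus format. It assumes every combination of dimension values actually occurs, which overestimates real traffic. The function name and inputs are hypothetical; the server's own estimator may use a different model.

```python
from math import prod


def estimate_histogram_series(distinct_values: dict[str, int], explicit_buckets: int) -> int:
    """Upper bound on time series for one histogram metric.

    Each label combination yields one series per explicit bucket, one for
    the +Inf bucket, plus the _sum and _count series.
    """
    combinations = prod(distinct_values.values())  # 1 when there are no dimensions
    series_per_combination = explicit_buckets + 1 + 2
    return combinations * series_per_combination
```

For example, 20 services, 50 routes, and 4 methods with 10 explicit buckets give 4,000 combinations and 52,000 series. Adding a dimension with 1,000 distinct values multiplies that to 52,000,000, which is why per-user or per-request attributes are the usual candidates for dropping.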

Sampling Management

Holistic sampling view: head (Instrumentation CRD) + tail (collector config)
Conflict detection (head and tail sampling configured at the same time)
Config patch generation for switching between head, tail, or none
Tail sampling policy templates (error-sampling, slow-traces, probabilistic-fallback)
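
The head/tail conflict check can be sketched as follows. The sketch assumes the head sampler is given as the sampler block of an Instrumentation CRD (a type plus a string argument) and that tail sampling is identified by a tail_sampling processor in the collector pipeline. The function names are hypothetical and the real tool may apply additional rules.

```python
from typing import Optional

RATIO_SAMPLERS = {"traceidratio", "parentbased_traceidratio"}
ALWAYS_OFF_SAMPLERS = {"always_off", "parentbased_always_off"}


def head_sampling_drops_spans(sampler: Optional[dict]) -> bool:
    """True when the SDK-side sampler can discard traces before export."""
    if not sampler:
        return False
    kind = sampler.get("type", "")
    if kind in ALWAYS_OFF_SAMPLERS:
        return True
    if kind in RATIO_SAMPLERS:
        # The CRD stores the ratio as a string such as "0.25".
        return float(sampler.get("argument", "1")) < 1.0
    return False


def has_tail_sampling(processors: list[str]) -> bool:
    """True when any processor in the pipeline is a tail_sampling instance."""
    return any(p.split("/", 1)[0] == "tail_sampling" for p in processors)


def detect_conflict(sampler: Optional[dict], processors: list[str]) -> bool:
    """Head sampling that drops traces starves tail sampling of complete data."""
    return head_sampling_drops_spans(sampler) and has_tail_sampling(processors)
```

The reasoning behind the check: tail sampling decides on whole traces, so if the SDK has already discarded a fraction of them, policies such as error sampling only ever see the surviving subset.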

SpanMetrics Connector

Inspect existing SpanMetrics configuration (dimensions, histograms, pipeline wiring)
Generate SpanMetrics enablement YAML with custom dimensions and bucket boundaries
Cardinality warnings for high-dimension configurations
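
A minimal sketch of the configuration fragment that SpanMetrics enablement produces, written as a Python dictionary that mirrors the collector YAML. The function name is hypothetical, and the fragment shows only the connector wiring: a real pipeline also lists its own receivers, processors, and exporters.

```python
def spanmetrics_fragment(dimensions: list[str], buckets: list[str]) -> dict:
    """Build a collector config fragment that wires the spanmetrics connector.

    The connector is listed as an exporter of the traces pipeline and as a
    receiver of the metrics pipeline, which is what links the two.
    """
    return {
        "connectors": {
            "spanmetrics": {
                "histogram": {"explicit": {"buckets": list(buckets)}},
                "dimensions": [{"name": name} for name in dimensions],
            }
        },
        "service": {
            "pipelines": {
                "traces": {"exporters": ["spanmetrics"]},
                "metrics": {"receivers": ["spanmetrics"]},
            }
        },
    }
```

Every entry added to dimensions becomes a label on the generated metrics, so the cardinality estimate grows multiplicatively with each one.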

Security Auditing

eBPF agent discovery (OpenTelemetry eBPF, Grafana Beyla)
Security context analysis: privileged mode, hostPID, capabilities, host volume mounts
Risk assessment with prioritized remediation recommendations

Production-Ready Middleware

Response limiting (100KB max), rate limiting (10 req/s, burst 20)
Response caching, structured logging, error handling, timing
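
The README does not say which algorithm enforces the rate limit. A token bucket is one common way to get a steady rate with a burst allowance, and the sketch below uses the documented numbers (10 requests per second, burst of 20) as defaults. The class is illustrative and is not taken from the server's middleware.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `burst`."""

    def __init__(self, rate: float = 10.0, burst: int = 20, clock=time.monotonic):
        self.rate = rate
        self.capacity = float(burst)
        self.tokens = float(burst)  # start full so an initial burst is allowed
        self.clock = clock
        self.last = clock()

    def allow(self) -> bool:
        """Consume one token if available; otherwise reject the request."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

The injectable clock exists only to make the behavior testable without sleeping.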

Architecture

                    ┌─────────────────────────┐
                    │     MCP Client          │
                    │ (Claude, Cline, Cursor) │
                    └──────────┬──────────────┘
                               │
                    ┌──────────▼──────────────┐
                    │   FastMCP Server Core   │
                    │  (HTTP / SSE / stdio)   │
                    │  + Middleware Stack     │
                    └──────────┬──────────────┘
                               │
       ┌───────────┬───────────┼───────────┬───────────┐
       │           │           │           │           │
  ┌────▼────┐ ┌────▼────┐ ┌────▼────┐ ┌────▼────┐ ┌────▼────┐
  │  Tools  │ │Resources│ │ Prompts │ │  Utils  │ │ Models  │
  │ (19)    │ │ (9)     │ │ (5)     │ │         │ │         │
  └────┬────┘ └────┬────┘ └─────────┘ └─────────┘ └─────────┘
       │           │
       └─────┬─────┘
             │
  ┌──────────▼───────────┐
  │    Service Layer     │
  │                      │
  │ kubernetes_service   │
  │ collector_config_svc │
  │ config_builder       │
  └──────────┬───────────┘
             │
  ┌──────────▼──────────┐
  │ Python K8s Client   │
  │ + OTel Operator CRDs│
  └─────────────────────┘

How it works:

An AI assistant connects via HTTP, SSE, or stdio.
The AI loads the otel://system/health resource to check Kubernetes connectivity and CRD availability.
Tools interact with the Kubernetes API to read/write OpenTelemetryCollector and Instrumentation CRDs.
Service layers (kubernetes_service, collector_config_service) handle API calls and config parsing.
Middleware enforces rate limiting, response size caps, caching, and structured logging.

Table of Contents

Why OpenTelemetry MCP Server?
Key Features
Architecture
Tech Stack
Getting Started
Configuration
Available Tools
Available Resources
Available Prompts
Usage
Project Structure
Roadmap
Contributing
FAQ
Troubleshooting
Security Considerations
License
Contact
Acknowledgments

Tech Stack

Category	Technologies
Language	Python 3.12+
MCP Framework	FastMCP ≥2.13.3
Protocol	Model Context Protocol (MCP)
OpenTelemetry	Operator CRDs · Instrumentation CRDs · Collector config
Kubernetes	Python K8s Client · Custom Resources · RBAC
Transport	HTTP · SSE · Streamable-HTTP · stdio
Infrastructure	Docker · uv

Getting Started

Prerequisites

Docker (recommended) or Python 3.12+ (for local dev)
Kubernetes cluster with the OpenTelemetry Operator installed
kubectl configured with access to the target cluster

RBAC Note for Collector Provisioning (otel_provision_collector)

When provisioning collectors with dry_run=False, the tool automatically creates ClusterRole and ClusterRoleBinding resources for the k8sattributes processor. This is necessary because:

The k8sattributes processor enriches telemetry with Kubernetes metadata (pod name, namespace, node, etc.)

It needs get, list, and watch permissions on Pods, ReplicaSets, Namespaces, Nodes, and Jobs

The OTel Operator does NOT auto-create these RBAC resources. Without them, the collector logs "pods is forbidden" errors
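
For reference, a ClusterRole granting the permissions described above could look like the manifest built below. This is a sketch: the role name is supplied by the caller and the exact manifest the tool generates is not shown in this README. The API groups are the standard Kubernetes ones for these resource kinds, and the managed-by label is the one this README says the tool applies.

```python
def k8sattributes_cluster_role(name: str) -> dict:
    """Build a read-only ClusterRole manifest for the k8sattributes processor."""
    read_only = ["get", "list", "watch"]
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRole",
        "metadata": {
            "name": name,
            "labels": {"app.kubernetes.io/managed-by": "talkops-mcp"},
        },
        "rules": [
            # Pods, Namespaces, and Nodes live in the core ("") API group.
            {"apiGroups": [""], "resources": ["pods", "namespaces", "nodes"], "verbs": read_only},
            {"apiGroups": ["apps"], "resources": ["replicasets"], "verbs": read_only},
            {"apiGroups": ["batch"], "resources": ["jobs"], "verbs": read_only},
        ],
    }
```

A matching ClusterRoleBinding would then bind this role to the collector's ServiceAccount.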

Prerequisites for the MCP server's own ServiceAccount:

The ServiceAccount running the MCP server needs create, patch, and get permissions on ClusterRole and ClusterRoleBinding resources

For local development (kubeconfig), your kubectl user typically already has cluster-admin

For in-cluster deployment, grant the MCP server's ServiceAccount a ClusterRole with these rules:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-mcp-server-rbac-manager
rules:
  - apiGroups: ["rbac.authorization.k8s.io"]
    resources: ["clusterroles", "clusterrolebindings"]
    verbs: ["get", "create", "patch"]
See SUPPORTED_BACKENDS.md for the full list of auto-discoverable backends and default ports.

Quick Start with Docker (recommended)

docker run --rm -it \
  -p 8771:8771 \
  -e MCP_TRANSPORT=http \
  -e K8S_IN_CLUSTER=true \
  talkopsai/opentelemetry-mcp-server:latest

The server is now listening on http://localhost:8771/mcp.

Point your MCP client at it:

{
  "mcpServers": {
    "opentelemetry": {
      "url": "http://localhost:8771/mcp",
      "description": "MCP Server for OpenTelemetry Kubernetes observability"
    }
  }
}

From Source (Python)

Install uv for dependency management.
Clone and set up:

git clone https://github.com/talkops-ai/talkops-mcp.git
cd talkops-mcp/src/opentelemetry-mcp-server
uv venv
source .venv/bin/activate
uv pip install -e ".[dev]"

Configure your .env:

MCP_TRANSPORT=http
K8S_ENABLED=true
MCP_LOG_LEVEL=INFO

Run the server:

uv run opentelemetry-mcp-server

Or, with the venv activated: opentelemetry-mcp-server.

Run tests:

source .venv/bin/activate
pytest tests/

Configuration

All configuration is via environment variables (loaded from .env via python-dotenv).

Server Configuration

Variable	Default	Description
`MCP_SERVER_NAME`	`opentelemetry-mcp-server`	Server name identifier
`MCP_SERVER_VERSION`	`0.1.0`	Server version string
`MCP_TRANSPORT`	`stdio`	Transport mode: `http`, `sse`, `streamable-http`, or `stdio`
`MCP_HOST`	`0.0.0.0`	Host address for HTTP server
`MCP_PORT`	`8771`	Port for HTTP server
`MCP_PATH`	`/mcp`	MCP endpoint path
`MCP_LOG_LEVEL`	`INFO`	Log level: `DEBUG`, `INFO`, `WARNING`, `ERROR`
`MCP_LOG_FORMAT`	`json`	Log format: `json` or `text`
`MCP_HTTP_TIMEOUT`	`300`	HTTP server timeout (seconds)
`MCP_HTTP_KEEPALIVE_TIMEOUT`	`5`	HTTP keepalive timeout (seconds)
`MCP_HTTP_CONNECT_TIMEOUT`	`60`	HTTP connect timeout (seconds)
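
As an illustration of how these variables and defaults fit together, the sketch below reads a subset of them from an environment mapping. It is not the project's config.py: the class name, function name, and the transport validation are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping

VALID_TRANSPORTS = {"http", "sse", "streamable-http", "stdio"}


@dataclass(frozen=True)
class ServerConfig:
    transport: str = "stdio"
    host: str = "0.0.0.0"
    port: int = 8771
    path: str = "/mcp"
    log_level: str = "INFO"


def load_server_config(env: Mapping[str, str] = os.environ) -> ServerConfig:
    """Read server settings from environment variables, applying the documented defaults."""
    transport = env.get("MCP_TRANSPORT", "stdio")
    if transport not in VALID_TRANSPORTS:
        raise ValueError(f"unsupported MCP_TRANSPORT: {transport!r}")
    return ServerConfig(
        transport=transport,
        host=env.get("MCP_HOST", "0.0.0.0"),
        port=int(env.get("MCP_PORT", "8771")),
        path=env.get("MCP_PATH", "/mcp"),
        log_level=env.get("MCP_LOG_LEVEL", "INFO").upper(),
    )
```

Passing a plain dictionary instead of os.environ makes it easy to check a configuration before starting the server.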

Kubernetes

Variable	Default	Description
`K8S_IN_CLUSTER`	`false`	Set `true` if running inside a pod
`K8S_ENABLED`	`true`	Enable/disable K8s features entirely

OTel Operator CRD

Variable	Default	Description
`OTEL_CRD_GROUP`	`opentelemetry.io`	API group for OTel CRDs
`OTEL_CRD_API_VERSION`	`v1beta1`	API version for Collector CRDs (`OpenTelemetryCollector`)
`OTEL_INSTRUMENTATION_API_VERSION`	`v1alpha1`	API version for Instrumentation CRDs (separate because Instrumentation CRDs are promoted at a different rate than Collector CRDs)
`OTEL_COLLECTOR_PLURAL`	`opentelemetrycollectors`	Plural name for Collector CRD
`OTEL_INSTRUMENTATION_PLURAL`	`instrumentations`	Plural name for Instrumentation CRD

Target Allocator

Variable	Default	Description
`OTEL_TA_SERVICE_DISCOVERY`	`true`	Enable Target Allocator service discovery
`OTEL_TA_DEFAULT_PORT`	`8080`	Default Target Allocator port

Prometheus Integration

Variable	Default	Description
`PROMETHEUS_BASE_URL`	(empty)	Prometheus HTTP API base URL (for cardinality queries)
`PROMETHEUS_TIMEOUT`	`30`	HTTP timeout for Prometheus API calls (seconds)
`PROMETHEUS_VERIFY_SSL`	`true`	Verify SSL certificates

Language Registry

Variable	Default	Description
`OTEL_LANG_REGISTRY_PATH`	(builtin)	Override path to custom language registry JSON

Available Tools

Discovery

Tool	Description
`otel_list_collectors`	List OpenTelemetryCollector CRDs with namespace filtering, label selectors, and pagination.
`otel_query_a2ui`	Retrieve the status of all OpenTelemetry Collectors and their pipelines, formatted precisely for A2UI Status Datatables. Includes deep pipeline metrics health checking via internal `:8888/metrics` endpoints.
`otel_get_collector`	Get detailed information about a specific collector (pipelines, status, raw YAML config).
`otel_list_instrumented_services`	List workloads in a namespace with their auto-instrumentation status, annotations, init containers, OTEL_* env vars, and detected language (4-tier: annotations → image patterns → container names → runtime env vars like `JAVA_HOME`, `PYTHONPATH`).
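
The 4-tier language detection can be pictured as a chain of fallbacks that stops at the first tier producing an answer. The sketch below is illustrative only: the image patterns, name tokens, and function signature are hypothetical, and only the annotation prefix and the JAVA_HOME and PYTHONPATH hints come from this README.

```python
import re
from typing import Optional

INJECT_PREFIX = "instrumentation.opentelemetry.io/inject-"
# Hypothetical patterns; a real registry would be far more complete.
IMAGE_PATTERNS = {
    "python": re.compile(r"python"),
    "java": re.compile(r"openjdk|temurin"),
    "nodejs": re.compile(r"\bnode\b"),
}
NAME_TOKENS = ("python", "java", "nodejs", "dotnet", "go")
ENV_HINTS = {"JAVA_HOME": "java", "PYTHONPATH": "python"}


def detect_language(annotations: dict, image: str, names: list[str], env: dict) -> tuple[Optional[str], str]:
    """Return (language, tier) using the first tier that yields a match."""
    # Tier 1: explicit auto-instrumentation annotations.
    for key, value in annotations.items():
        if key.startswith(INJECT_PREFIX) and value.lower() != "false":
            language = key[len(INJECT_PREFIX):]
            if language != "sdk":  # inject-sdk does not name a language
                return language, "annotation"
    # Tier 2: container image patterns.
    for language, pattern in IMAGE_PATTERNS.items():
        if pattern.search(image.lower()):
            return language, "image"
    # Tier 3: tokens in container or deployment names.
    for name in names:
        tokens = re.split(r"[-_./]", name.lower())
        for language in NAME_TOKENS:
            if language in tokens:
                return language, "name"
    # Tier 4: runtime environment variables.
    for variable, language in ENV_HINTS.items():
        if variable in env:
            return language, "env"
    return None, "unknown"
```

Returning the tier alongside the language lets a caller judge how much to trust the result, since an explicit annotation is stronger evidence than an environment variable.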

Collector Management

Tool	Description
`otel_provision_collector`	Smart, intent-driven collector provisioning. Accepts `namespace` + `signals` (e.g., `["traces", "metrics"]`), auto-discovers backend endpoints from existing collectors and K8s services, generates best-practice configs with correct processor ordering, selects deployment mode, and sizes resources from cluster scale. Automatically creates RBAC (ClusterRole + ClusterRoleBinding) for the k8sattributes processor. Supports `enable_spanmetrics`, `enable_filelog`, `prometheus_scrape`, and `dry_run` modes. See Supported Backends for auto-discoverable backends.
`otel_patch_collector`	Expert-level CRD management. Create or replace an OpenTelemetryCollector CRD with full config YAML, dynamic labels, annotations, and spec. Supports `overwrite` (full replace with resourceVersion) and `dry_run` modes.

Instrumentation

Tool	Description
`otel_lookup_instrumentation`	Map a language and optional framework to OTel instrumentation support (auto-instrumentation availability, annotation key, SDK package).
`otel_patch_instrumentation`	Create or patch an Instrumentation CRD with exporter endpoint, propagators, sampler, and per-language images. Supports `dry_run`.
`otel_annotate_deployment`	Apply auto-instrumentation annotation to a Deployment's pod template. Detects conflicting hardcoded `OTEL_*` env vars (e.g., `OTEL_EXPORTER_OTLP_ENDPOINT`) that would silently override Operator-injected endpoints, and warns with remediation steps. Supports `dry_run`. Triggers rolling restart when applied.

Validation

Tool	Description
`otel_validate_k8sattributes_order`	Validate processor ordering in collector pipelines against recommended order (memory_limiter → k8sattributes → resourcedetection → transform → filter → tail_sampling → batch).
`otel_check_filelog_safety`	Check filelog receiver for safety issues: missing checkpoint storage, self-collection feedback loops, and missing resource detection.
`otel_inspect_target_allocator_state`	Inspect Target Allocator configuration: allocation strategy, ServiceMonitor/PodMonitor selectors, prometheusCR enablement, replicas.
`otel_recommend_collector_topology`	Recommend collector deployment mode (DaemonSet/Deployment/Gateway), pipeline topology, and resource sizing based on signals, workload count, and cluster size.

Governance

Tool	Description
`otel_detect_cardinality`	Detect metric cardinality issues from SpanMetrics dimensions and histogram buckets. Returns estimated series counts and severity ratings.
`otel_gen_drop_attribute_rules`	Generate transform processor YAML snippet to drop high-cardinality attributes for metrics, traces, or logs signals.
`otel_analyze_ebpf_footprint`	Scan eBPF instrumentation pods for security posture: privileged mode, hostPID, Linux capabilities, and host volume mounts.
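
To show the kind of snippet otel_gen_drop_attribute_rules produces, the sketch below renders a transform processor that deletes attributes with the OTTL delete_key function. The function name, the default processor name, and the exact layout are assumptions; the tool's actual output may differ.

```python
# Statement key and OTTL context for each signal type.
STATEMENT_KEYS = {
    "metrics": ("metric_statements", "datapoint"),
    "traces": ("trace_statements", "span"),
    "logs": ("log_statements", "log"),
}


def gen_drop_attribute_yaml(signal: str, attributes: list[str], name: str = "drop_high_cardinality") -> str:
    """Render a transform processor snippet that drops the given attributes.

    Attribute names are assumed not to contain double quotes.
    """
    if signal not in STATEMENT_KEYS:
        raise ValueError(f"unknown signal: {signal!r}")
    if not attributes:
        raise ValueError("at least one attribute is required")
    key, context = STATEMENT_KEYS[signal]
    lines = [
        "processors:",
        f"  transform/{name}:",
        f"    {key}:",
        f"      - context: {context}",
        "        statements:",
    ]
    for attribute in attributes:
        lines.append(f'          - delete_key(attributes, "{attribute}")')
    return "\n".join(lines) + "\n"
```

The generated processor still has to be added to the relevant pipeline's processor list before it has any effect.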

Sampling

Tool	Description
`otel_inspect_sampling_configuration`	Inspect complete sampling config: cross-references head sampling (Instrumentation CRD) with tail sampling (collector config). Detects conflicts.
`otel_toggle_sampling_strategy`	Generate config patches to switch between head, tail, or no sampling. Includes tail sampling policy templates. Supports `dry_run`.

SpanMetrics

Tool	Description
`otel_inspect_spanmetrics_config`	Inspect SpanMetrics connector configuration: dimensions, histogram config, pipeline wiring, and cardinality estimates.
`otel_enable_spanmetrics_for_service`	Generate SpanMetrics connector YAML with custom dimensions, histogram buckets, and pipeline wiring instructions. Supports `dry_run`.

Available Resources

Resource URI	Description
`otel://system/health`	Server health status: Kubernetes connectivity, OTel CRD availability, and server version
`otel://collector/{namespace}/{name}`	Full collector config: pipeline topology, receivers, processors, exporters, deployment mode, and status
`otel://k8s-enrichment/{namespace}/{collector}`	k8sattributes processor profile: extracted metadata, labels, annotations, pod association, and pipeline positions
`otel://logs-profile/{namespace}/{collector}`	Filelog receiver config: include/exclude paths, operators, safety analysis, and pipeline wiring
`otel://spanmetrics/{namespace}/{collector}`	SpanMetrics connector profile: dimensions, histogram config, pipeline wiring, and cardinality estimates
`otel://instrumentation/{namespace}/{name}`	Instrumentation CRD details: exporter endpoint, propagators, sampler, per-language specs, and resource attributes
`otel://target-allocator/{namespace}/{name}`	Target Allocator state: allocation strategy, ServiceMonitor/PodMonitor selectors, replicas, and prometheusCR status
`otel://lang/{language}`	Per-language instrumentation capabilities: signal support, auto-instrumentation availability, framework support, SDK package
`otel://registry/languages`	Full catalog of all supported languages with signal stability, auto-instrumentation, and framework support matrices

Available Prompts

Guided workflow prompts that orchestrate multiple tools into step-by-step journeys:

Prompt Name	Description	Parameters
`otel_onboard_service`	Guided workflow for onboarding a new service to OpenTelemetry: language detection, Instrumentation CR setup, annotation application, and verification	`service_name`, `namespace`, `language`
`otel_investigate_pipeline`	Guided workflow for investigating an OTel pipeline: processor ordering, filelog safety, sampling config, and enrichment profile	`collector_name`, `namespace`
`otel_cardinality_audit`	Guided workflow for auditing metric cardinality: detect high-cardinality dimensions and generate transform processor remediation YAML	`collector_name`, `namespace`
`otel_sampling_review`	Guided workflow for reviewing and optimizing sampling strategy across Instrumentation CRDs and collector config	`collector_name`, `namespace`
`otel_security_audit`	Guided workflow for auditing OTel security posture: eBPF privileges, init containers, RBAC, and sensitive attribute exposure	`namespace`

Usage

Supported workflows with prompt examples and links to detailed guides:

Workflow	Prompt Example	Documentation
Service Onboarding	`"Onboard my Python app 'api-server' in the 'production' namespace to OpenTelemetry."`	OTEL_ONBOARDING_TEST_GUIDE.md
Pipeline Investigation	`"Investigate the OTel collector 'otel-gateway' in the 'monitoring' namespace."`	OTEL_PIPELINE_INVESTIGATION_TEST_GUIDE.md
Cardinality Audit	`"Audit metric cardinality for collector 'otel-metrics' in 'monitoring'."`	OTEL_CARDINALITY_AUDIT_TEST_GUIDE.md
Sampling Review	`"Review sampling for collector 'otel-traces' in 'monitoring'."`	OTEL_SAMPLING_TEST_GUIDE.md
Security Audit	`"Audit OTel security posture in the 'production' namespace."`	OTEL_SECURITY_AUDIT_TEST_GUIDE.md

See WORKFLOW_JOURNEYS.md for the full workflow reference and PROMPT_REFERENCE.md for natural-language prompts.

Project Structure

opentelemetry-mcp-server/
├── opentelemetry_mcp_server/      # Main package
│   ├── tools/                     # MCP Tools (7 tool groups, 19 tools)
│   │   ├── discovery/             # Collector & service discovery
│   │   ├── collector/             # Collector CRD management
│   │   │   ├── collector_tools.py # Expert-level CRD create/replace
│   │   │   └── provision_tools.py # Intent-driven smart provisioning (NEW)
│   │   ├── instrumentation/       # Language lookup & CRD management
│   │   ├── validation/            # Pipeline validation & safety checks
│   │   ├── governance/            # Cardinality & eBPF governance
│   │   ├── sampling/              # Sampling inspection & toggle
│   │   └── spanmetrics/           # SpanMetrics connector management
│   ├── resources/                 # MCP Resources (9 URIs)
│   │   └── otel_resources.py      # Collector, enrichment, logs, spanmetrics,
│   │                              # instrumentation, target allocator, language,
│   │                              # and system health resources
│   ├── prompts/                   # MCP Prompts (5 guided workflows)
│   │   └── otel_prompts.py        # Onboarding, investigation, cardinality,
│   │                              # sampling, and security audit prompts
│   ├── services/                  # Business logic
│   │   ├── kubernetes_service.py  # K8s API wrapper (CRDs, Deployments, Pods, Services)
│   │   ├── collector_config_service.py  # Collector YAML parser & analyzer
│   │   └── collector_config_builder.py  # Intent→config generation engine (NEW)
│   ├── server/                    # FastMCP server setup
│   │   ├── core.py                # Server creation
│   │   ├── bootstrap.py           # Component initialization
│   │   └── middleware.py          # 7-layer middleware stack
│   ├── models/                    # Pydantic data models
│   ├── utils/                     # Helpers
│   │   ├── k8s_labels.py          # Annotation keys, eBPF agent labels, language detection
│   │   ├── yaml_helpers.py        # YAML parsing, pipeline extraction
│   │   ├── pagination.py          # Cursor-based pagination
│   │   └── duration.py            # OTel duration string parsing (e.g., "2ms" → 2.0)
│   ├── static/                    # Static data files
│   │   ├── otel_lang_registry.json   # Language support matrix
│   │   └── OTEL_MCP_INSTRUCTIONS.md  # MCP system instructions
│   ├── exceptions/                # Custom exception hierarchy
│   ├── config.py                  # Environment parsing & config dataclasses
│   └── main.py                    # Entry point
├── tests/                         # Test suites (371 tests)
├── docs/                          # Documentation
├── pyproject.toml                 # Package definitions (Python 3.12)
├── Dockerfile                     # Multi-stage Docker build
└── README.md                      # This documentation

Roadmap

Shipped:

Collector discovery with namespace filtering, label selectors, and pagination
Deep collector inspection with full pipeline topology and raw YAML
Language instrumentation lookup with framework-specific guidance
Instrumentation CRD creation/patching with dry-run safety
Deployment annotation for auto-instrumentation injection
Instrumented service listing with annotation and init container detection
Processor ordering validation against OTel best practices
Filelog receiver safety auditing
Target Allocator state inspection
Collector topology recommendation engine
SpanMetrics cardinality analysis and series estimation
Transform processor YAML generation for attribute dropping
eBPF instrumentation security auditing
Sampling configuration inspection (head + tail cross-reference)
Sampling strategy toggle with config patch generation
SpanMetrics connector enablement with custom dimensions
Collector CRD management (create/replace with dry-run safety)
4-tier language detection (annotations → images → names → runtime env vars)
OTel duration string parsing in histogram bucket configs
5 guided workflow prompts for onboarding, investigation, cardinality, sampling, and security
7-layer middleware stack (rate limiting, response limiting, caching)
Intent-driven collector provisioning (otel_provision_collector) — auto-discovery, best-practice configs, smart mode selection
Auto-discovery engine: existing collectors → K8s services → debug fallback
10 backend patterns: Jaeger, Tempo, Zipkin, Prometheus, Thanos, Mimir, VictoriaMetrics, OpenSearch, Elasticsearch, Loki
Cluster auto-sizing (node count → resource recommendations)
SpanMetrics connector wiring with correct pipeline topology
Filelog safety built-in (self-exclusion, namespace scoping, checkpoints)

Coming next:

Collector config diffing (before/after patch comparison)
Alerting rule integration (OTel → PrometheusRule CRD bridging)
Multi-cluster support
Collector health metrics dashboard generation

See open issues for the full list of proposed features.

Contributing

Contributions are welcome. The process is straightforward:

Fork the repo
Create a branch (git checkout -b feature/TailSamplingPolicies)
Make your changes and commit
Push and open a PR

If you're considering something bigger, open an issue first so we can align on the approach.

FAQ

Which MCP clients work with this?

Any MCP-compatible client, including Claude Desktop, Cline, Cursor, and custom clients. Connect via http://localhost:8771/mcp for HTTP transport, or configure stdio for direct process communication.

Does this require the OpenTelemetry Operator?

The OTel Operator is required for Instrumentation CRD features (auto-instrumentation, annotation-based injection). Collector discovery and pipeline validation work with any OpenTelemetryCollector CRD in the cluster.

Does this modify my cluster?

Most tools are read-only. The exceptions are: otel_provision_collector (smart provisioning — auto-discovers and creates collectors), otel_patch_collector (expert-level CRD create/replace), otel_patch_instrumentation (creates/patches Instrumentation CRDs), and otel_annotate_deployment (adds annotations to Deployment pod templates, triggering rolling restarts). All four default to dry_run=True and require explicit opt-in to apply. All other tools generate YAML output only — they do NOT apply changes.

What's the difference between dry_run=True and dry_run=False?

With dry_run=True (default), mutating tools preview the change and return the spec/annotation without modifying any Kubernetes resources. With dry_run=False, the change is applied. The server always returns the generated spec so you can review before applying.
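
The dry-run-first pattern can be reduced to a small wrapper that always returns the generated spec and calls the mutating function only on explicit opt-in. This is a sketch of the pattern, not the server's code: the function name, the result fields, and the apply_fn callback are hypothetical.

```python
from typing import Callable


def apply_with_dry_run(spec: dict, apply_fn: Callable[[dict], None], dry_run: bool = True) -> dict:
    """Return the spec for review; call apply_fn only when dry_run is False."""
    result = {"dry_run": dry_run, "applied": False, "spec": spec}
    if not dry_run:
        apply_fn(spec)  # the only code path that changes the cluster
        result["applied"] = True
    return result
```

Because dry_run defaults to True, a caller that forgets the argument gets a preview rather than a change.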

Can I use this without Kubernetes?

The server is designed for Kubernetes environments. Set K8S_ENABLED=false to disable Kubernetes integration, but most tools will return errors since they depend on CRD access. The language lookup tool (otel_lookup_instrumentation), topology recommendation (otel_recommend_collector_topology), and transform rule generation (otel_gen_drop_attribute_rules) work without K8s since they are pure recommendation engines.

What RBAC does `otel_provision_collector` create?

When provisioning a collector with dry_run=False, the tool automatically creates a ClusterRole and ClusterRoleBinding for the k8sattributes processor. The k8sattributes processor enriches telemetry with Kubernetes metadata (pod name, namespace, deployment name, etc.) and needs get, list, watch permissions on Pods, ReplicaSets, Namespaces, Nodes, and Jobs. The OTel Operator does not create these RBAC resources automatically. The created resources are labeled with app.kubernetes.io/managed-by: talkops-mcp for easy identification. In dry-run mode, the RBAC manifests are included in the rbac_resources field of the response for review.

Troubleshooting

Kubernetes Connection Issues

Verify K8S_ENABLED=true and your kubeconfig is accessible.
Load the otel://system/health resource to check connectivity status.
Start the server with uv run opentelemetry-mcp-server.
The server will use your active kubeconfig context by default.

OTel Operator / CRD Issues

Ensure the OpenTelemetry Operator is installed: kubectl get crd opentelemetrycollectors.opentelemetry.io.
Verify OTEL_CRD_GROUP and OTEL_CRD_API_VERSION match your operator version.
For Instrumentation CRDs, verify: kubectl get crd instrumentations.opentelemetry.io.
Check RBAC: the server's service account needs get, list, watch, create, patch on OTel CRDs.

No Collectors Found

Run otel_list_collectors() without a namespace filter to search all namespaces.
Check if collectors exist: kubectl get opentelemetrycollectors --all-namespaces.
Verify the OTEL_COLLECTOR_PLURAL env var matches your CRD plural name.

Auto-Instrumentation Not Working

Verify an Instrumentation CR exists in the target namespace.
Check that the Deployment has the correct annotation (e.g., instrumentation.opentelemetry.io/inject-python: "true").
Look for init container injection: kubectl describe pod <pod-name>.
Use otel_list_instrumented_services to diagnose annotation vs. injection mismatches.

Security Considerations

Never expose the MCP server to the public internet without proper authentication.
otel_provision_collector creates OpenTelemetryCollector CRDs — always review the dry_run output before setting dry_run=False. The tool auto-discovers backends and generates configs, so verify the discovered endpoints are correct.
otel_patch_collector creates or replaces OpenTelemetryCollector CRDs — review the spec before setting dry_run=False.
otel_patch_instrumentation creates real Kubernetes CRDs — review the spec before setting dry_run=False.
otel_annotate_deployment modifies Deployments, which triggers a rolling restart of the annotated Deployment's pods.
eBPF instrumentation pods may run with elevated privileges — use otel_analyze_ebpf_footprint to audit.

License

Apache 2.0 — see LICENSE.

Contact

TalkOps AI — github.com/talkops-ai

Project: github.com/talkops-ai/talkops-mcp

Discord: Join the community

Acknowledgments

Model Context Protocol for enabling AI-native tool interfaces.
FastMCP for the Python MCP server framework.
OpenTelemetry for the industry-standard observability framework.
Kubernetes for container orchestration APIs.
OpenTelemetry Operator for Kubernetes-native OTel lifecycle management.



This version

0.1.11

Jun 23, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

talkops_opentelemetry_mcp_server-0.1.11.tar.gz (286.1 kB view details)

Uploaded Jun 23, 2026 Source

Built Distribution

talkops_opentelemetry_mcp_server-0.1.11-py3-none-any.whl (127.4 kB)

Uploaded Jun 23, 2026 Python 3

File details

Details for the file talkops_opentelemetry_mcp_server-0.1.11.tar.gz.

File metadata

Download URL: talkops_opentelemetry_mcp_server-0.1.11.tar.gz
Upload date: Jun 23, 2026
Size: 286.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for talkops_opentelemetry_mcp_server-0.1.11.tar.gz
Algorithm	Hash digest
SHA256	`dec8580d598c637f352a1cb9feecc77451daa95a82349cef0dc05a849f0b0814`
MD5	`6a702b8b1bfe50a3f86f310d01540006`
BLAKE2b-256	`546126573713b826e9158dfbbccbbfe3e47a5155f2be3df9daa2df24c4ea6238`
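
One way to check a downloaded file against the SHA256 digest listed above is a few lines of standard-library Python. This is a minimal sketch: the helper names `sha256_of` and `matches_published` are illustrative, and the path is assumed to be wherever you saved the download.

```python
import hashlib
import hmac


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes at EOF.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 with a published hex digest."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

For the source distribution, this would be called as `matches_published("talkops_opentelemetry_mcp_server-0.1.11.tar.gz", "dec8580d598c637f352a1cb9feecc77451daa95a82349cef0dc05a849f0b0814")`. The same check applies to the wheel with its own digest.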

Provenance

The following attestation bundles were made for talkops_opentelemetry_mcp_server-0.1.11.tar.gz:

Publisher: release-pypi.yml on talkops-ai/talkops-mcp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: talkops_opentelemetry_mcp_server-0.1.11.tar.gz
- Subject digest: dec8580d598c637f352a1cb9feecc77451daa95a82349cef0dc05a849f0b0814
- Sigstore transparency entry: 1921129602
- Sigstore integration time: Jun 23, 2026
Source repository:
- Permalink: talkops-ai/talkops-mcp@c7e249bccb61b37ec3c2746325c14fe1f0a3cf07
- Branch / Tag: refs/heads/main
- Owner: https://github.com/talkops-ai
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-pypi.yml@c7e249bccb61b37ec3c2746325c14fe1f0a3cf07
- Trigger Event: workflow_dispatch

File details

Details for the file talkops_opentelemetry_mcp_server-0.1.11-py3-none-any.whl.

File metadata

Download URL: talkops_opentelemetry_mcp_server-0.1.11-py3-none-any.whl
Upload date: Jun 23, 2026
Size: 127.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for talkops_opentelemetry_mcp_server-0.1.11-py3-none-any.whl
Algorithm	Hash digest
SHA256	`13769e9c1371f7a5f5b2bbe21e7241222c7b7178da445db975c57afb5ce3cbff`
MD5	`d766315b6b726addbd73110a0185040f`
BLAKE2b-256	`a07172b2b7e89e3e4f8c2bc077ed8e8b2ed3b769abd7027795e1334f45a5d043`

Provenance

The following attestation bundles were made for talkops_opentelemetry_mcp_server-0.1.11-py3-none-any.whl:

Publisher: release-pypi.yml on talkops-ai/talkops-mcp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: talkops_opentelemetry_mcp_server-0.1.11-py3-none-any.whl
- Subject digest: 13769e9c1371f7a5f5b2bbe21e7241222c7b7178da445db975c57afb5ce3cbff
- Sigstore transparency entry: 1921129859
- Sigstore integration time: Jun 23, 2026
Source repository:
- Permalink: talkops-ai/talkops-mcp@c7e249bccb61b37ec3c2746325c14fe1f0a3cf07
- Branch / Tag: refs/heads/main
- Owner: https://github.com/talkops-ai
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-pypi.yml@c7e249bccb61b37ec3c2746325c14fe1f0a3cf07
- Trigger Event: workflow_dispatch
