A Model Context Protocol (MCP) server for OpenTelemetry Kubernetes observability.
OpenTelemetry MCP Server
An MCP server that gives AI assistants the power to discover, provision, instrument, validate, and govern OpenTelemetry pipelines on Kubernetes — from intent-driven collector provisioning to cardinality control, sampling optimization, and security auditing.
Why OpenTelemetry MCP Server?
The problem: OpenTelemetry is the industry standard for observability, but operating it on Kubernetes is complex. Setting up auto-instrumentation across languages, configuring collector pipelines with the correct processor ordering, tuning sampling strategies, and managing cardinality from SpanMetrics connectors — each is a specialized task. When AI assistants try to help, they hallucinate CRD schemas, mis-order processors, or generate unsafe configs that cause data loss.
The solution: The OpenTelemetry MCP Server gives AI assistants (like Claude, Cline, or Cursor) structured, safe tools to manage the entire OTel lifecycle natively:
- Intent-Driven Collector Provisioning: Say "I want traces and metrics in my namespace" and the AI auto-discovers backends (Jaeger, Tempo, Prometheus, Loki, OpenSearch), generates best-practice configs with correct processor ordering, selects the right deployment mode, sizes resources for your cluster — and deploys with dry-run-first safety.
- Zero-to-Instrumented Onboarding: The AI looks up language support, creates an Instrumentation CRD, annotates Deployments for auto-instrumentation injection, and verifies the rollout — all with dry-run-first safety.
- Pipeline Investigation & Validation: Deep-inspect any OTel Collector's config, validate processor ordering against best practices, audit filelog receiver safety, and check k8sattributes enrichment profiles.
- Metric Cardinality Governance: Detect high-cardinality dimensions from SpanMetrics connectors, generate transform processor YAML to drop attributes, and estimate series counts before they explode.
- Sampling Strategy Optimization: Cross-reference head sampling (Instrumentation CRDs) with tail sampling (collector config), detect conflicts, and generate config patches to switch strategies.
- Security Posture Auditing: Scan eBPF instrumentation pods for privileged mode, SYS_ADMIN capabilities, and hostPID access. Risk-assess the entire observability footprint.
Key Features
Intent-Driven Collector Provisioning
- Express what you want (signals, namespace) — the tool auto-discovers everything else
- Three-strategy backend discovery: existing collector configs → K8s service name matching → graceful debug fallback
- 10 built-in backend patterns: Jaeger, Tempo, Zipkin, Prometheus, Thanos, Mimir, VictoriaMetrics, OpenSearch, Elasticsearch, Loki
- Best-practice processor chain always enforced: memory_limiter → k8sattributes → resourcedetection → resource → batch
- Smart mode selection: DaemonSet for filelog, StatefulSet for Prometheus scraping, Deployment for OTLP
- Auto-sizing from cluster scale (node count → small/medium/large resource tiers)
- Filelog safety built-in: self-exclusion patterns, namespace scoping, checkpoint storage, start_at=end
- SpanMetrics connector wiring with correct traces→connector→metrics pipeline topology
- Proactive recommendations ("Consider spanmetrics for RED metrics", "Filelog needs DaemonSet")
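The mode selection and auto-sizing rules above can be sketched in a few lines. This is a minimal illustration, not the server's actual code: the function names, the precedence when several features are requested at once, and the node-count thresholds are all assumptions.

```python
def select_mode(enable_filelog: bool = False, prometheus_scrape: bool = False) -> str:
    """Pick a collector mode from the requested features (hypothetical helper)."""
    if enable_filelog:
        # Filelog reads node-local log files, so one collector per node is needed.
        return "daemonset"
    if prometheus_scrape:
        # Prometheus scraping is mapped to a StatefulSet, as stated above.
        return "statefulset"
    # Plain OTLP ingestion.
    return "deployment"


def size_tier(node_count: int, small_max: int = 10, medium_max: int = 50) -> str:
    """Map cluster size to a resource tier; the thresholds are assumed values."""
    if node_count <= small_max:
        return "small"
    if node_count <= medium_max:
        return "medium"
    return "large"
```

For example, `select_mode(enable_filelog=True)` returns `"daemonset"`, and `size_tier(120)` returns `"large"` under the assumed thresholds.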
Collector Discovery & Inspection
- List and inspect OpenTelemetryCollector CRDs across namespaces with pagination
- Full pipeline topology: receivers, processors, exporters, and connectors
- Summary and full (raw YAML) detail levels
Service Instrumentation
- Language support matrix with framework-specific guidance (Java, Python, Node.js, .NET, Go, Rust)
- Create or patch Instrumentation CRDs with sampler, propagators, and per-language images
- Annotate Deployments for auto-instrumentation with dry-run-first safety
- List instrumented services with annotation status, init container injection, signal detection, and 4-tier language detection (annotations → image patterns → container/deployment names → runtime env vars)
Pipeline Validation
- Processor ordering validation (memory_limiter → k8sattributes → batch)
- Filelog receiver safety checks (checkpoint storage, self-collection loops, resource detection)
- Target Allocator state inspection (allocation strategy, selectors, prometheusCR)
- Collector topology recommendations based on signals, workload count, and cluster size
Metric Cardinality Governance
- SpanMetrics dimension analysis with series count estimation
- Histogram bucket count auditing
- Transform processor YAML generation for dropping attributes
- Existing remediation detection (transform processors already in config)
Sampling Management
- Holistic sampling view: head (Instrumentation CRD) + tail (collector config)
- Conflict detection (head + tail simultaneously)
- Config patch generation for switching between head, tail, or none
- Tail sampling policy templates (error-sampling, slow-traces, probabilistic-fallback)
SpanMetrics Connector
- Inspect existing SpanMetrics configuration (dimensions, histograms, pipeline wiring)
- Generate SpanMetrics enablement YAML with custom dimensions and bucket boundaries
- Cardinality warnings for high-dimension configurations
Security Auditing
- eBPF agent discovery (OpenTelemetry eBPF, Grafana Beyla)
- Security context analysis: privileged mode, hostPID, capabilities, host volume mounts
- Risk assessment with prioritized remediation recommendations
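The security context checks listed above can be sketched against a plain pod spec dictionary. This is a simplified illustration under stated assumptions: the function name and finding strings are hypothetical, and a real audit would also need to consider init containers and pod-level security contexts.

```python
def audit_pod_spec(pod_spec: dict) -> list[str]:
    """Return human-readable findings for risky settings in a pod spec dict."""
    findings = []
    if pod_spec.get("hostPID"):
        findings.append("pod: hostPID enabled")
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        security = container.get("securityContext") or {}
        if security.get("privileged"):
            findings.append(f"{name}: privileged mode")
        added_caps = (security.get("capabilities") or {}).get("add") or []
        if "SYS_ADMIN" in added_caps:
            findings.append(f"{name}: SYS_ADMIN capability")
    for volume in pod_spec.get("volumes", []):
        if "hostPath" in volume:
            path = volume["hostPath"].get("path")
            findings.append(f"volume {volume.get('name')}: hostPath {path}")
    return findings
```

An empty result means none of these specific checks fired; it does not mean the pod is safe.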
Production-Ready Middleware
- Response limiting (100KB max), rate limiting (10 req/s, burst 20)
- Response caching, structured logging, error handling, timing
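A rate limit of 10 requests per second with a burst of 20 is commonly implemented as a token bucket. The sketch below shows that technique with the numbers from the list above; it is an assumption that the server's middleware works this way, and the class name is hypothetical.

```python
import time


class TokenBucket:
    """Token bucket: refills at `rate` tokens/second, holds at most `burst` tokens."""

    def __init__(self, rate: float = 10.0, burst: int = 20, clock=time.monotonic):
        self.rate = rate
        self.capacity = float(burst)
        self.tokens = float(burst)  # start full so an initial burst is allowed
        self.clock = clock
        self.last = clock()

    def allow(self) -> bool:
        """Consume one token if available; return whether the request may proceed."""
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

The injectable `clock` makes the limiter easy to test without sleeping.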
Architecture
                ┌─────────────────────────┐
                │       MCP Client        │
                │ (Claude, Cline, Cursor) │
                └────────────┬────────────┘
                             │
                ┌────────────▼────────────┐
                │   FastMCP Server Core   │
                │  (HTTP / SSE / stdio)   │
                │   + Middleware Stack    │
                └────────────┬────────────┘
                             │
     ┌───────────┬───────────┼───────────┬───────────┐
     │           │           │           │           │
┌────▼────┐ ┌────▼────┐ ┌────▼────┐ ┌────▼────┐ ┌────▼────┐
│  Tools  │ │Resources│ │ Prompts │ │  Utils  │ │ Models  │
│  (19)   │ │   (9)   │ │   (5)   │ │         │ │         │
└────┬────┘ └────┬────┘ └─────────┘ └─────────┘ └─────────┘
     │           │
     └─────┬─────┘
           │
┌──────────▼──────────┐
│    Service Layer    │
│                     │
│ kubernetes_service  │
│ collector_config_svc│
│ config_builder      │
└──────────┬──────────┘
           │
┌──────────▼──────────┐
│ Python K8s Client   │
│ + OTel Operator CRDs│
└─────────────────────┘
How it works:
- An AI assistant connects via HTTP, SSE, or stdio.
- The AI loads the otel://system/health resource to check Kubernetes connectivity and CRD availability.
- Tools interact with the Kubernetes API to read/write OpenTelemetryCollector and Instrumentation CRDs.
- Service layers (kubernetes_service, collector_config_service) handle API calls and config parsing.
- Middleware enforces rate limiting, response size caps, caching, and structured logging.
Table of Contents
- Why OpenTelemetry MCP Server?
- Key Features
- Architecture
- Tech Stack
- Getting Started
- Configuration
- Available Tools
- Available Resources
- Available Prompts
- Usage
- Project Structure
- Roadmap
- Contributing
- FAQ
- Troubleshooting
- Security Considerations
- License
- Contact
- Acknowledgments
Tech Stack
| Category | Technologies |
|---|---|
| Language | Python 3.12+ |
| MCP Framework | FastMCP ≥2.13.3 |
| Protocol | Model Context Protocol (MCP) |
| OpenTelemetry | Operator CRDs · Instrumentation CRDs · Collector config |
| Kubernetes | Python K8s Client · Custom Resources · RBAC |
| Transport | HTTP · SSE · Streamable-HTTP · stdio |
| Infrastructure | Docker · uv |
Getting Started
Prerequisites
- Docker (recommended) or Python 3.12+ (for local dev)
- Kubernetes cluster with the OpenTelemetry Operator installed
- kubectl configured with access to the target cluster
RBAC Note for Collector Provisioning (otel_provision_collector)
When provisioning collectors with dry_run=False, the tool automatically creates ClusterRole and ClusterRoleBinding resources for the k8sattributes processor. This is necessary because:
- The k8sattributes processor enriches telemetry with Kubernetes metadata (pod name, namespace, node, etc.)
- It needs get, list, watch permissions on Pods, ReplicaSets, Namespaces, Nodes, and Jobs
- The OTel Operator does NOT auto-create these RBAC resources; without them, the collector will log "pods is forbidden" errors
Prerequisites for the MCP server's own ServiceAccount:
- The ServiceAccount running the MCP server needs create, patch, get permissions on ClusterRole and ClusterRoleBinding resources
- For local development (kubeconfig), your kubectl user typically already has cluster-admin
- For in-cluster deployment, grant these rules to the MCP server's ServiceAccount through a ClusterRole bound to it:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-mcp-server-rbac-manager
rules:
  - apiGroups: ["rbac.authorization.k8s.io"]
    resources: ["clusterroles", "clusterrolebindings"]
    verbs: ["get", "create", "patch"]
See SUPPORTED_BACKENDS.md for the full list of auto-discoverable backends and default ports.
Quick Start with Docker (recommended)
docker run --rm -it \
-p 8771:8771 \
-e MCP_TRANSPORT=http \
-e K8S_IN_CLUSTER=true \
talkopsai/opentelemetry-mcp-server:latest
The server is now listening on http://localhost:8771/mcp.
Point your MCP client at it:
{
"mcpServers": {
"opentelemetry": {
"url": "http://localhost:8771/mcp",
"description": "MCP Server for OpenTelemetry Kubernetes observability"
}
}
}
From Source (Python)
- Install uv for dependency management.
- Clone and set up:
git clone https://github.com/talkops-ai/talkops-mcp.git
cd talkops-mcp/src/opentelemetry-mcp-server
uv venv
source .venv/bin/activate
uv pip install -e ".[dev]"
- Configure your .env:
MCP_TRANSPORT=http
K8S_ENABLED=true
MCP_LOG_LEVEL=INFO
- Run the server:
uv run opentelemetry-mcp-server
Or, with the venv activated: opentelemetry-mcp-server.
- Run tests:
source .venv/bin/activate
pytest tests/
Configuration
All configuration is via environment variables (loaded from .env via python-dotenv).
Server Configuration
| Variable | Default | Description |
|---|---|---|
| MCP_SERVER_NAME | opentelemetry-mcp-server | Server name identifier |
| MCP_SERVER_VERSION | 0.1.0 | Server version string |
| MCP_TRANSPORT | stdio | Transport mode: http, sse, streamable-http, or stdio |
| MCP_HOST | 0.0.0.0 | Host address for HTTP server |
| MCP_PORT | 8771 | Port for HTTP server |
| MCP_PATH | /mcp | MCP endpoint path |
| MCP_LOG_LEVEL | INFO | Log level: DEBUG, INFO, WARNING, ERROR |
| MCP_LOG_FORMAT | json | Log format: json or text |
| MCP_HTTP_TIMEOUT | 300 | HTTP server timeout (seconds) |
| MCP_HTTP_KEEPALIVE_TIMEOUT | 5 | HTTP keepalive timeout (seconds) |
| MCP_HTTP_CONNECT_TIMEOUT | 60 | HTTP connect timeout (seconds) |
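As an illustration of how these variables and defaults fit together, the sketch below reads a few of them into a frozen dataclass. The class and function names are hypothetical and only a subset of variables is shown; the defaults are the ones documented in the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServerConfig:
    transport: str
    host: str
    port: int
    path: str
    log_level: str


def load_server_config(env: Optional[Mapping[str, str]] = None) -> ServerConfig:
    """Build a config from environment variables, falling back to documented defaults."""
    env = os.environ if env is None else env
    return ServerConfig(
        transport=env.get("MCP_TRANSPORT", "stdio"),
        host=env.get("MCP_HOST", "0.0.0.0"),
        port=int(env.get("MCP_PORT", "8771")),  # env values are strings, so convert
        path=env.get("MCP_PATH", "/mcp"),
        log_level=env.get("MCP_LOG_LEVEL", "INFO"),
    )
```

Passing a plain dict instead of `os.environ` keeps the loader testable, for example `load_server_config({"MCP_TRANSPORT": "http"})`.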
Kubernetes
| Variable | Default | Description |
|---|---|---|
| K8S_IN_CLUSTER | false | Set true if running inside a pod |
| K8S_ENABLED | true | Enable/disable K8s features entirely |
OTel Operator CRD
| Variable | Default | Description |
|---|---|---|
| OTEL_CRD_GROUP | opentelemetry.io | API group for OTel CRDs |
| OTEL_CRD_API_VERSION | v1beta1 | API version for Collector CRDs (OpenTelemetryCollector) |
| OTEL_INSTRUMENTATION_API_VERSION | v1alpha1 | API version for Instrumentation CRDs (separate because Instrumentation CRDs are promoted at a different rate than Collector CRDs) |
| OTEL_COLLECTOR_PLURAL | opentelemetrycollectors | Plural name for Collector CRD |
| OTEL_INSTRUMENTATION_PLURAL | instrumentations | Plural name for Instrumentation CRD |
Target Allocator
| Variable | Default | Description |
|---|---|---|
| OTEL_TA_SERVICE_DISCOVERY | true | Enable Target Allocator service discovery |
| OTEL_TA_DEFAULT_PORT | 8080 | Default Target Allocator port |
Prometheus Integration
| Variable | Default | Description |
|---|---|---|
| PROMETHEUS_BASE_URL | (empty) | Prometheus HTTP API base URL (for cardinality queries) |
| PROMETHEUS_TIMEOUT | 30 | HTTP timeout for Prometheus API calls (seconds) |
| PROMETHEUS_VERIFY_SSL | true | Verify SSL certificates |
Language Registry
| Variable | Default | Description |
|---|---|---|
| OTEL_LANG_REGISTRY_PATH | (builtin) | Override path to custom language registry JSON |
Available Tools
Discovery
| Tool | Description |
|---|---|
| otel_list_collectors | List OpenTelemetryCollector CRDs with namespace filtering, label selectors, and pagination. |
| otel_query_a2ui | Retrieve the status of all OpenTelemetry Collectors and their pipelines, formatted precisely for A2UI Status Datatables. Includes deep pipeline metrics health checking via internal :8888/metrics endpoints. |
| otel_get_collector | Get detailed information about a specific collector (pipelines, status, raw YAML config). |
| otel_list_instrumented_services | List workloads in a namespace with their auto-instrumentation status, annotations, init containers, OTEL_* env vars, and detected language (4-tier: annotations → image patterns → container names → runtime env vars like JAVA_HOME, PYTHONPATH). |
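The 4-tier language detection described for otel_list_instrumented_services can be pictured as an ordered fallback chain. The sketch below is illustrative only: the function name and the image and name hint tables are assumptions, while the `instrumentation.opentelemetry.io/inject-<language>` annotation prefix is the Operator's convention and the JAVA_HOME and PYTHONPATH hints come from the table above.

```python
from typing import Optional

INJECT_PREFIX = "instrumentation.opentelemetry.io/inject-"
# Hypothetical hint tables; a real registry would be far more complete.
IMAGE_HINTS = {"openjdk": "java", "python": "python", "node": "nodejs"}
NAME_HINTS = {"java": "java", "python": "python", "node": "nodejs"}
ENV_HINTS = {"JAVA_HOME": "java", "PYTHONPATH": "python"}


def detect_language(
    annotations: dict[str, str], image: str, workload_name: str, env_names: list[str]
) -> tuple[Optional[str], str]:
    """Return (language, tier) using the first tier that produces a match."""
    # Tier 1: explicit injection annotations (inject-sdk carries no language).
    for key, value in annotations.items():
        suffix = key[len(INJECT_PREFIX):]
        if key.startswith(INJECT_PREFIX) and suffix != "sdk" and value != "false":
            return suffix, "annotation"
    # Tier 2: container image patterns.
    image_lower = image.lower()
    for hint, language in IMAGE_HINTS.items():
        if hint in image_lower:
            return language, "image"
    # Tier 3: container or deployment names.
    name_lower = workload_name.lower()
    for hint, language in NAME_HINTS.items():
        if hint in name_lower:
            return language, "name"
    # Tier 4: runtime environment variables.
    for env_name in env_names:
        if env_name in ENV_HINTS:
            return ENV_HINTS[env_name], "env"
    return None, "unknown"
```

Returning the tier alongside the language makes it possible to report how confident the detection is.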
Collector Management
| Tool | Description |
|---|---|
| otel_provision_collector | Smart, intent-driven collector provisioning. Accepts namespace + signals (e.g., ["traces", "metrics"]), auto-discovers backend endpoints from existing collectors and K8s services, generates best-practice configs with correct processor ordering, selects deployment mode, and sizes resources from cluster scale. Automatically creates RBAC (ClusterRole + ClusterRoleBinding) for the k8sattributes processor. Supports enable_spanmetrics, enable_filelog, prometheus_scrape, and dry_run modes. See Supported Backends for auto-discoverable backends. |
| otel_patch_collector | Expert-level CRD management. Create or replace an OpenTelemetryCollector CRD with full config YAML, dynamic labels, annotations, and spec. Supports overwrite (full replace with resourceVersion) and dry_run modes. |
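The three-strategy backend discovery used by otel_provision_collector (existing collector configs, then Kubernetes service name matching, then a debug fallback) might look roughly like the sketch below. The function name, the input shapes, and the mapping of each backend to a single signal are assumptions made for illustration.

```python
# Hypothetical mapping of the 10 built-in backend patterns to one signal each.
BACKEND_PATTERNS = {
    "jaeger": "traces", "tempo": "traces", "zipkin": "traces",
    "prometheus": "metrics", "thanos": "metrics", "mimir": "metrics",
    "victoriametrics": "metrics",
    "opensearch": "logs", "elasticsearch": "logs", "loki": "logs",
}


def discover_backend(signal: str, existing_exporters: dict[str, str], service_names: list[str]) -> dict:
    """Resolve a backend for one signal, trying each strategy in order."""
    # Strategy 1: reuse an endpoint already configured in an existing collector.
    if signal in existing_exporters:
        return {"strategy": "existing_collector", "endpoint": existing_exporters[signal]}
    # Strategy 2: match Kubernetes service names against known backend patterns.
    for service in service_names:
        for pattern, pattern_signal in BACKEND_PATTERNS.items():
            if pattern_signal == signal and pattern in service.lower():
                return {"strategy": "service_match", "backend": pattern, "service": service}
    # Strategy 3: fall back to the collector's debug exporter so the pipeline still starts.
    return {"strategy": "debug_fallback", "exporter": "debug"}
```

The fallback keeps provisioning from failing outright when nothing is discovered, at the cost of telemetry only being visible in collector logs.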
Instrumentation
| Tool | Description |
|---|---|
| otel_lookup_instrumentation | Map a language and optional framework to OTel instrumentation support (auto-instrumentation availability, annotation key, SDK package). |
| otel_patch_instrumentation | Create or patch an Instrumentation CRD with exporter endpoint, propagators, sampler, and per-language images. Supports dry_run. |
| otel_annotate_deployment | Apply auto-instrumentation annotation to a Deployment's pod template. Detects conflicting hardcoded OTEL_* env vars (e.g., OTEL_EXPORTER_OTLP_ENDPOINT) that would silently override Operator-injected endpoints, and warns with remediation steps. Supports dry_run. Triggers rolling restart when applied. |
Validation
| Tool | Description |
|---|---|
| otel_validate_k8sattributes_order | Validate processor ordering in collector pipelines against recommended order (memory_limiter → k8sattributes → resourcedetection → transform → filter → tail_sampling → batch). |
| otel_check_filelog_safety | Check filelog receiver for safety issues: missing checkpoint storage, self-collection feedback loops, and missing resource detection. |
| otel_inspect_target_allocator_state | Inspect Target Allocator configuration: allocation strategy, ServiceMonitor/PodMonitor selectors, prometheusCR enablement, replicas. |
| otel_recommend_collector_topology | Recommend collector deployment mode (DaemonSet/Deployment/Gateway), pipeline topology, and resource sizing based on signals, workload count, and cluster size. |
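Processor ordering validation amounts to checking a pipeline's processor list against a ranked reference order. The sketch below uses the recommended order quoted for otel_validate_k8sattributes_order; the function names and the message format are assumptions, and processors outside the reference list are simply skipped.

```python
RECOMMENDED_ORDER = [
    "memory_limiter", "k8sattributes", "resourcedetection",
    "transform", "filter", "tail_sampling", "batch",
]


def base_type(processor: str) -> str:
    """Strip the instance suffix, e.g. 'batch/traces' -> 'batch'."""
    return processor.split("/", 1)[0]


def find_order_violations(processors: list[str]) -> list[str]:
    """Report processors that appear after one they should precede."""
    rank = {name: index for index, name in enumerate(RECOMMENDED_ORDER)}
    violations = []
    highest_rank, highest_name = -1, None
    for processor in processors:
        current = rank.get(base_type(processor))
        if current is None:
            continue  # unknown processors are not ranked
        if current < highest_rank:
            violations.append(f"{processor} should come before {highest_name}")
        else:
            highest_rank, highest_name = current, processor
    return violations
```

For instance, `find_order_violations(["batch", "memory_limiter"])` flags that memory_limiter belongs ahead of batch.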
Governance
| Tool | Description |
|---|---|
| otel_detect_cardinality | Detect metric cardinality issues from SpanMetrics dimensions and histogram buckets. Returns estimated series counts and severity ratings. |
| otel_gen_drop_attribute_rules | Generate transform processor YAML snippet to drop high-cardinality attributes for metrics, traces, or logs signals. |
| otel_analyze_ebpf_footprint | Scan eBPF instrumentation pods for security posture: privileged mode, hostPID, Linux capabilities, and host volume mounts. |
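For a sense of what a drop-attribute rule looks like, the sketch below renders a transform processor snippet that uses the OTTL `delete_key` function. The helper name, the `transform/drop_attributes` instance name, and the exact layout are assumptions, not the output format of otel_gen_drop_attribute_rules.

```python
# Statement key and OTTL context used by the transform processor for each signal.
SIGNAL_SETTINGS = {
    "metrics": ("metric_statements", "datapoint"),
    "traces": ("trace_statements", "span"),
    "logs": ("log_statements", "log"),
}


def gen_drop_rules(signal: str, attributes: list[str]) -> str:
    """Render transform processor YAML that deletes the given attribute keys."""
    statements_key, context = SIGNAL_SETTINGS[signal]
    lines = [
        "processors:",
        "  transform/drop_attributes:",  # hypothetical instance name
        "    error_mode: ignore",
        f"    {statements_key}:",
        f"      - context: {context}",
        "        statements:",
    ]
    lines += [f'          - delete_key(attributes, "{name}")' for name in attributes]
    return "\n".join(lines)
```

The generated processor still has to be added to the relevant pipeline's `processors` list before it has any effect.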
Sampling
| Tool | Description |
|---|---|
| otel_inspect_sampling_configuration | Inspect complete sampling config: cross-references head sampling (Instrumentation CRD) with tail sampling (collector config). Detects conflicts. |
| otel_toggle_sampling_strategy | Generate config patches to switch between head, tail, or no sampling. Includes tail sampling policy templates. Supports dry_run. |
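A head/tail conflict arises when the SDK already drops traces (head sampling ratio below 1) while the collector also runs tail_sampling, so the tail sampler only sees a subset of traces. The sketch below shows one way to detect that from an Instrumentation resource and a parsed collector config. The function names are hypothetical, and treating samplers such as jaeger_remote as "keep everything" is a simplifying assumption.

```python
def head_sampling_ratio(instrumentation: dict) -> float:
    """Approximate the share of traces kept by the SDK sampler."""
    sampler = instrumentation.get("spec", {}).get("sampler", {})
    sampler_type = sampler.get("type", "parentbased_always_on")  # OTel SDK default
    if sampler_type.endswith("always_off"):
        return 0.0
    if sampler_type.endswith("traceidratio"):
        return float(sampler.get("argument", "1"))
    return 1.0  # always_on and other samplers are treated as keep-all here


def has_tail_sampling(collector_config: dict) -> bool:
    """True if any traces pipeline lists a tail_sampling processor."""
    pipelines = collector_config.get("service", {}).get("pipelines", {})
    return any(
        processor.split("/", 1)[0] == "tail_sampling"
        for name, pipeline in pipelines.items()
        if name.split("/", 1)[0] == "traces"
        for processor in pipeline.get("processors", [])
    )


def detect_conflict(instrumentation: dict, collector_config: dict) -> bool:
    """Head sampling below 100% combined with tail sampling is flagged as a conflict."""
    return head_sampling_ratio(instrumentation) < 1.0 and has_tail_sampling(collector_config)
```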
SpanMetrics
| Tool | Description |
|---|---|
| otel_inspect_spanmetrics_config | Inspect SpanMetrics connector configuration: dimensions, histogram config, pipeline wiring, and cardinality estimates. |
| otel_enable_spanmetrics_for_service | Generate SpanMetrics connector YAML with custom dimensions, histogram buckets, and pipeline wiring instructions. Supports dry_run. |
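The pipeline wiring for SpanMetrics is easy to get wrong: the connector must appear as an exporter of a traces pipeline and as a receiver of a metrics pipeline. The sketch below builds that fragment as a Python dict. The helper name and the pipeline names are assumptions, and a real config would merge these entries into existing pipelines rather than replace them.

```python
def build_spanmetrics_fragment(dimensions: list[str], buckets: list[str]) -> dict:
    """Build the connector definition plus the traces -> connector -> metrics wiring."""
    return {
        "connectors": {
            "spanmetrics": {
                "histogram": {"explicit": {"buckets": list(buckets)}},
                "dimensions": [{"name": name} for name in dimensions],
            }
        },
        "service": {
            "pipelines": {
                # The connector consumes spans as an exporter of the traces pipeline...
                "traces": {"exporters": ["spanmetrics"]},
                # ...and emits metrics as a receiver of a metrics pipeline.
                "metrics/spanmetrics": {"receivers": ["spanmetrics"]},
            }
        },
    }
```

For example, `build_spanmetrics_fragment(["http.method"], ["10ms", "100ms", "1s"])` yields a fragment with one extra dimension and three explicit bucket boundaries.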
Available Resources
| Resource URI | Description |
|---|---|
| otel://system/health | Server health status: Kubernetes connectivity, OTel CRD availability, and server version |
| otel://collector/{namespace}/{name} | Full collector config: pipeline topology, receivers, processors, exporters, deployment mode, and status |
| otel://k8s-enrichment/{namespace}/{collector} | k8sattributes processor profile: extracted metadata, labels, annotations, pod association, and pipeline positions |
| otel://logs-profile/{namespace}/{collector} | Filelog receiver config: include/exclude paths, operators, safety analysis, and pipeline wiring |
| otel://spanmetrics/{namespace}/{collector} | SpanMetrics connector profile: dimensions, histogram config, pipeline wiring, and cardinality estimates |
| otel://instrumentation/{namespace}/{name} | Instrumentation CRD details: exporter endpoint, propagators, sampler, per-language specs, and resource attributes |
| otel://target-allocator/{namespace}/{name} | Target Allocator state: allocation strategy, ServiceMonitor/PodMonitor selectors, replicas, and prometheusCR status |
| otel://lang/{language} | Per-language instrumentation capabilities: signal support, auto-instrumentation availability, framework support, SDK package |
| otel://registry/languages | Full catalog of all supported languages with signal stability, auto-instrumentation, and framework support matrices |
Available Prompts
Guided workflow prompts that orchestrate multiple tools into step-by-step journeys:
| Prompt Name | Description | Parameters |
|---|---|---|
| otel_onboard_service | Guided workflow for onboarding a new service to OpenTelemetry: language detection, Instrumentation CR setup, annotation application, and verification | service_name, namespace, language |
| otel_investigate_pipeline | Guided workflow for investigating an OTel pipeline: processor ordering, filelog safety, sampling config, and enrichment profile | collector_name, namespace |
| otel_cardinality_audit | Guided workflow for auditing metric cardinality: detect high-cardinality dimensions and generate transform processor remediation YAML | collector_name, namespace |
| otel_sampling_review | Guided workflow for reviewing and optimizing sampling strategy across Instrumentation CRDs and collector config | collector_name, namespace |
| otel_security_audit | Guided workflow for auditing OTel security posture: eBPF privileges, init containers, RBAC, and sensitive attribute exposure | namespace |
Usage
Supported workflows with prompt examples and links to detailed guides:
| Workflow | Prompt Example | Documentation |
|---|---|---|
| Service Onboarding | "Onboard my Python app 'api-server' in the 'production' namespace to OpenTelemetry." | OTEL_ONBOARDING_TEST_GUIDE.md |
| Pipeline Investigation | "Investigate the OTel collector 'otel-gateway' in the 'monitoring' namespace." | OTEL_PIPELINE_INVESTIGATION_TEST_GUIDE.md |
| Cardinality Audit | "Audit metric cardinality for collector 'otel-metrics' in 'monitoring'." | OTEL_CARDINALITY_AUDIT_TEST_GUIDE.md |
| Sampling Review | "Review sampling for collector 'otel-traces' in 'monitoring'." | OTEL_SAMPLING_TEST_GUIDE.md |
| Security Audit | "Audit OTel security posture in the 'production' namespace." | OTEL_SECURITY_AUDIT_TEST_GUIDE.md |
See WORKFLOW_JOURNEYS.md for the full workflow reference and PROMPT_REFERENCE.md for natural-language prompts.
Project Structure
opentelemetry-mcp-server/
├── opentelemetry_mcp_server/ # Main package
│ ├── tools/ # MCP Tools (7 tool groups, 19 tools)
│ │ ├── discovery/ # Collector & service discovery
│ │ ├── collector/ # Collector CRD management
│ │ │ ├── collector_tools.py # Expert-level CRD create/replace
│ │ │ └── provision_tools.py # Intent-driven smart provisioning (NEW)
│ │ ├── instrumentation/ # Language lookup & CRD management
│ │ ├── validation/ # Pipeline validation & safety checks
│ │ ├── governance/ # Cardinality & eBPF governance
│ │ ├── sampling/ # Sampling inspection & toggle
│ │ └── spanmetrics/ # SpanMetrics connector management
│ ├── resources/ # MCP Resources (9 URIs)
│ │ └── otel_resources.py # Collector, enrichment, logs, spanmetrics,
│ │ # instrumentation, target allocator, language,
│ │ # and system health resources
│ ├── prompts/ # MCP Prompts (5 guided workflows)
│ │ └── otel_prompts.py # Onboarding, investigation, cardinality,
│ │ # sampling, and security audit prompts
│ ├── services/ # Business logic
│ │ ├── kubernetes_service.py # K8s API wrapper (CRDs, Deployments, Pods, Services)
│ │ ├── collector_config_service.py # Collector YAML parser & analyzer
│ │ └── collector_config_builder.py # Intent→config generation engine (NEW)
│ ├── server/ # FastMCP server setup
│ │ ├── core.py # Server creation
│ │ ├── bootstrap.py # Component initialization
│ │ └── middleware.py # 7-layer middleware stack
│ ├── models/ # Pydantic data models
│ ├── utils/ # Helpers
│ │ ├── k8s_labels.py # Annotation keys, eBPF agent labels, language detection
│ │ ├── yaml_helpers.py # YAML parsing, pipeline extraction
│ │ ├── pagination.py # Cursor-based pagination
│ │ └── duration.py # OTel duration string parsing (e.g., "2ms" → 2.0)
│ ├── static/ # Static data files
│ │ ├── otel_lang_registry.json # Language support matrix
│ │ └── OTEL_MCP_INSTRUCTIONS.md # MCP system instructions
│ ├── exceptions/ # Custom exception hierarchy
│ ├── config.py # Environment parsing & config dataclasses
│ └── main.py # Entry point
├── tests/ # Test suites (371 tests)
├── docs/ # Documentation
├── pyproject.toml # Package definitions (Python 3.12)
├── Dockerfile # Multi-stage Docker build
└── README.md # This documentation
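The tree lists utils/duration.py as parsing OTel duration strings such as "2ms" → 2.0, which suggests values are normalized to milliseconds. A minimal sketch of such a parser follows; the function name, the supported unit set, and the error behavior are assumptions rather than the module's actual interface.

```python
import re

# Conversion factors from each supported unit to milliseconds.
_UNIT_TO_MS = {"ns": 1e-6, "us": 1e-3, "ms": 1.0, "s": 1000.0, "m": 60000.0, "h": 3600000.0}
_DURATION = re.compile(r"^(\d+(?:\.\d+)?)(ns|us|ms|s|m|h)$")


def parse_duration_ms(text: str) -> float:
    """Convert a duration string like '2ms' or '1.5s' to milliseconds."""
    match = _DURATION.match(text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_TO_MS[unit]
```

Normalizing to one unit lets histogram bucket boundaries written with mixed units be compared and sorted directly.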
Roadmap
Shipped:
- Collector discovery with namespace filtering, label selectors, and pagination
- Deep collector inspection with full pipeline topology and raw YAML
- Language instrumentation lookup with framework-specific guidance
- Instrumentation CRD creation/patching with dry-run safety
- Deployment annotation for auto-instrumentation injection
- Instrumented service listing with annotation and init container detection
- Processor ordering validation against OTel best practices
- Filelog receiver safety auditing
- Target Allocator state inspection
- Collector topology recommendation engine
- SpanMetrics cardinality analysis and series estimation
- Transform processor YAML generation for attribute dropping
- eBPF instrumentation security auditing
- Sampling configuration inspection (head + tail cross-reference)
- Sampling strategy toggle with config patch generation
- SpanMetrics connector enablement with custom dimensions
- Collector CRD management (create/replace with dry-run safety)
- 4-tier language detection (annotations → images → names → runtime env vars)
- OTel duration string parsing in histogram bucket configs
- 5 guided workflow prompts for onboarding, investigation, cardinality, sampling, and security
- 7-layer middleware stack (rate limiting, response limiting, caching)
- Intent-driven collector provisioning (otel_provision_collector): auto-discovery, best-practice configs, smart mode selection
- Auto-discovery engine: existing collectors → K8s services → debug fallback
- 10 backend patterns: Jaeger, Tempo, Zipkin, Prometheus, Thanos, Mimir, VictoriaMetrics, OpenSearch, Elasticsearch, Loki
- Cluster auto-sizing (node count → resource recommendations)
- SpanMetrics connector wiring with correct pipeline topology
- Filelog safety built-in (self-exclusion, namespace scoping, checkpoints)
Coming next:
- Collector config diffing (before/after patch comparison)
- Alerting rule integration (OTel → PrometheusRule CRD bridging)
- Multi-cluster support
- Collector health metrics dashboard generation
See open issues for the full list of proposed features.
Contributing
Contributions are welcome. The process is straightforward:
- Fork the repo
- Create a branch (git checkout -b feature/TailSamplingPolicies)
- Make your changes and commit
- Push and open a PR
If you're considering something bigger, open an issue first so we can align on the approach.
FAQ
Which MCP clients work with this?
Any MCP-compatible client, including Claude Desktop, Cline, Cursor, and custom clients. Connect via http://localhost:8771/mcp for HTTP transport, or configure stdio for direct process communication.
Does this require the OpenTelemetry Operator?
The OTel Operator is required for Instrumentation CRD features (auto-instrumentation, annotation-based injection). Collector discovery and pipeline validation work with any OpenTelemetryCollector CRD in the cluster.
Does this modify my cluster?
Most tools are read-only. The exceptions are: otel_provision_collector (smart provisioning that auto-discovers and creates collectors), otel_patch_collector (expert-level CRD create/replace), otel_patch_instrumentation (creates/patches Instrumentation CRDs), and otel_annotate_deployment (adds annotations to Deployment pod templates, triggering rolling restarts). All four default to dry_run=True and require explicit opt-in to apply. All other tools generate YAML output only; they do NOT apply changes.
What's the difference between dry_run=True and dry_run=False?
With dry_run=True (default), mutating tools preview the change and return the spec/annotation without modifying any Kubernetes resources. With dry_run=False, the change is applied. The server always returns the generated spec so you can review before applying.
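The dry-run-first contract can be summarized in a few lines: build the spec, always return it, and only call the applying function on explicit opt-in. The sketch below is a generic illustration of that pattern; the function name, the `apply` callback, and the response fields are hypothetical and do not mirror the server's response schema.

```python
from typing import Callable


def mutate_with_dry_run(spec: dict, apply: Callable[[dict], None], dry_run: bool = True) -> dict:
    """Return the generated spec; apply it only when dry_run is explicitly False."""
    result = {"dry_run": dry_run, "applied": False, "spec": spec}
    if not dry_run:
        apply(spec)  # e.g. a call into the Kubernetes API
        result["applied"] = True
    return result
```

Because the default is `dry_run=True`, a caller that forgets the flag gets a preview rather than a cluster change.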
Can I use this without Kubernetes?
The server is designed for Kubernetes environments. Set K8S_ENABLED=false to disable Kubernetes integration, but most tools will return errors since they depend on CRD access. The language lookup tool (otel_lookup_instrumentation), topology recommendation (otel_recommend_collector_topology), and transform rule generation (otel_gen_drop_attribute_rules) work without K8s since they are pure recommendation engines.
What RBAC does `otel_provision_collector` create?
When provisioning a collector with dry_run=False, the tool automatically creates a ClusterRole and ClusterRoleBinding for the k8sattributes processor. The k8sattributes processor enriches telemetry with Kubernetes metadata (pod name, namespace, deployment name, etc.) and needs get, list, watch permissions on Pods, ReplicaSets, Namespaces, Nodes, and Jobs. The OTel Operator does not create these RBAC resources automatically. The created resources are labeled with app.kubernetes.io/managed-by: talkops-mcp for easy identification. In dry-run mode, the RBAC manifests are included in the rbac_resources field of the response for review.
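To make the permissions concrete, here is a sketch of a ClusterRole manifest that grants them, expressed as a Python dict. The function name and role name are hypothetical and the manifest the tool generates may differ. The API groups are the standard Kubernetes ones: Pods, Namespaces, and Nodes live in the core group, ReplicaSets in `apps`, and Jobs in `batch`.

```python
READ_VERBS = ["get", "list", "watch"]


def k8sattributes_cluster_role(name: str) -> dict:
    """Build a ClusterRole granting the read access the k8sattributes processor needs."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRole",
        "metadata": {
            "name": name,
            "labels": {"app.kubernetes.io/managed-by": "talkops-mcp"},
        },
        "rules": [
            {"apiGroups": [""], "resources": ["pods", "namespaces", "nodes"], "verbs": READ_VERBS},
            {"apiGroups": ["apps"], "resources": ["replicasets"], "verbs": READ_VERBS},
            {"apiGroups": ["batch"], "resources": ["jobs"], "verbs": READ_VERBS},
        ],
    }
```

A matching ClusterRoleBinding would then bind this role to the collector's ServiceAccount.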
Troubleshooting
Kubernetes Connection Issues
- Verify K8S_ENABLED=true and your kubeconfig is accessible.
- Load the otel://system/health resource to check connectivity status.
- Run the MCP server with uv run opentelemetry-mcp-server.
- The server will use your active kubeconfig context by default.
OTel Operator / CRD Issues
- Ensure the OpenTelemetry Operator is installed: kubectl get crd opentelemetrycollectors.opentelemetry.io.
- Verify OTEL_CRD_GROUP and OTEL_CRD_API_VERSION match your operator version.
- For Instrumentation CRDs, verify: kubectl get crd instrumentations.opentelemetry.io.
- Check RBAC: the server's service account needs get, list, watch, create, patch on OTel CRDs.
No Collectors Found
- Run otel_list_collectors() without namespace filter to search all namespaces.
- Check if collectors exist: kubectl get opentelemetrycollectors --all-namespaces.
- Verify the OTEL_COLLECTOR_PLURAL env var matches your CRD plural name.
Auto-Instrumentation Not Working
- Verify an Instrumentation CR exists in the target namespace.
- Check that the Deployment has the correct annotation (e.g., instrumentation.opentelemetry.io/inject-python: "true").
- Look for init container injection: kubectl describe pod <pod-name>.
- Use otel_list_instrumented_services to diagnose annotation vs. injection mismatches.
Security Considerations
- Never expose the MCP server to the public internet without proper authentication.
- otel_provision_collector creates OpenTelemetryCollector CRDs: always review the dry_run output before setting dry_run=False. The tool auto-discovers backends and generates configs, so verify the discovered endpoints are correct.
- otel_patch_collector creates or replaces OpenTelemetryCollector CRDs: review the spec before setting dry_run=False.
- otel_patch_instrumentation creates real Kubernetes CRDs: review the spec before setting dry_run=False.
- otel_annotate_deployment modifies Deployments, which triggers a rolling restart of all pods.
- eBPF instrumentation pods may run with elevated privileges; use otel_analyze_ebpf_footprint to audit.
License
Apache 2.0 — see LICENSE.
Contact
TalkOps AI — github.com/talkops-ai
Project: github.com/talkops-ai/talkops-mcp
Discord: Join the community
Acknowledgments
- Model Context Protocol for enabling AI-native tool interfaces.
- FastMCP for the Python MCP server framework.
- OpenTelemetry for the industry-standard observability framework.
- Kubernetes for container orchestration APIs.
- OpenTelemetry Operator for Kubernetes-native OTel lifecycle management.