
MCP Observability Server

A Model Context Protocol (MCP) server that enables Claude to query logs from multiple observability platforms simultaneously. Perfect for SRE workflows, incident investigation, and distributed tracing.

Supported Platforms

  • New Relic - Query logs using NRQL
  • Azure Application Insights - Query logs using Kusto Query Language (KQL)

Features

  • 🔍 Unified Search - Search across all platforms with a single query
  • 🎯 Severity Filtering - Filter by log levels (debug, info, warning, error, critical)
  • 🔗 Distributed Tracing - Find all logs related to a trace ID across platforms
  • ⚡ Concurrent Queries - Queries all providers in parallel for fast results
  • 📊 Recent Errors - Quick access to recent error logs across all systems
  • 🏥 Health Checks - Verify connectivity to all configured providers
  • 🤖 Guided Workflows - Pre-built prompts for incident investigation, deployment validation, and root cause analysis
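
The concurrent-query behavior can be pictured with a short asyncio sketch. The provider coroutines below are illustrative stubs, not the server's real code; the point is the `asyncio.gather` fan-out with `return_exceptions=True`, which lets one unhealthy provider fail without cancelling the others:

```python
import asyncio

# Hypothetical provider query functions -- the real providers live in
# src/mcp_observability/providers/; these stubs just illustrate the fan-out.
async def query_newrelic(query: str) -> list[dict]:
    await asyncio.sleep(0.01)  # stands in for an NRQL API call
    return [{"provider": "newrelic", "message": "timeout in checkout"}]

async def query_azure(query: str) -> list[dict]:
    await asyncio.sleep(0.01)  # stands in for a KQL API call
    return [{"provider": "azure", "message": "timeout in checkout"}]

async def query_all(query: str) -> list[dict]:
    # Fan out to every enabled provider concurrently; a failing provider
    # yields an exception object instead of cancelling the other tasks.
    results = await asyncio.gather(
        query_newrelic(query), query_azure(query), return_exceptions=True
    )
    merged: list[dict] = []
    for result in results:
        if isinstance(result, Exception):
            continue  # skip unhealthy providers, keep partial results
        merged.extend(result)
    return merged

logs = asyncio.run(query_all("timeout"))
```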

Installation

From PyPI

pip install mcp-observability-server

From Source

git clone https://github.com/yourusername/mcp-observability-server.git
cd mcp-observability-server
pip install -e .

Configuration

1. Create Configuration File

Copy the example configuration:

cp config.yaml.example config.yaml

Edit config.yaml with your credentials:

providers:
  newrelic:
    enabled: true
    api_key: ${NEW_RELIC_API_KEY}
    account_id: "1234567"
    region: "US"
  
  azure:
    enabled: true
    workspace_id: ${AZURE_WORKSPACE_ID}
    client_id: ${AZURE_CLIENT_ID}
    client_secret: ${AZURE_CLIENT_SECRET}
    tenant_id: ${AZURE_TENANT_ID}
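
The `${VAR}` placeholders suggest the config loader substitutes environment variables at load time. A minimal sketch of that expansion, assuming unset variables are left as-is (the real loader in `utils.py` may behave differently):

```python
import os
import re

def expand_env(value: str) -> str:
    # Replace ${VAR} placeholders with environment variable values;
    # leave the placeholder untouched when the variable is not set.
    return re.sub(
        r"\$\{(\w+)\}",
        lambda m: os.environ.get(m.group(1), m.group(0)),
        value,
    )

os.environ["NEW_RELIC_API_KEY"] = "NRAK-example"
print(expand_env("${NEW_RELIC_API_KEY}"))  # -> NRAK-example
```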

2. Set Environment Variables

Copy and configure environment variables:

cp .env.example .env

Edit .env with your actual credentials.

3. Configure Claude Desktop

Add to your Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):

{
  "mcpServers": {
    "observability": {
      "command": "python",
      "args": ["-m", "mcp_observability.server", "/path/to/config.yaml"]
    }
  }
}

Usage

Once configured, you can ask Claude to query your logs:

Example Queries

Search for errors in the last hour:

Show me all errors from the last hour across all platforms

Search specific text:

Find logs containing "timeout" from the last 30 minutes

Filter by service:

Show me warning and error logs from the api-gateway service in the last 2 hours

Distributed tracing:

Find all logs related to trace ID abc123-def456

Recent errors:

What errors have occurred in the last 15 minutes?

Available Tools

The server exposes these tools to Claude:

query_logs

Search logs across all platforms with flexible filtering.

Parameters:

  • start_time (required) - ISO format or relative (e.g., "1h", "30m", "2d")
  • end_time (optional) - Defaults to now
  • query (optional) - Text to search for
  • severity (optional) - Array of severity levels
  • service_name (optional) - Filter by service
  • limit (optional) - Max results (default: 100)
  • providers (optional) - Specific providers to query

get_recent_errors

Quick access to recent error and critical logs.

Parameters:

  • minutes (optional) - Look back period (default: 60)
  • limit (optional) - Max results per provider (default: 100)
  • service_name (optional) - Filter by service

search_by_trace_id

Find all logs associated with a distributed trace.

Parameters:

  • trace_id (required) - The trace ID to search for
  • start_time (optional) - Defaults to 24 hours ago
  • end_time (optional) - Defaults to now

health_check

Verify connectivity to all configured providers.

Guided Workflows (Prompts)

The server provides guided prompts for common SRE workflows. Prompts chain multiple tools together and provide structured analysis frameworks.

investigate-incident

Systematic incident investigation workflow.

Use for: Active production incidents requiring thorough investigation
Parameters:

  • service_name (optional) - Service to investigate
  • time_period (default: "1h") - Investigation time window
  • severity_threshold (default: "error") - Minimum severity

Example:

Use the investigate-incident prompt for api-gateway service

Workflow: Recent errors → Pattern analysis → Trace investigation → Health checks → Summary with recommendations

health-check-report

Generate comprehensive health status report.

Use for: Daily health checks, system status overviews
Parameters:

  • time_period (default: "24h") - Error statistics period
  • include_metrics (default: true) - Include detailed metrics

Example:

Generate a health check report

Workflow: Provider health → Error analysis → Service catalog → Active traces → Recommendations

post-deployment-check

Validate deployment health by comparing before/after metrics.

Use for: Post-deployment validation, CI/CD pipelines
Parameters:

  • service_name (required) - Deployed service name
  • deployment_time (optional) - When deployment occurred
  • lookback_minutes (default: 30) - Baseline comparison period

Example:

Run a post-deployment check for user-service

Workflow: Current errors → Baseline comparison → New error detection → Trace analysis → Health recommendation (PROCEED/MONITOR/ROLLBACK)
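
The final PROCEED/MONITOR/ROLLBACK verdict could be approximated with a toy decision rule like the one below. The thresholds are assumptions for illustration; the prompt's real analysis is performed by the model, not by fixed cutoffs:

```python
def deployment_recommendation(baseline_errors: int, current_errors: int,
                              new_error_types: int) -> str:
    # Toy thresholds (assumptions): any brand-new error type, or a 2x jump
    # in error volume, argues for a rollback; any increase warrants monitoring.
    if new_error_types > 0 or current_errors > 2 * max(baseline_errors, 1):
        return "ROLLBACK"
    if current_errors > baseline_errors:
        return "MONITOR"
    return "PROCEED"

print(deployment_recommendation(baseline_errors=5, current_errors=4,
                                new_error_types=0))  # PROCEED
```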

trace-flow-analysis

Analyze distributed trace execution flow and timing.

Use for: Debugging distributed systems, understanding request flow
Parameters:

  • trace_id (required) - Trace ID to analyze
  • include_timing (default: true) - Include timing breakdown

Example:

Analyze trace flow for abc123-def456

Workflow: Timeline construction → Service chain mapping → Timing analysis → Error detection → Bottleneck identification → Root cause
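
Timeline construction and timing analysis come down to sorting a trace's log entries by timestamp and measuring the gaps between services. A sketch with made-up entries (the field names are assumptions):

```python
from datetime import datetime

# Hypothetical log entries for one trace, as the query tools might return them.
logs = [
    {"service": "billing", "timestamp": "2026-02-13T10:00:00.450"},
    {"service": "api-gateway", "timestamp": "2026-02-13T10:00:00.000"},
    {"service": "user-service", "timestamp": "2026-02-13T10:00:00.120"},
]

# Order by timestamp, then report the gap between consecutive services
# to surface where time is being spent.
timeline = sorted(logs, key=lambda entry: entry["timestamp"])
for prev, cur in zip(timeline, timeline[1:]):
    gap_ms = (datetime.fromisoformat(cur["timestamp"])
              - datetime.fromisoformat(prev["timestamp"])).total_seconds() * 1000
    print(f"{prev['service']} -> {cur['service']}: {gap_ms:.0f} ms")
```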

root-cause-analysis

Deep root cause investigation for complex failures.

Use for: Finding originating causes, cascading failure analysis
Parameters:

  • trace_id (optional) - Specific trace to investigate
  • error_pattern (optional) - Known error pattern
  • time_window (default: "1h") - Investigation window

Example:

Perform root cause analysis for error "database connection timeout"

Workflow: Evidence gathering → Timeline building → Trace flow → Pattern recognition → Root cause formulation → Prevention recommendations

See Prompts README for detailed documentation.

Platform-Specific Configuration

New Relic

  1. Create an API key in New Relic (User > API Keys)
  2. Find your account ID in the URL or account dropdown
  3. Choose region: "US" or "EU"

Azure Application Insights

  1. Create a service principal in Azure AD
  2. Grant "Log Analytics Reader" role to the service principal
  3. Note the workspace ID, client ID, client secret, and tenant ID

Development

Setup Development Environment

# Clone repository
git clone https://github.com/yourusername/mcp-observability-server.git
cd mcp-observability-server

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Format code
black src/
ruff check src/

Testing with MCP Inspector

Test the server interactively using the MCP Inspector:

npx @modelcontextprotocol/inspector \
  uv \
  --directory /home/gagan/mcp-observability-server \
  run \
  mcp-observability \
  /home/gagan/mcp-observability-server/config.yaml

Or using the Python module directly:

npx @modelcontextprotocol/inspector \
  uv \
  --directory /home/gagan/mcp-observability-server \
  run \
  python \
  -m \
  mcp_observability.server \
  /home/gagan/mcp-observability-server/config.yaml

Project Structure

mcp-observability-server/
├── src/
│   └── mcp_observability/
│       ├── __init__.py
│       ├── server.py            # Main MCP server
│       ├── models.py            # Data models
│       ├── utils.py             # Utilities
│       ├── prompts/             # Guided workflow prompts
│       │   ├── __init__.py
│       │   ├── incident.py      # Incident investigation prompts
│       │   ├── health.py        # Health monitoring prompts
│       │   ├── deployment.py    # Deployment validation prompts
│       │   ├── trace_analysis.py # Trace flow analysis prompts
│       │   └── README.md        # Prompts documentation
│       └── providers/
│           ├── base.py          # Abstract base
│           ├── newrelic.py
│           └── azure.py
├── tests/
├── config.yaml.example
├── .env.example
└── pyproject.toml

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=mcp_observability

# Run specific test file
pytest tests/test_providers.py

Logging

The server includes comprehensive logging to help with debugging and monitoring:

Configure Log Level:

Set the MCP_LOG_LEVEL environment variable:

# In your .env file or environment
export MCP_LOG_LEVEL=DEBUG  # Options: DEBUG, INFO, WARNING, ERROR, CRITICAL

Log Levels:

  • DEBUG - Detailed diagnostic information (queries, parameters, API calls)
  • INFO - General informational messages (default)
  • WARNING - Warning messages for potential issues
  • ERROR - Error messages for failures
  • CRITICAL - Critical issues that prevent operation
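
A sketch of how `MCP_LOG_LEVEL` could map onto the standard `logging` setup; the server's own bootstrap may configure handlers differently, but the format below matches the example output further down:

```python
import logging
import os

# Resolve MCP_LOG_LEVEL to a logging constant, defaulting to INFO
# for missing or unrecognized values.
level_name = os.environ.get("MCP_LOG_LEVEL", "INFO").upper()
level = getattr(logging, level_name, logging.INFO)

logging.basicConfig(
    level=level,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
)
logger = logging.getLogger("mcp_observability.server")
logger.info("Starting MCP Observability Server")
```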

What Gets Logged:

  • Server initialization and configuration loading
  • Provider initialization and health checks
  • Tool invocations with parameters
  • Query execution and results
  • API calls to observability platforms
  • Errors and exceptions with stack traces

Example Log Output:

2026-02-13 10:30:15 - mcp_observability.server - INFO - Starting MCP Observability Server
2026-02-13 10:30:15 - mcp_observability.utils - INFO - Loading config from: config.yaml
2026-02-13 10:30:15 - mcp_observability.providers.newrelic - INFO - New Relic provider initialized for region: US
2026-02-13 10:30:20 - mcp_observability.server - INFO - Tool called: query_logs
2026-02-13 10:30:21 - mcp_observability.providers.newrelic - INFO - New Relic query returned 42 log(s)

Troubleshooting

Common Issues

"Provider unhealthy" in health check

  • Verify credentials are correct in config.yaml
  • Check environment variables are set
  • Ensure network connectivity to provider API

"No logs found"

  • Verify time range includes the period you're interested in
  • Check that log groups/workspaces are configured correctly
  • Ensure services are actually logging during the time period

Azure credentials error

  • Verify the service principal has the "Log Analytics Reader" role on the workspace
  • Check that the client ID, client secret, and tenant ID in .env are correct
  • Confirm the workspace ID matches the Log Analytics workspace you expect

Timeout errors

  • Increase timeout_seconds in provider config
  • Reduce limit to fetch fewer results
  • Check network connectivity

Enable debug logging

  • Set MCP_LOG_LEVEL=DEBUG in your environment or .env file
  • Check logs for detailed query information and API responses
  • Review stack traces for error details

Performance Tips

  1. Target specific providers - Use the providers parameter instead of querying every platform
  2. Use time ranges wisely - Shorter time ranges return faster
  3. Limit results - Start with smaller limits and increase if needed
  4. Filter by service - Reduces the data scanned across all platforms
  5. Use get_recent_errors - An optimized query for error investigation

Security Best Practices

  1. Never commit credentials - Use environment variables or secrets manager
  2. Rotate keys regularly - Set up key rotation for all platforms
  3. Principle of least privilege - Grant only read permissions needed
  4. Audit access - Monitor who's using the MCP server
  5. Secure config files - Restrict file permissions on config.yaml

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Ensure all tests pass
  5. Submit a pull request

License

MIT License - see LICENSE file for details

Roadmap

  • Support for more providers (Splunk, Elastic, Grafana Loki)
  • Advanced query builders
  • Log analytics and pattern detection
  • Alerting integration
  • Performance metrics collection
  • Custom query templates
  • Multi-account support per provider

Acknowledgments

Built with the Model Context Protocol by Anthropic.
