
Langfuse MCP server with built-in analytics, multi-project routing, and Google OAuth. Token percentiles, accuracy metrics, failure detection, cost breakdowns, session analytics, latency analysis, context breach scanning — plus a hosted-remote deployment story.


Langfuse MCP Server

License: MIT Python 3.10+

Model Context Protocol server for Langfuse observability. Query traces, analyze accuracy, detect failures, track costs, debug latency, manage prompts and datasets.

56 tools across data access and analytics. Multi-project support so one instance can serve several Langfuse projects. Works with Claude Code, Codex CLI, Cursor, and any MCP-compatible client.

Why this MCP server?

Comparison with official Langfuse MCP (as of March 2026):

Capability This server Official Langfuse MCP
Traces & Observations Yes No
Sessions & Users Yes No
Exception Tracking Yes No
Prompt Management Yes Yes
Dataset Management Yes No
Annotation Queues Yes No
Scores v2 API Yes No
Score Write-back Yes No
Multi-project support Yes No
Accuracy Metrics Yes No
Failure Detection Yes No
Token Percentiles Yes No
Cost Breakdown Yes No
Latency Analysis Yes No
Session Analytics Yes No
Context Breach Scanning Yes No
User Group Aggregation Yes No

The official MCP focuses on prompt management. This server provides a full observability and analytics toolkit — traces, observations, sessions, scores, exceptions, prompts, datasets, annotation queues, plus 9 built-in analytics tools that compute insights server-side and return LLM-sized summaries. Multi-project routing lets a single instance serve several Langfuse projects behind one connector URL.


Quick Start

1. Get your API keys

  • Langfuse Cloud: cloud.langfuse.com → Settings → API Keys
  • Self-hosted: Your Langfuse instance → Settings → API Keys. Set LANGFUSE_HOST to your instance URL (e.g., https://langfuse.yourcompany.com)

2. Add the MCP server

Claude Code

claude mcp add \
  -e LANGFUSE_PUBLIC_KEY=pk-lf-... \
  -e LANGFUSE_SECRET_KEY=sk-lf-... \
  -e LANGFUSE_HOST=https://cloud.langfuse.com \
  --scope project \
  langfuse-mcp -- uvx langfuse-mcp-server

Codex CLI

codex mcp add langfuse-mcp \
  --env LANGFUSE_PUBLIC_KEY=pk-lf-... \
  --env LANGFUSE_SECRET_KEY=sk-lf-... \
  --env LANGFUSE_HOST=https://cloud.langfuse.com \
  -- uvx langfuse-mcp-server

Cursor

Add to .cursor/mcp.json:

{
  "mcpServers": {
    "langfuse-mcp": {
      "command": "uvx",
      "args": ["langfuse-mcp-server"],
      "env": {
        "LANGFUSE_PUBLIC_KEY": "pk-lf-...",
        "LANGFUSE_SECRET_KEY": "sk-lf-...",
        "LANGFUSE_HOST": "https://cloud.langfuse.com"
      }
    }
  }
}

3. Verify

Restart your CLI, then test with /mcp (Claude Code) or codex mcp list (Codex).

Manual install (alternative to uvx)

pip install langfuse-mcp-server
langfuse-mcp-server

Hosting as a remote service

Run as a long-lived HTTP service so multiple users connect to a single instance — required for Claude.ai custom Connectors, and useful for team-wide access without distributing Langfuse API keys per user.

Enabled via env vars; no code changes.

Minimum setup

MCP_TRANSPORT=streamable-http
MCP_BASE_URL=https://mcp.yourcompany.com
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_HOST=https://your-langfuse-instance.example

Without OAuth env vars, the endpoint is unauthenticated — suitable only for local testing. See Google OAuth setup below for production.

Docker

A production-ready Dockerfile is checked into the repo (non-root user, pinned base, .dockerignore to prevent secret leakage). Each tagged release auto-publishes a multi-arch image to GitHub Container Registry via .github/workflows/docker-publish.yml.

Pull the published image:

docker pull ghcr.io/drishtantkaushal/langfusemcp:latest

Or build from source:

docker build -t langfuse-mcp .

Run (all secrets injected via -e, never baked into the image):

docker run -d \
  --name langfuse-mcp \
  --restart unless-stopped \
  -p 8000:8000 \
  -e MCP_TRANSPORT=streamable-http \
  -e MCP_BASE_URL=https://mcp.yourcompany.com \
  -e LANGFUSE_PUBLIC_KEY=pk-lf-... \
  -e LANGFUSE_SECRET_KEY=sk-lf-... \
  -e LANGFUSE_HOST=https://cloud.langfuse.com \
  -e GOOGLE_CLIENT_ID=... \
  -e GOOGLE_CLIENT_SECRET=... \
  -e ALLOWED_EMAIL_DOMAINS=yourcompany.com \
  ghcr.io/drishtantkaushal/langfusemcp:latest

Reverse proxy

Terminate TLS in front (nginx, Caddy, Cloudflare). MCP endpoint is at /mcp/ (trailing slash). Because responses stream, the proxy must:

  • Disable response buffering — nginx: proxy_buffering off;
  • Allow read timeout ≥ 5 minutes — some analytics queries legitimately run several minutes
  • Speak HTTP/1.1 with keepalive upstream
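
A minimal nginx server block satisfying these three requirements might look like the sketch below. Hostnames, certificate paths, and the upstream port 8000 are placeholders — adapt them to your deployment.

```nginx
server {
    listen 443 ssl;
    server_name mcp.yourcompany.com;

    # Placeholder certificate paths
    ssl_certificate     /etc/ssl/certs/mcp.pem;
    ssl_certificate_key /etc/ssl/private/mcp.key;

    location /mcp/ {
        proxy_pass http://127.0.0.1:8000;
        proxy_buffering off;            # streamed responses must not be buffered
        proxy_read_timeout 300s;        # long-running analytics queries
        proxy_http_version 1.1;         # HTTP/1.1 upstream
        proxy_set_header Connection ""; # enable upstream keepalive
        proxy_set_header Host $host;
    }
}
```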

Google OAuth

In your Google Cloud project:

  1. APIs & Services → OAuth consent screen
    • User type: Internal (restricts sign-in to your Google Workspace domain)
    • Scopes: openid, https://www.googleapis.com/auth/userinfo.email
  2. Credentials → Create OAuth client ID → Web application
    • Authorized redirect URI: https://{your-base-url}/auth/callback
    • Copy the Client ID and Client Secret

Set:

GOOGLE_CLIENT_ID=....apps.googleusercontent.com
GOOGLE_CLIENT_SECRET=GOCSPX-...

OAuth activates when GOOGLE_CLIENT_ID, GOOGLE_CLIENT_SECRET, and MCP_BASE_URL are all set. With an Internal consent screen, Google rejects non-Workspace sign-ins at the identity layer — the server never sees those attempts.

Optional email allowlist

For narrower control than "anyone in the Workspace":

# either, or both
ALLOWED_EMAIL_DOMAINS=yourcompany.com
ALLOWED_EMAILS=alice@yourcompany.com,bob@yourcompany.com

When set, every tool call verifies the caller's email_verified claim and checks membership before proceeding. When unset, the server trusts whatever the OAuth provider returns.
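
The check reduces to this logic — sketched here for illustration; the function and parameter names are not the server's actual internals:

```python
def is_allowed(claims: dict, allowed_emails: set, allowed_domains: set) -> bool:
    """Illustrative allowlist check mirroring the behavior described above."""
    # No allowlist configured: trust whatever the OAuth provider returned
    if not allowed_emails and not allowed_domains:
        return True
    # Allowlist set: require a verified email, then check membership
    if not claims.get("email_verified"):
        return False
    email = claims.get("email", "").lower()
    domain = email.rsplit("@", 1)[-1]
    return email in allowed_emails or domain in allowed_domains
```

For example, `is_allowed({"email": "alice@yourcompany.com", "email_verified": True}, set(), {"yourcompany.com"})` passes, while an unverified email or a foreign domain is rejected.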

Adding to Claude.ai

Once hosted at https://mcp.yourcompany.com:

  1. Claude.ai → Settings → Connectors → Add custom connector
  2. Remote MCP server URL: https://mcp.yourcompany.com/mcp/
  3. Leave the OAuth Client ID / Secret fields empty — the server uses Dynamic Client Registration; those fields are for a different deployment pattern.
  4. Click Add → Google sign-in popup → done.

Verifying the deploy

Auth enabled, expect 401:

curl -i -X POST https://mcp.yourcompany.com/mcp/ \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"curl","version":"0"}}}'

OAuth metadata endpoint returns JSON (used by Claude.ai to auto-register):

curl https://mcp.yourcompany.com/.well-known/oauth-authorization-server

Liveness/readiness probe — unauthenticated GET /health returns HTTP 200 with {"status": "ok"}, suitable for Kubernetes probes:

curl -i https://mcp.yourcompany.com/health

Multi-project support

A single server instance can route to multiple Langfuse projects. Every tool accepts an optional project argument; when omitted, the server-configured default is used. Call list_projects to discover what's available.

Configuring projects

Declare each project via indexed env vars. Project names are data, not part of variable names — use whatever scheme you like.

LANGFUSE_PROJECT_1_NAME=production
LANGFUSE_PROJECT_1_PUBLIC_KEY=pk-lf-...
LANGFUSE_PROJECT_1_SECRET_KEY=sk-lf-...
LANGFUSE_PROJECT_1_HOST=https://cloud.langfuse.com

LANGFUSE_PROJECT_2_NAME=staging
LANGFUSE_PROJECT_2_PUBLIC_KEY=pk-lf-...
LANGFUSE_PROJECT_2_SECRET_KEY=sk-lf-...
LANGFUSE_PROJECT_2_HOST=https://cloud.langfuse.com

LANGFUSE_DEFAULT_PROJECT=production

Usage from the client

Claude: "Show me failing traces in production today."
→ fetch_traces(project="production", ...) routed to project 1's credentials.

Claude: "Compare that with staging."
→ fetch_traces(project="staging", ...) routed to project 2's credentials.

Each project has its own cache, rate limiter, and connection pool. Claude.ai sees one connector; users authenticate once via OAuth and can query any configured project within the session.

Single-project (legacy) mode

If LANGFUSE_PROJECT_1_NAME is not set, the server falls back to the legacy LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY / LANGFUSE_HOST vars and registers them as a project called default. Existing deployments keep working without changes.
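
The resolution order can be sketched as follows (illustrative, not the server's actual code — it takes an env mapping so the behavior is easy to see):

```python
def load_projects(env: dict) -> dict:
    """Parse indexed LANGFUSE_PROJECT_{N}_* vars, falling back to legacy vars."""
    projects = {}
    n = 1
    while f"LANGFUSE_PROJECT_{n}_NAME" in env:
        name = env[f"LANGFUSE_PROJECT_{n}_NAME"]
        projects[name] = {
            "public_key": env[f"LANGFUSE_PROJECT_{n}_PUBLIC_KEY"],
            "secret_key": env[f"LANGFUSE_PROJECT_{n}_SECRET_KEY"],
            "host": env.get(f"LANGFUSE_PROJECT_{n}_HOST", "https://cloud.langfuse.com"),
        }
        n += 1
    if not projects:
        # Legacy single-project mode: register under the name "default"
        projects["default"] = {
            "public_key": env["LANGFUSE_PUBLIC_KEY"],
            "secret_key": env["LANGFUSE_SECRET_KEY"],
            "host": env.get("LANGFUSE_HOST", "https://cloud.langfuse.com"),
        }
    return projects
```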


Configuration

Env Variable Default Description
LANGFUSE_PUBLIC_KEY (required) Langfuse public API key
LANGFUSE_SECRET_KEY (required) Langfuse secret API key
LANGFUSE_HOST https://cloud.langfuse.com Langfuse instance URL (cloud or self-hosted)
LANGFUSE_INTERNAL_DOMAINS "" Comma-separated internal domains to exclude from analytics (e.g., mycompany.com,test.com). Applies when using group_by='domain'.
LANGFUSE_MCP_READ_ONLY false Disable write operations (score_traces, create_dataset, etc.)
LANGFUSE_PAGE_LIMIT 100 Traces per API page
LANGFUSE_PROJECT_{N}_NAME (unset) Multi-project: name for project N (e.g. production). See Multi-project.
LANGFUSE_PROJECT_{N}_PUBLIC_KEY (unset) Public key for project N.
LANGFUSE_PROJECT_{N}_SECRET_KEY (unset) Secret key for project N.
LANGFUSE_PROJECT_{N}_HOST https://cloud.langfuse.com Host URL for project N.
LANGFUSE_DEFAULT_PROJECT first configured Default project name used when a tool call omits project.
MCP_TRANSPORT stdio stdio or streamable-http. HTTP mode listens on a port instead of stdin/stdout. See Hosting.
MCP_HOST 0.0.0.0 Bind address when MCP_TRANSPORT=streamable-http.
MCP_PORT 8000 Port when MCP_TRANSPORT=streamable-http.
MCP_BASE_URL (unset) Public base URL of the hosted server. Required for Google OAuth.
GOOGLE_CLIENT_ID (unset) Google OAuth client ID. OAuth activates when all three Google vars are set.
GOOGLE_CLIENT_SECRET (unset) Google OAuth client secret.
ALLOWED_EMAILS (unset) Comma-separated emails allowed to call tools. Requires OAuth.
ALLOWED_EMAIL_DOMAINS (unset) Comma-separated email domains allowed to call tools. Requires OAuth.

Tools

Analytics (9 tools)

Tools that compute insights server-side and return compact summaries. These go beyond raw data access — they aggregate, detect patterns, and compute statistics so the LLM can reason over results without hitting context window limits.

Tool Description Key Parameters
aggregate_by_group Aggregate trace metrics by user group. Returns per-group: trace count, unique sessions, unique users, accuracy rate, average latency, total cost. group_by (name/userId/domain/tag), time_range, top_n, exclude_internal
compute_accuracy Compute accuracy from feedback scores. Accuracy = correct / (correct + incorrect). Supports grouping and time bucketing for trend analysis. group_by, bucket_by (week/day), score_name, time_range
detect_failures Detect LLM output quality failures using pattern matching ("unable to", "I can't", etc.) and negative feedback scores. NOT Python exceptions — use find_exceptions for those. group_by, include_examples, max_examples, time_range
compute_token_percentiles Compute token usage percentiles (TP50/TP90/TP95/TP99) at trace level. Fetches generation observations for accurate per-trace token counts. group_by, percentiles, time_range
detect_context_breaches Scan for traces where any single generation exceeds a token threshold. Catches context window overflow causing degraded LLM performance or silent truncation. threshold (default 256000), check_per_generation, time_range
analyze_sessions Analyze multi-turn session behavior. Returns session count, depth distribution (single vs multi-turn), engagement metrics, and session-level cost/latency. group_by, time_range
estimate_costs Compute cost breakdown using Langfuse's built-in totalCost field (model-aware, computed by Langfuse). Groups by user, agent, or time bucket. group_by, bucket_by (week/day), time_range
analyze_latency Analyze latency distribution at trace level and optionally per LLM generation. Identifies which model is the bottleneck. group_by, percentiles, include_per_generation, time_range
score_traces Write scores back to Langfuse. Use after analysis to annotate traces with findings — tag failures for review, mark high-quality traces for dataset creation. trace_ids, score_name, score_value, comment
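
As a concrete example of what these tools compute server-side, trace-level token percentiles reduce to a nearest-rank calculation over per-trace token totals. This is a sketch of the idea, not the server's exact implementation:

```python
def percentile(values: list, p: float) -> int:
    """Nearest-rank percentile over per-trace token totals."""
    ordered = sorted(values)
    rank = max(1, round(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical per-trace token totals
tokens = [800, 900, 950, 1100, 1200, 2100, 2500, 3000, 15000, 40000]
tp = {f"TP{p}": percentile(tokens, p) for p in (50, 90, 95, 99)}
```

Only these compact summary numbers are returned to the LLM, not the raw per-trace data.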

Data Access

Full Langfuse API coverage for querying and managing your observability data.

Traces

Tool Description
fetch_traces List traces with filters — user ID, name, tags, time range, ordering. Returns paginated results.
fetch_trace Get a single trace by ID with full details including all observations (spans, generations, events).
diff_traces Compare two traces side-by-side (name, user, latency, cost, tags, release, version).

Observations

Tool Description
fetch_observations List observations with filters — trace ID, type (GENERATION/SPAN/EVENT), name, time range.
fetch_observation Get a single observation by ID. Returns input/output, token usage, model, latency, and cost.

Sessions

Tool Description
fetch_sessions List sessions with optional time filters.
get_session_details Get full details of a session including all its traces.
get_user_sessions Get sessions for a specific user. Fetches user's traces and extracts unique sessions.

Errors

Tool Description
find_exceptions Find observations with error status. For LLM output quality issues, use detect_failures instead.
get_exception_details Get full error details for a trace — returns all observations with error status highlighted.
get_error_count Get total error count within a time period.

Scores

Tool Description
fetch_scores List scores/evaluations with filters — trace ID, score name, time range.
list_scores_v2 v2 Scores API with richer filters (session ID, dataset run ID, queue ID, config ID, operator/value, etc.).
get_score_v2 Get a single score by ID via the v2 Scores API.

Prompts

Tool Description
list_prompts List all prompts in the project with optional name filter.
get_prompt Fetch a specific prompt by name, version, or label.
get_prompt_unresolved Fetch a prompt with placeholders/dependencies intact (debugging prompt composition).
create_text_prompt Create a new text prompt version with optional labels and model config.
create_chat_prompt Create a new chat prompt version with message array and optional config.
update_prompt_labels Update labels for a specific prompt version (e.g., promote to "production").

Datasets

Tool Description
list_datasets List all datasets in the project.
get_dataset Get metadata for a specific dataset.
list_dataset_items List items in a dataset with pagination.
get_dataset_item Get a single dataset item by ID.
create_dataset Create a new dataset with optional description and metadata.
create_dataset_item Create or upsert a dataset item. Supports linking to source traces.
delete_dataset_item Delete a dataset item by ID.

Annotation Queues

Tool Description
list_annotation_queues List all annotation queues in the project.
create_annotation_queue Create a new annotation queue with attached score configs.
get_annotation_queue Get a queue by ID.
list_annotation_queue_items List items in a queue (optionally filtered by status).
get_annotation_queue_item Get a queue item by ID.
create_annotation_queue_item Add a trace or observation to a queue for review.
update_annotation_queue_item Change a queue item's status (PENDING / COMPLETED).
delete_annotation_queue_item Remove an item from a queue.
create_annotation_queue_assignment Assign a reviewer to a queue.
delete_annotation_queue_assignment Remove a reviewer from a queue.

Metrics

Tool Description
get_daily_metrics Langfuse's pre-aggregated daily rollup (trace count, cost, tokens per day). Faster than per-trace aggregation for long windows.

Users

Tool Description
list_users Top users by trace count over a time window (defaults to last 30 days). Wraps the Langfuse /metrics query API.

Comments

Tool Description
list_comments List comments attached to traces/observations/sessions/prompts, with filters.
get_comment Get a single comment by ID.
create_comment Create a markdown comment on a trace/observation/session/prompt.

Models

Tool Description
list_models List model definitions in Langfuse's models registry (pricing + tokenizer config).
get_model Get a single model definition by ID.

Projects

Tool Description
list_projects Discovery: returns the list of configured Langfuse projects and the default project.

Schema

Tool Description
get_data_schema Get the data schema for the Langfuse project — available fields and types for traces, observations, scores, sessions.

Sample Questions

Once connected, ask your AI assistant questions like these:

Agent & Pipeline Health

  • "Which agents failed the most this week?"
  • "What's the failure rate by agent name?"
  • "Which agent has the worst accuracy?"
  • "Show me the top 5 agents by trace volume"
  • "Are any agents consistently slower than others?"
  • "Compare all agents by accuracy, latency, and cost"

Accuracy & Quality

  • "What's our overall accuracy this week?"
  • "What's the accuracy trend by week for the last 30 days?"
  • "Compare accuracy across different agents"
  • "What's the daily accuracy breakdown?"
  • "Which users are getting the worst accuracy?"
  • "What percentage of traces have feedback scores?"

Failures & Debugging

  • "Show me failure examples from today"
  • "What are the most common failure patterns?"
  • "Which users are seeing the most failures?"
  • "What's the failure rate by agent?"
  • "Are failures increasing or decreasing this week vs last?"
  • "Show me traces where the LLM said 'unable to' or 'I can't'"

Token Usage

  • "What are the P90 and P99 token usage stats?"
  • "Which agents consume the most tokens?"
  • "Compare token usage across user groups"
  • "Are any users hitting unusually high token counts?"

Context Window Breaches

  • "Are any generations exceeding the 128K context window?"
  • "Show me traces with token usage above 200K per generation"
  • "What's the breach severity distribution?"
  • "Which users trigger the most context window breaches?"

Sessions & Engagement

  • "What's our multi-turn rate?"
  • "How deep are sessions on average?"
  • "Which users have the deepest sessions?"
  • "How many single-turn vs multi-turn sessions this week?"
  • "What's the average session cost?"

Cost

  • "How much are we spending per day this week?"
  • "What's the weekly cost trend for the last 30 days?"
  • "Which agent is the most expensive?"
  • "Which users are costing the most?"
  • "What's the average cost per trace?"

Latency

  • "What's the P95 latency?"
  • "Is latency getting worse over time?"
  • "Which model is the slowest?"
  • "Compare latency across agents"
  • "Show me per-generation latency breakdown by model"
  • "Which users are experiencing the highest latency?"

Annotation & Write-back

  • "Score all failing traces from today with 'needs-review'"
  • "Tag these trace IDs as 'high-quality' for dataset creation"
  • "Mark trace abc-123 with a score of 0 and comment 'hallucinated output'"

Lookups & Exploration

  • "Fetch the last 20 traces"
  • "Show me trace abc-123 with all its observations"
  • "List sessions for user alice@example.com"
  • "What errors happened in the last 24 hours?"
  • "How many errors occurred this week?"
  • "Show me all prompts in the project"
  • "List all datasets"
  • "What fields are available on traces and observations?"

Grouping Options

The group_by parameter controls how traces are segmented in analytics tools:

Value What it groups by When to use
name Trace/agent name (default) Compare performance across different agents or pipelines
userId Per-user breakdown Identify users with issues or high usage
domain Email domain extracted from userId Multi-tenant apps where users have email-based IDs (e.g., user@acme.com → acme.com)
tag Trace tags Compare across tagged environments, versions, or experiments
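
For the domain option, the grouping key is the part of userId after the "@". A hypothetical helper (the fallback bucket name for non-email IDs is an assumption, not the server's documented behavior):

```python
def domain_of(user_id: str) -> str:
    """Grouping key for group_by='domain': the lowercased email domain."""
    if "@" not in user_id:
        return "unknown"  # assumed bucket for non-email user IDs
    return user_id.rsplit("@", 1)[1].lower()
```

So `domain_of("user@acme.com")` groups the trace under `acme.com`.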

Selective Tool Loading

Load only the tool groups you need to reduce token overhead:

# Only load traces and analytics tools
LANGFUSE_TOOLS=traces,analytics langfuse-mcp-server

# Only load prompts and datasets
LANGFUSE_TOOLS=prompts,datasets langfuse-mcp-server

# In Claude Code
claude mcp add \
  -e LANGFUSE_PUBLIC_KEY=pk-lf-... \
  -e LANGFUSE_SECRET_KEY=sk-lf-... \
  -e LANGFUSE_TOOLS=traces,observations,analytics \
  langfuse-mcp -- uvx langfuse-mcp-server

Available groups:

Group Tools Count
traces fetch_traces, fetch_trace 2
observations fetch_observations, fetch_observation 2
sessions fetch_sessions, get_session_details, get_user_sessions 3
errors find_exceptions, get_exception_details, get_error_count 3
scores fetch_scores 1
prompts list_prompts, get_prompt, create_text_prompt, create_chat_prompt, update_prompt_labels 5
datasets list_datasets, get_dataset, list_dataset_items, get_dataset_item, create_dataset, create_dataset_item, delete_dataset_item 7
annotation_queues All 10 annotation queue tools 10
metrics get_daily_metrics 1
users list_users 1
comments list_comments, get_comment, create_comment 3
models list_models, get_model 2
projects list_projects 1
schema get_data_schema 1
analytics All 9 analytics tools 9

If LANGFUSE_TOOLS is not set, all 56 tools are loaded.


Read-Only Mode

Disable write operations (score_traces, create_dataset, create_dataset_item, delete_dataset_item, create_text_prompt, create_chat_prompt):

LANGFUSE_MCP_READ_ONLY=true

How it Compares

vs Official Langfuse MCP

Capability This server Official Langfuse MCP
Traces & Observations Yes No
Sessions & Users Yes No
Exception Tracking Yes No
Prompt Management Yes Yes
Dataset Management Yes No
Score Write-back Yes No
Selective Tool Loading Yes No
Accuracy Metrics Yes No
Failure Detection Yes No
Token Percentiles Yes No
Cost Breakdown Yes No
Latency Analysis Yes No
Session Analytics Yes No
Context Breach Scanning Yes No
User Group Aggregation Yes No

The official Langfuse MCP (5 tools) focuses on prompt management. This server provides full observability coverage plus 9 analytics tools.

vs Other Langfuse MCP Implementations

Capability This server Others
Data access (traces, observations, sessions) Yes Yes
Prompt & dataset management Yes Yes
Exception tracking Yes Yes
Annotation queues Yes Partial
Selective tool loading Yes Yes
Multi-project support Yes No
Accuracy metrics Yes No
LLM failure detection Yes No
Token percentiles (TP50/TP90/TP95/TP99) Yes No
Cost breakdown by group/time Yes No
Latency analysis with per-model breakdown Yes No
Multi-turn session analytics Yes No
Context window breach scanning Yes No
User/tenant group aggregation Yes No
Score write-back Yes No

Other implementations provide data access (fetching raw traces, observations, sessions) using synchronous HTTP clients. This server adds a compute layer — analytics tools that aggregate, detect patterns, and compute statistics server-side — plus an async architecture that parallelizes reads instead of fetching one request at a time.

Architecture This server Others
Async HTTP client Yes (httpx.AsyncClient) No (sync requests/httpx)
Concurrent observation fetching Yes (asyncio.gather) No (sequential per-trace)
TTL caching Yes (live 5min, historical 1hr) No
Adaptive rate limiting Yes (token bucket, 429 backoff) No (fixed sleep)
Batch observation queries Yes (with auto-fallback) No (N+1 per-trace)
Claude Code sub-agent Yes (.claude/agents/) No

vs Platform-Embedded AI (Braintrust Loop, LangSmith Insights, Arize Alyx)

Capability This server Platform AI assistants
Open source Yes No
Works with any MCP client Yes Platform-locked
Self-hosted Langfuse support Yes N/A
Real-time conversational Yes Varies (some batch-only)
Custom grouping/segmentation Yes Limited
Write-back to Langfuse Yes Platform-specific
Free Yes Paid tiers

Architecture

Why async httpx instead of the Langfuse SDK?

The Langfuse Python SDK is excellent for writing traces (it batches and sends asynchronously in the background). But for reading traces at scale — which is what an analytics MCP server does — the SDK has a limitation: its read API is synchronous, built on the requests library.

This server uses httpx.AsyncClient instead, which enables:

  • Concurrent observation fetching — fetch observations for 100 traces simultaneously via asyncio.gather, not one-by-one
  • Non-blocking pagination — paginate through thousands of traces without blocking the event loop
  • Rate-limited concurrency — asyncio.Semaphore + token bucket controls throughput without time.sleep() blocking

Measured impact: analyze_latency with per-generation breakdown dropped from 110s to 20s (5.5x faster) on a self-hosted instance with 2.4M daily observations.
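
The concurrency pattern those bullets describe looks roughly like this — a self-contained sketch with a stubbed fetch standing in for the real httpx.AsyncClient calls:

```python
import asyncio

async def fetch_observations_for(trace_id: str) -> list:
    # Stub standing in for an httpx.AsyncClient call to the Langfuse API
    await asyncio.sleep(0)
    return [f"obs-for-{trace_id}"]

async def fetch_all(trace_ids: list, max_concurrency: int = 20) -> dict:
    sem = asyncio.Semaphore(max_concurrency)  # caps in-flight requests

    async def bounded(tid: str):
        async with sem:
            return tid, await fetch_observations_for(tid)

    # All traces fetched concurrently, not one-by-one
    results = await asyncio.gather(*(bounded(t) for t in trace_ids))
    return dict(results)
```

`asyncio.run(fetch_all(["t1", "t2"]))` issues both fetches concurrently while the semaphore keeps total in-flight requests bounded.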

Caching strategy

Two-tier in-memory TTL cache using cachetools.TTLCache:

Data age TTL Rationale
Today's data 5 minutes Still changing, short cache
Historical data (before today) 1 hour Won't change, cache aggressively

The cache operates at the API page level. If you call aggregate_by_group then compute_accuracy for the same time range, the second call hits cache for all trace pages — only scores are fetched fresh.

Configure via LANGFUSE_CACHE_TTL and LANGFUSE_CACHE_TTL_HISTORICAL (seconds).
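
Tier selection reduces to a date comparison. A sketch using the defaults from the table above (the real server keys its cache per API page; this only shows the TTL choice):

```python
from datetime import date

LIVE_TTL = 300          # today's data: 5 minutes
HISTORICAL_TTL = 3600   # finished days: 1 hour

def ttl_for(day: date, today: date = None) -> int:
    """Pick the cache TTL for data covering the given day."""
    today = today or date.today()
    return LIVE_TTL if day >= today else HISTORICAL_TTL
```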

Rate limiting

A global token bucket rate limiter respects Langfuse API limits:

Instance type Default RPM Behavior
Self-hosted Unlimited (0) No artificial throttling. Full speed, limited only by your server.
Langfuse Cloud (Hobby) 30 req/min Conservative default for Hobby tier
Langfuse Cloud (Pro/Team) Set LANGFUSE_RATE_LIMIT_RPM=1000 Higher throughput for paid plans

On HTTP 429 responses, the limiter automatically halves the RPM and reads the Retry-After header. This means the server adapts to any rate limit — cloud or self-hosted — without manual configuration.
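
The adaptive behavior can be sketched as a token bucket that halves its refill rate on throttling. Class and method names here are illustrative, not the server's actual internals:

```python
import time

class AdaptiveBucket:
    """Token bucket that halves its RPM whenever the API returns HTTP 429."""

    def __init__(self, rpm: float):
        self.rpm = rpm
        self.tokens = rpm
        self.last = time.monotonic()

    def _refill(self) -> None:
        now = time.monotonic()
        self.tokens = min(self.rpm, self.tokens + (now - self.last) * self.rpm / 60)
        self.last = now

    def try_acquire(self) -> bool:
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

    def on_429(self, retry_after: float = None) -> float:
        """Halve throughput; honor Retry-After when the server provides it."""
        self.rpm = max(1.0, self.rpm / 2)
        return retry_after if retry_after is not None else 60 / self.rpm
```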

Observation fetching: batch vs concurrent

Analytics tools that need per-generation data (token percentiles, context breaches, latency breakdown) face the N+1 problem: one API call per trace to fetch its observations.

This server uses a two-step strategy:

  1. Try batch fetch — fetch ALL observations for the time range in one paginated call, group by traceId in memory
  2. If volume is too high (>5000 pages / 500K+ observations) — fall back to concurrent per-trace fetch using asyncio.gather with semaphore-controlled concurrency

This means the server handles both small projects (batch is faster) and large-scale deployments (concurrent targeted fetching avoids downloading millions of irrelevant observations).
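
The batch step amounts to one paginated scan grouped in memory, with the fallback signalled when volume exceeds the limit. A sketch with stubbed pages (the 5000-page threshold comes from the description above; the grouping itself is the interesting part):

```python
from collections import defaultdict

MAX_BATCH_PAGES = 5000  # beyond this, fall back to concurrent per-trace fetch

def group_by_trace(pages: list):
    """Step 1: bucket batch-fetched observation pages by traceId.

    Returns None when volume exceeds the batch limit, telling the caller
    to use step 2 (concurrent per-trace fetching) instead.
    """
    if len(pages) > MAX_BATCH_PAGES:
        return None
    grouped = defaultdict(list)
    for page in pages:
        for obs in page:
            grouped[obs["traceId"]].append(obs)
    return dict(grouped)
```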

Context isolation via sub-agent

The server ships with a Claude Code custom agent at .claude/agents/langfuse-analyst.md. When a user asks a Langfuse-related question, Claude Code can delegate to this agent, which:

  • Only loads Langfuse MCP tools (not other tools in the session)
  • Has a specialized system prompt with tool taxonomy and workflow patterns
  • Runs in an isolated context window, keeping the main conversation clean
  • Returns a summary to the parent conversation

This prevents 33 tool schemas (~5000 tokens) from polluting every conversation.


Contributing

See CONTRIBUTING.md for development setup, code style guidelines, and areas for contribution.

Security

See SECURITY.md for the security policy, vulnerability reporting, and API key handling.

Code of Conduct

See CODE_OF_CONDUCT.md.

License

MIT — see LICENSE.
