LLM & Agent Observability — structured tracing, Prometheus metrics, and OpenTelemetry export via Python decorators
Rastir
LLM & Agent Observability for Python
One decorator per framework. Full visibility. No monkey-patching.
What is Rastir?
Rastir gives you production observability for LLM agents — token usage, latency percentiles, cost tracking, tool call rates, error categories — as Prometheus metrics with Grafana dashboards.
Add one decorator to your LangGraph, CrewAI, LlamaIndex, ADK, or Strands workflow. Rastir auto-discovers LLMs, tools, and graph nodes inside the framework and wraps them for per-call tracing. No code rewrites. No vendor lock-in.
from rastir import configure, framework_agent

configure(service="my-app", push_url="http://localhost:8080")

@framework_agent
def run(graph_or_agent, prompt):
    return graph_or_agent.invoke(prompt)  # Works with any supported framework
That's it. @framework_agent auto-detects the framework from function arguments and instruments everything inside. Every LLM call, tool invocation, and node execution is now traced with metrics flowing to Prometheus.
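To make the auto-detection idea concrete, here is a minimal sketch of how a decorator could infer a framework from the module path of its arguments. It is not Rastir's implementation: `FRAMEWORK_PREFIXES`, `detect_framework`, and `framework_agent_sketch` are hypothetical names, and the module prefixes are assumptions about where each framework's classes live.

```python
from functools import wraps

# Hypothetical mapping from a module prefix to a framework label (assumed values).
FRAMEWORK_PREFIXES = {
    "langgraph": "langgraph",
    "crewai": "crewai",
    "llama_index": "llamaindex",
    "google.adk": "adk",
    "strands": "strands",
}


def detect_framework(*args, **kwargs):
    """Return a label for the first argument whose class module matches, else None."""
    for value in (*args, *kwargs.values()):
        module = type(value).__module__
        for prefix, label in FRAMEWORK_PREFIXES.items():
            if module == prefix or module.startswith(prefix + "."):
                return label
    return None


def framework_agent_sketch(func):
    """Record the detected framework on the wrapper, then call the function unchanged."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.detected = detect_framework(*args, **kwargs)
        return func(*args, **kwargs)

    wrapper.detected = None
    return wrapper
```

A real decorator would use the detected label to pick the matching instrumentation; this sketch only stores it so the detection step can be inspected.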
You can also use framework-specific decorators for explicit control: @langgraph_agent, @crew_kickoff, @llamaindex_agent, @adk_agent, @strands_agent.
Key Features
| Feature | Description |
|---|---|
| One decorator per framework | @framework_agent (auto-detect), @langgraph_agent, @crew_kickoff, @llamaindex_agent, @adk_agent, @strands_agent |
| 8 provider adapters | OpenAI, Azure OpenAI, Anthropic, Bedrock, Gemini, Cohere, Mistral, Groq — auto-detected from client module paths |
| Two-phase enrichment | Model/provider metadata captured from function args before the call, refined from response after. Survives API failures |
| MCP distributed tracing | wrap(session) propagates trace context across MCP tool boundaries — same trace_id links client and server |
| Cost observability | Per-model USD cost tracking with PricingRegistry, pricing profiles, cost histograms |
| Streaming TTFT | Time-To-First-Token measurement on streaming LLM calls |
| Guardrail tracking | Automatic AWS Bedrock guardrail violation metrics |
| Error normalisation | Exceptions mapped to 6 fixed categories: timeout, rate_limit, validation_error, provider_error, internal_error, unknown |
| Self-hosted collector | FastAPI server you own. Prometheus /metrics, in-memory trace store, OTLP export to Tempo/Jaeger |
| SRE budgets & burn rates | Error and cost budget tracking via Prometheus recording rules — SLO status, burn rates, days-to-exhaustion, all config-driven |
| 7 Grafana dashboards | LLM Performance, Agent-Tool, Cost-TTFT, Evaluation, Guardrail, SRE Budgets, System Health |
| Generic wrap() | Instrument any object (Redis, databases, MCP sessions) without decorator access |
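The error normalisation row above names six fixed categories. A minimal sketch of such a mapping follows; the category names come from the table, but the matching rules (class-name substrings, a `status_code` attribute, built-in exception types) are assumptions chosen for illustration, not Rastir's actual logic.

```python
CATEGORIES = (
    "timeout", "rate_limit", "validation_error",
    "provider_error", "internal_error", "unknown",
)


def normalise_error(exc: BaseException) -> str:
    """Map an exception to one of the six categories using assumed heuristics."""
    name = type(exc).__name__.lower()
    status = getattr(exc, "status_code", None)  # assumed attribute on SDK errors

    if isinstance(exc, TimeoutError) or "timeout" in name:
        return "timeout"
    if status == 429 or "ratelimit" in name:
        return "rate_limit"
    if isinstance(exc, (ValueError, TypeError)) or status in (400, 422):
        return "validation_error"
    if isinstance(status, int) and status >= 500:
        return "provider_error"
    if isinstance(exc, (RuntimeError, AssertionError)):
        return "internal_error"
    return "unknown"
```

Keeping the label set fixed bounds the cardinality of the `error_type` label in Prometheus, whatever exception classes a provider SDK raises.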
Framework Support at a Glance
All five frameworks work with @framework_agent (auto-detects the framework) or the dedicated decorator:
| | LangGraph | CrewAI | LlamaIndex | ADK | Strands |
|---|---|---|---|---|---|
| Decorator | @langgraph_agent | @crew_kickoff | @llamaindex_agent | @adk_agent | @strands_agent |
| Agent span | Automatic | Automatic | Automatic | Automatic | Automatic |
| LLM tracing | Auto-discovered | Auto-discovered | Auto-discovered | Auto-discovered | Auto-discovered |
| Tool tracing | Auto-discovered | Auto-discovered | Auto-discovered | Auto-discovered | Auto-discovered |
| Node tracing | Automatic (all nodes) | N/A | N/A | N/A | N/A |
| MCP tools | Pass as normal tools | Native via mcps=[] | MCP tools auto-wrapped | Auto-discovered | Auto-discovered |
| User code | 1 decorator | 1 decorator | 1 decorator | 1 decorator | 1 decorator |
LangGraph
@langgraph_agent(agent_name="react_agent")
def run(graph, query):
    return graph.invoke({"messages": [("user", query)]})
react_agent (AGENT)
├── node:agent (TRACE) ← every graph node traced
│ └── langgraph.llm.gpt-4o.invoke (LLM)
├── node:tools (TRACE)
│ └── langgraph.tool.search.invoke (TOOL)
└── node:agent (TRACE)
    └── langgraph.llm.gpt-4o.invoke (LLM)
CrewAI
@crew_kickoff(agent_name="research_crew")
def run(crew):
    return crew.kickoff()
research_crew (AGENT)
├── crewai.Researcher.llm.call (LLM) — model, provider, tokens, cost
├── crewai.Researcher.tool.search (TOOL) — tool.input, tool.output
│ └── mcpserver:search (TOOL) ← server span via traceparent
├── crewai.Researcher.llm.call (LLM)
└── crewai.Writer.llm.call (LLM)
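The server span in the tree above is linked to the client through a `traceparent` value. As a sketch of what that header carries, the helpers below build and parse the W3C Trace Context format (`version-trace_id-parent_id-flags`). The function names are hypothetical, and how Rastir attaches the header to an MCP call is not shown here.

```python
import re
import secrets

# W3C Trace Context, version 00: 32 hex trace id, 16 hex parent span id, 2 hex flags.
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_trace_id() -> str:
    return secrets.token_hex(16)  # 16 random bytes -> 32 hex characters


def new_span_id() -> str:
    return secrets.token_hex(8)  # 8 random bytes -> 16 hex characters


def make_traceparent(trace_id: str, span_id: str, sampled: bool = True) -> str:
    """Client side: serialise the current span so the server can continue the trace."""
    return f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"


def parse_traceparent(header: str):
    """Server side: return (trace_id, parent_span_id, sampled) or None if invalid."""
    match = _TRACEPARENT.match(header.strip().lower())
    if not match:
        return None
    trace_id, parent_id, flags = match.groups()
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None  # all-zero ids are invalid per the specification
    return trace_id, parent_id, bool(int(flags, 16) & 1)
```

The server starts its own span with the parsed `trace_id` and uses the parsed span id as the parent, which is what makes client and server spans appear in one trace.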
LlamaIndex
from rastir import llamaindex_agent
from llama_index.core.agent import ReActAgent

agent = ReActAgent(llm=llm, tools=tools, streaming=False)

@llamaindex_agent(agent_name="qa_agent")
async def run(agent, query):
    return await agent.run(query)
qa_agent (AGENT)
├── llamaindex.ReActAgent.llm.achat (LLM) — model, provider, tokens, cost
├── search.acall (TOOL) — tool.input, tool.output
│ └── mcpserver:search (TOOL) ← server span via traceparent
├── llamaindex.ReActAgent.llm.achat (LLM)
└── llamaindex.ReActAgent.llm.achat (LLM)
ADK
from google.genai import types  # Content/Part types used to build the message

@adk_agent(agent_name="weather_agent")
async def run(runner, prompt):
    events = []
    async for event in runner.run_async(
        user_id="u1",
        session_id="s1",
        new_message=types.Content(role="user", parts=[types.Part(text=prompt)]),
    ):
        events.append(event)
    return events
weather_agent (AGENT)
├── LLM gemini-2.0-flash
├── TOOL get_weather
└── LLM gemini-2.0-flash
Strands
@strands_agent(agent_name="research_agent")
def run(agent, prompt):
    return agent(prompt)
research_agent (AGENT)
├── LLM us.anthropic.claude-sonnet-4-20250514
├── TOOL search_tool
└── LLM us.anthropic.claude-sonnet-4-20250514
→ Detailed framework documentation: LangGraph · CrewAI · LlamaIndex · ADK · Strands
Supported Providers
| Provider | Auto-detection | Tokens | Model | Streaming | Request-phase |
|---|---|---|---|---|---|
| OpenAI | ✅ | ✅ | ✅ | ✅ | ✅ |
| Azure OpenAI | ✅ | ✅ | ✅ | ✅ | ✅ |
| Anthropic | ✅ | ✅ | ✅ | ✅ | ✅ |
| AWS Bedrock | ✅ | ✅ | ✅ | ✅ | ✅ |
| Google Gemini | ✅ | ✅ | ✅ | ✅ | ✅ |
| Cohere | ✅ | ✅ | ✅ | — | ✅ |
| Mistral | ✅ | ✅ | ✅ | ✅ | ✅ |
| Groq | ✅ | ✅ | ✅ | ✅ | ✅ |
Providers are auto-detected from LLM client module paths — no configuration needed. Each provider adapter extracts model name, token counts, and cost from the provider's native response format.
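The "Request-phase" column refers to the two-phase enrichment listed under Key Features: metadata is read from the call arguments first and refined from the response afterwards. A minimal sketch of that pattern, assuming a dict-shaped response with `model` and `usage` keys (a simplification, not any provider's real response type) and a hypothetical `traced_llm_call` helper:

```python
def traced_llm_call(call, spans, **kwargs):
    """Run `call(**kwargs)` and append one span dict to `spans`, even on failure."""
    # Phase 1: capture what is known from the request before calling the provider.
    span = {
        "model": kwargs.get("model"),
        "tokens_input": None,
        "tokens_output": None,
        "error": None,
    }
    try:
        response = call(**kwargs)
        # Phase 2: refine from the response (assumed dict shape).
        span["model"] = response.get("model") or span["model"]
        usage = response.get("usage") or {}
        span["tokens_input"] = usage.get("prompt_tokens")
        span["tokens_output"] = usage.get("completion_tokens")
        return response
    except Exception as exc:
        span["error"] = type(exc).__name__
        raise  # the caller still sees the original exception
    finally:
        spans.append(span)  # the request-phase model survives an API failure
```

Because the model name is recorded before the call, a span for a timed-out or rate-limited request still carries a `model` label.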
Installation
pip install rastir # Client library
pip install rastir[server] # + Collector server
pip install rastir[all] # Everything
Quick Start
from rastir import configure, agent, llm, trace

configure(service="my-app", push_url="http://localhost:8080")

@agent(agent_name="qa_bot")
def answer(query):
    return ask_llm(search(query))

@trace
def search(query):
    return vector_db.search(query)

@llm
def ask_llm(context):
    return openai.chat.completions.create(model="gpt-4o", messages=[...])
Start the collector:
rastir-server # Prometheus metrics at :8080/metrics
What You Get in Prometheus
rastir_llm_calls_total{model="gpt-4o", provider="openai", agent="qa_bot"} 150
rastir_tokens_input_total{model="gpt-4o"} 25000
rastir_tokens_output_total{model="gpt-4o"} 8500
rastir_duration_seconds_bucket{span_type="llm", le="1.0"} 120
rastir_errors_total{span_type="llm", error_type="rate_limit"} 3
rastir_cost_total{model="gpt-4o", pricing_profile="prod"} 12.50
rastir_ttft_seconds_bucket{model="gpt-4o", le="0.5"} 95
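The `rastir_ttft_seconds` histogram above records time to first token on streaming calls. The sketch below shows one way such a measurement can be taken by wrapping a stream, and how a single observation lands in cumulative `le` buckets. `measure_ttft`, `cumulative_buckets`, and the bucket bounds are hypothetical and only illustrate the idea.

```python
import time


def measure_ttft(stream, on_first_token, clock=time.perf_counter):
    """Yield chunks unchanged; report the delay before the first chunk once.

    Timing starts when the consumer begins iterating, because a generator
    body does not run until its first `next()` call.
    """
    start = clock()
    first = True
    for chunk in stream:
        if first:
            on_first_token(clock() - start)
            first = False
        yield chunk


def cumulative_buckets(value, bounds=(0.1, 0.5, 1.0)):
    """Return the `le` bounds a value counts towards (Prometheus buckets are cumulative)."""
    return [bound for bound in bounds if value <= bound]
```

An observation of 0.25 seconds increments both the `le="0.5"` and `le="1.0"` buckets, which is why bucket counts never decrease as `le` grows.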
Architecture
Your Application Rastir Collector
┌────────────────────────────────┐ ┌────────────────────────────┐
│ @framework_agent (auto-detect)│ HTTP │ FastAPI │
│ @langgraph_agent / @adk_agent │ ──────▸ │ ├── Prometheus /metrics │
│ @crew_kickoff / @strands_agent│ spans │ ├── Trace store /v1/traces│
│ @llamaindex_agent │ │ ├── Sampling & backpressure│
│ @agent / @llm / wrap(obj) │ │ └── OTLP → Tempo/Jaeger │
└────────────────────────────────┘ └────────────────────────────┘
Deployment
Rastir ships with ready-to-use deployment for local dev, 3 clouds (AWS, Azure, GCP), and Kubernetes:
| Target | Tool | Command |
|---|---|---|
| Local | Docker Compose | cd deploy/docker && ./deploy.sh |
| AWS | Terraform (ECS Fargate) | cd deploy/terraform/aws && ./deploy.sh |
| Azure | Terraform (ACI) | cd deploy/terraform/azure && ./deploy.sh |
| GCP | Terraform (Cloud Run) | cd deploy/terraform/gcp && ./deploy.sh |
| Kubernetes | Helm | cd deploy/k8s && ./deploy.sh |
Each deployment includes the full stack: Rastir Server + OTel Collector + Prometheus + Grafana. Traces go to Tempo (local/k8s), X-Ray (AWS), Application Insights (Azure), or Cloud Trace (GCP).
See Deployment Guide for details.
Documentation
Full documentation at skamalj.github.io/rastir:
| Section | Pages |
|---|---|
| Getting Started | Installation & Quick Start |
| Core | Decorators · Adapters · wrap() & MCP · MCP Tracing |
| Frameworks | LangGraph · CrewAI · LlamaIndex · ADK · Strands |
| Operations | Metrics · Dashboards · Server · Configuration |
| Deployment | Docker Compose · AWS · Azure · GCP · Kubernetes |
| Reference | Architecture · Environment Variables · Contributing Adapters |
License
MIT — see LICENSE for details.