LLM & Agent Observability — structured tracing, Prometheus metrics, and OpenTelemetry export via Python decorators

Rastir

LLM & Agent Observability for Python
One decorator per framework. Full visibility. No monkey-patching.


What is Rastir?

Rastir gives you production observability for LLM agents — token usage, latency percentiles, cost tracking, tool call rates, error categories — as Prometheus metrics with Grafana dashboards.

Add one decorator to your LangGraph, CrewAI, or LlamaIndex workflow. Rastir auto-discovers LLMs, tools, and graph nodes inside the framework and wraps them for per-call tracing. No code rewrites. No vendor lock-in.

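from langgraph.prebuilt import create_react_agent
from rastir import configure, langgraph_agent

configure(service="my-app", push_url="http://localhost:8080")

@langgraph_agent
def run(query):
    # model and tools are your own chat model and tool list, defined elsewhere
    graph = create_react_agent(model, tools)
    return graph.invoke({"messages": [("user", query)]})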

That's it. Every LLM call, tool invocation, and node execution inside the graph is now traced, and the resulting metrics flow to Prometheus.


Key Features

| Feature | Description |
|---|---|
| One decorator per framework | @langgraph_agent, @crew_kickoff, @llamaindex_agent: auto-discovers and wraps everything inside |
| 15 provider adapters | OpenAI, Azure, Anthropic, Bedrock, Gemini, Cohere, Mistral, Groq, LangChain, LangGraph, LlamaIndex, CrewAI; auto-detected |
| Two-phase enrichment | Model/provider metadata captured from function args before the call, refined from response after. Survives API failures |
| MCP distributed tracing | wrap(session) propagates trace context across MCP tool boundaries; the same trace_id links client and server |
| Cost observability | Per-model USD cost tracking with PricingRegistry, pricing profiles, cost histograms |
| Streaming TTFT | Time-To-First-Token measurement on streaming LLM calls |
| Guardrail tracking | Automatic AWS Bedrock guardrail violation metrics |
| Error normalisation | Exceptions mapped to 6 fixed categories: timeout, rate_limit, validation_error, provider_error, internal_error, unknown |
| Self-hosted collector | FastAPI server you own. Prometheus /metrics, in-memory trace store, OTLP export to Tempo/Jaeger |
| SRE budgets & burn rates | Error and cost budget tracking via Prometheus recording rules: SLO status, burn rates, days-to-exhaustion, all config-driven |
| 7 Grafana dashboards | LLM Performance, Agent-Tool, Cost-TTFT, Evaluation, Guardrail, SRE Budgets, System Health |
| Generic wrap() | Instrument any object (Redis, databases, MCP sessions) without decorator access |
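
To make the two-phase enrichment and error normalisation rows more concrete, here is a minimal sketch of how such a decorator could behave. It is not Rastir's implementation: `categorize`, `traced_llm`, `RateLimited`, and `fake_llm` are hypothetical names, and the mapping rules are assumptions. Only the six category names come from the table above.

```python
import functools


def categorize(exc: BaseException) -> str:
    """Map an exception to one of the six categories (assumed rules)."""
    if isinstance(exc, TimeoutError):
        return "timeout"
    if getattr(exc, "status_code", None) == 429:
        return "rate_limit"
    if isinstance(exc, (ValueError, TypeError)):
        return "validation_error"
    if getattr(exc, "status_code", None) is not None:
        return "provider_error"
    if isinstance(exc, RuntimeError):
        return "internal_error"
    return "unknown"


def traced_llm(record):
    """Hypothetical decorator factory; `record` receives one dict per call."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # Phase 1: read metadata from the keyword arguments before the call.
            span = {"model": kwargs.get("model", "unknown"), "error_type": None}
            try:
                response = fn(*args, **kwargs)
            except Exception as exc:
                # No response exists, but the request-phase model is kept.
                span["error_type"] = categorize(exc)
                record(span)
                raise
            # Phase 2: refine from the response when it names a model.
            if isinstance(response, dict) and "model" in response:
                span["model"] = response["model"]
            record(span)
            return response
        return wrapper
    return decorator


class RateLimited(Exception):
    """Stand-in for a provider error that carries an HTTP status code."""
    status_code = 429


spans = []


@traced_llm(spans.append)
def fake_llm(prompt, *, model, fail=False):
    if fail:
        raise RateLimited("slow down")
    return {"model": model + "-2024-08-06", "text": "ok"}


fake_llm("hi", model="gpt-4o")
try:
    fake_llm("hi", model="gpt-4o", fail=True)
except RateLimited:
    pass
print(spans)
```

The first recorded span carries the model name refined from the response, while the failed call still reports the model taken from the arguments together with a normalised error category.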

Framework Support at a Glance

| | LangGraph | CrewAI | LlamaIndex |
|---|---|---|---|
| Decorator | @langgraph_agent | @crew_kickoff | @llamaindex_agent |
| Agent span | Automatic | Automatic | Automatic |
| LLM tracing | Auto-discovered | Auto-discovered | wrap(llm) |
| Tool tracing | Auto-discovered | Auto-discovered | wrap(tool) |
| Node tracing | Automatic (all nodes) | N/A | N/A |
| MCP tools | Pass as normal tools | Native via mcps=[] on agents | wrap() on McpToolSpec tools |
| User code | 1 decorator | 1 decorator | 1 decorator + wrap() calls |
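
The wrap() entries above instrument an existing object instead of decorating a function. One common way to do this in Python is a proxy that forwards attribute access and times each method call. The sketch below illustrates that general idea only; `TracingProxy`, `FakeCache`, and the span dictionary layout are hypothetical and are not taken from Rastir.

```python
import time


class TracingProxy:
    """Hypothetical stand-in for a wrap()-style proxy; not Rastir's class."""

    def __init__(self, target, name, sink):
        self._target = target
        self._name = name
        self._sink = sink  # list that collects one dict per method call

    def __getattr__(self, attr):
        # Only invoked for attributes that are not found on the proxy itself.
        value = getattr(self._target, attr)
        if not callable(value):
            return value

        def traced(*args, **kwargs):
            start = time.perf_counter()
            status = "ok"
            try:
                return value(*args, **kwargs)
            except Exception:
                status = "error"
                raise
            finally:
                # Record the span whether the call succeeded or failed.
                self._sink.append({
                    "span": f"{self._name}.{attr}",
                    "status": status,
                    "duration_s": time.perf_counter() - start,
                })

        return traced


class FakeCache:
    """Tiny in-memory object used only to exercise the proxy."""

    def __init__(self):
        self.data = {}

    def set(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data[key]


calls = []
cache = TracingProxy(FakeCache(), "cache", calls)
cache.set("a", 1)
cache.get("a")
print([c["span"] for c in calls])
```

The caller keeps using the object as before, which is why this style suits clients you cannot decorate at definition time.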

LangGraph

from rastir import langgraph_agent

@langgraph_agent(agent_name="react_agent")
def run(graph, query):
    return graph.invoke({"messages": [("user", query)]})

Resulting span tree:

react_agent (AGENT)
  ├── node:agent (TRACE)       ← every graph node traced
  │   └── langgraph.llm.gpt-4o.invoke (LLM)
  ├── node:tools (TRACE)
  │   └── langgraph.tool.search.invoke (TOOL)
  └── node:agent (TRACE)
      └── langgraph.llm.gpt-4o.invoke (LLM)

CrewAI

from rastir import crew_kickoff

@crew_kickoff(agent_name="research_crew")
def run(crew):
    return crew.kickoff()

Resulting span tree:

research_crew (AGENT)
  ├── crewai.Researcher.llm.call (LLM) — model, provider, tokens, cost
  ├── crewai.Researcher.tool.search (TOOL) — tool.input, tool.output
  │   └── mcpserver:search (TOOL)       ← server span via traceparent
  ├── crewai.Researcher.llm.call (LLM)
  └── crewai.Writer.llm.call (LLM)

LlamaIndex

from rastir import llamaindex_agent
from llama_index.core.agent import ReActAgent

# llm and tools are your own LlamaIndex LLM and tool list, defined elsewhere
agent = ReActAgent(llm=llm, tools=tools, streaming=False)

@llamaindex_agent(agent_name="qa_agent")
async def run(agent, query):
    return await agent.run(query)

Resulting span tree:

qa_agent (AGENT)
├── llamaindex.ReActAgent.llm.achat (LLM) — model, provider, tokens, cost
├── search.acall (TOOL)                   — tool.input, tool.output
│   └── mcpserver:search (TOOL)           ← server span via traceparent
├── llamaindex.ReActAgent.llm.achat (LLM)
└── llamaindex.ReActAgent.llm.achat (LLM)

Detailed framework documentation: LangGraph · CrewAI · LlamaIndex
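
The `mcpserver:search` spans in the trees above are linked to their client spans "via traceparent". Assuming this refers to the W3C Trace Context `traceparent` header (version `00`, a 32-hex trace ID, a 16-hex parent span ID, and 2-hex flags), the linkage can be sketched as follows. The helper names and the span dictionary are illustrative, not Rastir's API.

```python
import re
import secrets

# W3C Trace Context, version 00: 00-<trace-id>-<parent-id>-<flags>
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_span_id() -> str:
    return secrets.token_hex(8)  # 8 random bytes, 16 hex characters


def make_traceparent(trace_id: str, span_id: str, sampled: bool = True) -> str:
    return f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"


def parse_traceparent(header: str):
    """Return the parsed fields, or None if the header is malformed."""
    match = _TRACEPARENT.match(header.strip().lower())
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    # All-zero IDs are invalid according to the specification.
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return {
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 1),
    }


# Client side: the tool span builds the header and sends it with the request.
trace_id = secrets.token_hex(16)
client_span_id = new_span_id()
header = make_traceparent(trace_id, client_span_id)

# Server side: parse the header and start a child span in the same trace.
ctx = parse_traceparent(header)
server_span = {
    "name": "mcpserver:search",
    "trace_id": ctx["trace_id"],
    "parent_id": ctx["parent_id"],
    "span_id": new_span_id(),
}
print(server_span["trace_id"] == trace_id)
```

Because the server reuses the incoming trace ID and records the client span as its parent, both spans end up in one tree.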


Supported Providers

Provider Auto-detection Tokens Model Streaming Request-phase
OpenAI
Azure OpenAI
Anthropic
AWS Bedrock
Google Gemini
Cohere
Mistral
Groq
LangChain
LangGraph
LlamaIndex
CrewAI

Installation

pip install rastir                # Client library
pip install "rastir[server]"      # + Collector server
pip install "rastir[all]"         # Everything

Quick Start

import openai
from rastir import configure, agent, llm, trace

configure(service="my-app", push_url="http://localhost:8080")

@agent(agent_name="qa_bot")
def answer(query):
    return ask_llm(search(query))

@trace
def search(query):
    # vector_db is a placeholder for your own retrieval client
    return vector_db.search(query)

@llm
def ask_llm(context):
    return openai.chat.completions.create(model="gpt-4o", messages=[...])

Start the collector:

rastir-server   # Prometheus metrics at :8080/metrics

What You Get in Prometheus

rastir_llm_calls_total{model="gpt-4o", provider="openai", agent="qa_bot"} 150
rastir_tokens_input_total{model="gpt-4o"} 25000
rastir_tokens_output_total{model="gpt-4o"} 8500
rastir_duration_seconds_bucket{span_type="llm", le="1.0"} 120
rastir_errors_total{span_type="llm", error_type="rate_limit"} 3
rastir_cost_total{model="gpt-4o", pricing_profile="prod"} 12.50
rastir_ttft_seconds_bucket{model="gpt-4o", le="0.5"} 95
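
As a rough illustration of what these counters allow, the sketch below parses a few of the sample lines above and derives per-call averages. It assumes all counters describe the same set of calls, which the sample does not guarantee, and the parser is deliberately simplistic (no escaped quotes, no comments). In practice such ratios would more likely be computed in Prometheus or Grafana.

```python
import re

SAMPLE = """\
rastir_llm_calls_total{model="gpt-4o", provider="openai", agent="qa_bot"} 150
rastir_tokens_input_total{model="gpt-4o"} 25000
rastir_tokens_output_total{model="gpt-4o"} 8500
rastir_errors_total{span_type="llm", error_type="rate_limit"} 3
rastir_cost_total{model="gpt-4o", pricing_profile="prod"} 12.50
"""

_LINE = re.compile(r"^(\w+)\{(.*)\}\s+(\S+)$")
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse(text):
    """Return a list of (metric_name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        match = _LINE.match(line.strip())
        if match:
            name, labels, value = match.groups()
            samples.append((name, dict(_LABEL.findall(labels)), float(value)))
    return samples


def value_of(samples, name):
    """Sum every sample of one metric across its label sets."""
    return sum(value for metric, _, value in samples if metric == name)


samples = parse(SAMPLE)
calls = value_of(samples, "rastir_llm_calls_total")
summary = {
    "avg_input_tokens_per_call": value_of(samples, "rastir_tokens_input_total") / calls,
    "avg_cost_usd_per_call": value_of(samples, "rastir_cost_total") / calls,
    "rate_limit_error_ratio": value_of(samples, "rastir_errors_total") / calls,
}
print(summary)
```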

Architecture

Your Application                             Rastir Collector
┌────────────────────────────────┐           ┌──────────────────────────────┐
│  @langgraph_agent              │   HTTP    │  FastAPI                     │
│  @crew_kickoff                 │  ──────▸  │  ├── Prometheus /metrics     │
│  @llamaindex_agent             │   spans   │  ├── Trace store /v1/traces  │
│  @agent / @llm                 │           │  ├── Sampling & backpressure │
│  wrap(obj)                     │           │  └── OTLP → Tempo/Jaeger     │
└────────────────────────────────┘           └──────────────────────────────┘
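
The diagram shows spans travelling from the application to the collector over HTTP. One way a client could do this without blocking application code is a bounded queue drained by a background thread. The sketch below is an assumption about how such an exporter might look, not a description of Rastir's internals: `SpanExporter` and its parameters are hypothetical, and `send` stands in for the HTTP POST so the example runs without a network.

```python
import queue
import threading


class SpanExporter:
    """Hypothetical client-side exporter; `send` stands in for an HTTP POST."""

    def __init__(self, send, max_queue=1000, batch_size=50):
        self._send = send
        self._queue = queue.Queue(maxsize=max_queue)
        self._batch_size = batch_size
        self.dropped = 0
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, span) -> bool:
        """Enqueue without blocking; drop the span if the queue is full."""
        try:
            self._queue.put_nowait(span)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def _drain(self):
        batch = []
        while len(batch) < self._batch_size:
            try:
                batch.append(self._queue.get_nowait())
            except queue.Empty:
                break
        return batch

    def _run(self):
        while not self._stop.is_set():
            batch = self._drain()
            if batch:
                self._send(batch)
            else:
                self._stop.wait(0.05)  # idle briefly when nothing is queued
        # Flush whatever is still queued after close() was requested.
        while True:
            batch = self._drain()
            if not batch:
                break
            self._send(batch)

    def close(self):
        self._stop.set()
        self._thread.join()


batches = []
exporter = SpanExporter(send=batches.append, max_queue=1000, batch_size=50)
for i in range(120):
    exporter.submit({"span_id": i})
exporter.close()
print(sum(len(b) for b in batches), "spans sent,", exporter.dropped, "dropped")
```

Dropping on a full queue trades completeness for bounded memory and a non-blocking caller; a real exporter would also need retry and timeout handling around the send.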

Documentation

Full documentation at skamalj.github.io/rastir:

| Section | Pages |
|---|---|
| Getting Started | Installation & Quick Start |
| Core | Decorators · Adapters · wrap() & MCP · MCP Tracing |
| Frameworks | LangGraph · CrewAI · LlamaIndex |
| Operations | Metrics · Dashboards · Server · Configuration |
| Reference | Architecture · Environment Variables · Contributing Adapters |

License

MIT — see LICENSE for details.
