LLM & Agent Observability — structured tracing, Prometheus metrics, and OpenTelemetry export via Python decorators
Rastir
LLM & Agent Observability for Python
Structured tracing and Prometheus metrics via decorators — no monkey-patching, no vendor lock-in.
Why Rastir?
Most LLM observability tools require SDK wrappers, monkey-patching, or vendor-specific clients. Rastir takes a different approach:
- Decorators, not wrappers: add @llm, @agent, @tool to your existing functions. No code rewrites.
- Adapters, not monkey-patches: Rastir inspects return values to extract model, tokens, and provider metadata. Works with any SDK version.
- Self-hosted collector: a lightweight FastAPI server you own. Prometheus metrics out of the box, OTLP export to Tempo/Jaeger if you want it.
- Zero external infrastructure: no database, no Redis, no Kafka. The collector is stateless and runs in a single container.
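To make the "decorators plus return-value inspection" idea concrete, here is a minimal sketch. It is not Rastir's implementation: `llm_sketch`, `extract_usage`, and the in-memory `SPANS` list are hypothetical names, and the fake response only imitates the `model` and `usage` attributes of an OpenAI-style chat completion.

```python
import functools
import time
from types import SimpleNamespace
from typing import Any, Callable

# Hypothetical in-memory sink; a real client would push span batches over HTTP.
SPANS: list[dict[str, Any]] = []


def extract_usage(result: Any) -> dict[str, Any]:
    """Read model and token counts off a response object, if present."""
    usage = getattr(result, "usage", None)
    return {
        "model": getattr(result, "model", None),
        "tokens_in": getattr(usage, "prompt_tokens", None),
        "tokens_out": getattr(usage, "completion_tokens", None),
    }


def llm_sketch(func: Callable[..., Any]) -> Callable[..., Any]:
    """Illustrative stand-in for an @llm-style decorator (assumption, not Rastir code)."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        start = time.perf_counter()
        result = func(*args, **kwargs)  # the wrapped function runs unchanged
        span = {
            "name": func.__name__,
            "span_type": "llm",
            "duration_s": time.perf_counter() - start,
        }
        span.update(extract_usage(result))  # metadata comes from the return value
        SPANS.append(span)
        return result

    return wrapper


@llm_sketch
def ask_llm(prompt: str) -> SimpleNamespace:
    # Fake response shaped like a chat completion: .model and .usage.*_tokens
    return SimpleNamespace(
        model="gpt-4o",
        usage=SimpleNamespace(prompt_tokens=150, completion_tokens=85),
    )


ask_llm("hello")
```

Because the decorator only reads attributes from whatever the function returns, the SDK client itself is never patched or replaced.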
    Your Python App                             Rastir Collector
┌──────────────────────┐    HTTP POST     ┌──────────────────────────┐
│  @agent              │ ───────────────▸ │  FastAPI ingestion       │
│  @llm (OpenAI)       │   span batches   │  ├─ Prometheus /metrics  │
│  @tool (search)      │                  │  ├─ Trace store /traces  │
│  @retrieval (RAG)    │                  │  └─ OTLP → Tempo/Jaeger  │
└──────────────────────┘                  └──────────────────────────┘
       decorators                               collector server
Supported Providers
| Provider | Auto-detection | Tokens | Model | Streaming |
|---|---|---|---|---|
| OpenAI | ✅ | ✅ | ✅ | ✅ |
| Anthropic | ✅ | ✅ | ✅ | ✅ |
| AWS Bedrock | ✅ | ✅ | ✅ | ✅ |
| LangChain | ✅ | ✅ | ✅ | ✅ |
| LangGraph | ✅ | ✅ | ✅ | ✅ |
Adapters are priority-ordered and composable: LangGraph → LangChain → OpenAI resolution happens automatically.
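One way to picture priority-ordered, composable resolution is the sketch below. Everything in it is an assumption made for illustration: the `Adapter` dataclass, the priority numbers, and the plain-dict stand-ins for LangGraph state, LangChain messages, and OpenAI responses. Rastir's real adapter interface may look quite different.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class Adapter:
    name: str
    priority: int                           # higher priority is tried first
    can_handle: Callable[[Any], bool]
    unwrap: Callable[[Any], Optional[Any]]  # None means "provider payload reached"


def _has(key: str) -> Callable[[Any], bool]:
    """Build a detector that matches dicts containing the given key."""
    return lambda obj: isinstance(obj, dict) and key in obj


# Registered in arbitrary order, then sorted once by priority.
ADAPTERS = sorted(
    [
        Adapter("openai", 10, _has("usage"), lambda r: None),
        Adapter("langgraph", 30, _has("messages"), lambda r: r["messages"][-1]),
        Adapter("langchain", 20, _has("response_metadata"),
                lambda r: r["response_metadata"]),
    ],
    key=lambda a: a.priority,
    reverse=True,
)


def resolve(result: Any) -> tuple[list[str], Optional[Any]]:
    """Peel framework layers until a provider-level payload is found."""
    chain: list[str] = []
    current = result
    while True:
        adapter = next((a for a in ADAPTERS if a.can_handle(current)), None)
        if adapter is None:
            return chain, None          # nothing recognised this object
        chain.append(adapter.name)
        inner = adapter.unwrap(current)
        if inner is None:
            return chain, current       # terminal adapter: this is the payload
        current = inner


state = {"messages": [{"response_metadata": {
    "model": "gpt-4o-mini",
    "usage": {"prompt_tokens": 12, "completion_tokens": 5},
}}]}
chain, payload = resolve(state)
```

Here `chain` records which adapters fired, in order, and `payload` is the innermost object from which model and token counts can be read.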
Installation
pip install rastir              # Client library (decorators + HTTP push)
pip install "rastir[server]"    # + Collector server (FastAPI, Prometheus, OTLP)
pip install "rastir[all]"       # Everything including dev tools
Quick Start
1. Instrument your code (3 lines to add)
from rastir import configure, agent, llm, tool, retrieval

configure(
    service="my-app",
    push_url="http://localhost:8080/v1/telemetry",
)

@agent(agent_name="research_agent")
def run_research(query: str) -> str:
    context = fetch_docs(query)
    return ask_llm(query, context)

@retrieval
def fetch_docs(query: str) -> list[str]:
    return vector_db.search(query)  # auto-tracked

@llm(model="gpt-4o", provider="openai")
def ask_llm(query: str, context: list[str]) -> str:
    return openai.chat(messages=[...])  # tokens & model extracted automatically
2. Start the collector
rastir-server # default: 0.0.0.0:8080
# or
docker run -p 8080:8080 rastir-server
3. Query metrics
curl http://localhost:8080/metrics # Prometheus format
curl http://localhost:8080/v1/traces # JSON trace store
That's it. Prometheus scrapes /metrics, you build Grafana dashboards, and optionally forward spans to Tempo or Jaeger via OTLP.
What you get in Prometheus
# Token usage by model
rastir_tokens_input_total{model="gpt-4o",provider="openai",agent="research_agent"} 1250
rastir_tokens_output_total{model="gpt-4o",provider="openai",agent="research_agent"} 380
# Latency percentiles
rastir_duration_seconds_bucket{span_type="llm",le="0.5"} 12
rastir_duration_seconds_bucket{span_type="llm",le="1.0"} 45
# Tool & retrieval call rates
rastir_tool_calls_total{tool_name="web_search",agent="research_agent"} 89
rastir_retrieval_calls_total{agent="research_agent"} 156
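The `_bucket` series are cumulative counts, so percentiles are estimated rather than stored. The sketch below shows the usual linear-interpolation estimate over cumulative buckets, the same idea PromQL's `histogram_quantile` is built on. The function name is hypothetical, and the `+Inf` bucket count of 50 is an assumed value added to the two sample buckets shown above.

```python
import math


def estimate_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate the q-quantile from (upper_bound, cumulative_count) pairs.

    Buckets must be sorted by upper bound and end with the +Inf bucket.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound) or count == prev_count:
                # Rank falls past the last finite bucket (or in an empty one):
                # the best available answer is the previous upper bound.
                return prev_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# le="0.5" -> 12 and le="1.0" -> 45 come from the sample; +Inf -> 50 is assumed.
llm_buckets = [(0.5, 12.0), (1.0, 45.0), (math.inf, 50.0)]
p50 = estimate_quantile(0.50, llm_buckets)
p95 = estimate_quantile(0.95, llm_buckets)
```

With these numbers the median lands inside the 0.5 to 1.0 second bucket, while the 95th percentile falls into the `+Inf` bucket and is therefore capped at 1.0, which is a reminder that bucket boundaries limit how precise the estimate can be.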
Nested Spans
Rastir automatically links parent–child relationships for agent call trees:
@agent(agent_name="supervisor")
def supervisor(task):
    plan = planner(task)  # nested agent
    return executor(plan)

@agent(agent_name="planner")
def planner(task):
    return ask_llm(task)  # nested LLM call

@llm(model="gpt-4o")
def ask_llm(prompt):
    return openai.chat(messages=[...])
supervisor (agent, 3200ms)
├── planner (agent, 1100ms)
│ └── ask_llm (llm, 980ms) → model=gpt-4o, tokens_in=150, tokens_out=85
└── executor (agent, 2000ms)
├── web_search (tool, 450ms)
└── ask_llm (llm, 1200ms) → model=gpt-4o, tokens_in=320, tokens_out=200
Works with LangGraph
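from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent
from rastir import agent

# search and calc are tool functions defined elsewhere in your app
app = create_react_agent(ChatOpenAI(model="gpt-4o-mini"), tools=[search, calc])

@agent(agent_name="react_agent")
def run(query: str):
    return app.invoke({"messages": [HumanMessage(query)]})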
# Rastir auto-detects LangGraph state → LangChain messages → OpenAI response
# Extracts: model, tokens, tool calls, message counts — zero config
Key Metrics at a Glance
| Metric | Type | What it tracks |
|---|---|---|
| rastir_llm_calls_total | Counter | LLM invocations by model, provider, agent |
| rastir_tokens_input_total | Counter | Input token consumption |
| rastir_tokens_output_total | Counter | Output token consumption |
| rastir_duration_seconds | Histogram | Latency with P50/P95/P99 + exemplars |
| rastir_tool_calls_total | Counter | Tool invocations by name and agent |
| rastir_errors_total | Counter | Failures by span type and error type |
| rastir_queue_size | Gauge | Collector backpressure indicator |
Full metrics reference → Server Documentation
Server Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /v1/telemetry | Ingest span batches |
| GET | /metrics | Prometheus exposition |
| GET | /v1/traces | Query trace store |
| GET | /health | Liveness probe |
| GET | /ready | Readiness probe (queue pressure) |
Configuration
Configure via configure() call or environment variables:
configure(
    service="my-app",
    env="production",
    push_url="http://collector:8080/v1/telemetry",
    api_key="secret",
    batch_size=100,
    flush_interval=5,
)
Or set the corresponding environment variables:
export RASTIR_SERVICE=my-app
export RASTIR_ENV=production
export RASTIR_PUSH_URL=http://collector:8080/v1/telemetry
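How explicit arguments and environment variables might be merged is sketched below. Only the three variable names come from the example above. The precedence rule (explicit argument wins over environment) and the fallback defaults `"unknown"` and `"development"` are assumptions for illustration, as are the names `SketchConfig` and `resolve_config`; check the configuration reference for the actual behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SketchConfig:
    service: str
    env: str
    push_url: Optional[str]


def resolve_config(
    service: Optional[str] = None,
    env: Optional[str] = None,
    push_url: Optional[str] = None,
    environ: Mapping[str, str] = os.environ,
) -> SketchConfig:
    """Merge explicit arguments with RASTIR_* variables (assumed precedence)."""

    def pick(explicit: Optional[str], var: str,
             default: Optional[str] = None) -> Optional[str]:
        # Assumption: an explicit argument overrides the environment variable.
        return explicit if explicit is not None else environ.get(var, default)

    return SketchConfig(
        service=pick(service, "RASTIR_SERVICE", "unknown"),   # default is assumed
        env=pick(env, "RASTIR_ENV", "development"),           # default is assumed
        push_url=pick(push_url, "RASTIR_PUSH_URL"),
    )


# Simulated environment so the example does not depend on the real shell.
fake_environ = {"RASTIR_SERVICE": "my-app", "RASTIR_ENV": "production"}
cfg = resolve_config(env="staging", environ=fake_environ)
```

In this run `service` is taken from the simulated environment, `env` from the explicit argument, and `push_url` stays unset because neither source provides it.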
Full configuration reference → Configuration Documentation
Project Structure
src/rastir/
├── __init__.py # Public API: configure, trace, agent, llm, tool, retrieval
├── config.py # GlobalConfig, configure()
├── context.py # Span & agent context (ContextVar-based)
├── decorators.py # All decorator implementations
├── spans.py # SpanRecord data model
├── queue.py # Bounded in-memory span queue
├── transport.py # TelemetryClient + BackgroundExporter
├── adapters/ # Auto-detection for OpenAI, Anthropic, Bedrock, LangChain, LangGraph
└── server/ # FastAPI collector with Prometheus, trace store, OTLP export
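The bounded in-memory queue and background exporter suggest a producer/consumer design in which instrumented code never blocks on telemetry. A minimal sketch of that pattern follows. The class name, the drop-newest policy when the queue is full, and the `dropped` counter are assumptions for illustration, not a description of `queue.py` or `transport.py`.

```python
import queue
from typing import Any


class BoundedSpanQueue:
    """Hypothetical bounded buffer between decorators and an exporter."""

    def __init__(self, maxsize: int) -> None:
        self._q: "queue.Queue[dict[str, Any]]" = queue.Queue(maxsize=maxsize)
        self.dropped = 0

    def put(self, span: dict[str, Any]) -> bool:
        """Enqueue without blocking; drop the span if the buffer is full."""
        try:
            self._q.put_nowait(span)
            return True
        except queue.Full:
            self.dropped += 1  # assumed policy: shed load, never stall the app
            return False

    def drain(self, batch_size: int) -> list[dict[str, Any]]:
        """Take up to batch_size spans, oldest first, for one export request."""
        batch: list[dict[str, Any]] = []
        while len(batch) < batch_size:
            try:
                batch.append(self._q.get_nowait())
            except queue.Empty:
                break
        return batch


spans = BoundedSpanQueue(maxsize=3)
accepted = [spans.put({"id": i}) for i in range(5)]
first_batch = spans.drain(batch_size=2)
second_batch = spans.drain(batch_size=2)
```

An exporter thread would call `drain` every `flush_interval` seconds and send each non-empty batch in a single HTTP POST, which is where settings such as `batch_size` would apply.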
Development
pip install -e ".[all]" # editable install with all extras
pytest # 337 tests (unit + integration)
ruff check src/ tests/ # linting
Documentation
Full documentation at skamalj.github.io/rastir:
- Getting Started: Installation, quick start, nested spans
- Decorators: @trace, @agent, @llm, @tool, @retrieval, @metric
- Adapters: OpenAI, Anthropic, Bedrock, LangChain, LangGraph
- Server: Collector, metrics, histograms, exemplars, OTLP
- Configuration: Client & server config reference
- Contributing Adapters: Write your own adapter
License
MIT — see LICENSE for details.