Skip to main content

Astra Observability - Tracing, metrics, and structured logging for AI agents

Project description

Astra Observability Package

A comprehensive observability solution for the Astra AI platform, providing distributed tracing, metrics collection, and structured logging with automatic trace correlation.

Features

🔍 Distributed Tracing

  • OpenTelemetry-based: Industry-standard distributed tracing
  • Async-first: Designed for async Python applications
  • Decorator support: Easy-to-use @trace_span decorators
  • Context propagation: Automatic trace context across async operations
  • Console export: MVP-ready with console output (easily switchable to OTLP)

📊 Metrics Collection

  • Prometheus-compatible: Standard metrics format
  • Agent performance: Run counts, latencies, success rates
  • Model usage: Token tracking, cost calculation, TTFT metrics
  • Tool execution: Call counts, durations, error rates
  • Cost tracking: Built-in cost calculation for major LLM providers

📝 Structured Logging

  • Loguru-powered: Clean, powerful logging API
  • JSON formatting: Structured logs for easy parsing
  • Trace correlation: Automatic trace/span ID injection
  • Context propagation: Agent, session, request IDs
  • Performance optimized: Async-friendly, non-blocking

Quick Start

Installation

cd packages/observability
pip install -e .

Basic Usage

from observability import init_observability

# Initialize observability
obs = init_observability(
    service_name="astra",
    environment="dev",
    log_level="INFO"
)

# Use in agent code
@obs.trace_agent_run("my-agent")
async def run_agent():
    obs.info("Agent started", agent_id="my-agent")

    # Model call with automatic metrics
    cost = obs.calculate_model_cost("gpt-4", "openai", 100, 50)
    obs.record_model_usage("gpt-4", "openai", 100, 50, cost)

    # Tool call with timing
    with obs.timer(obs.record_tool_call, tool_name="web_search"):
        # Tool execution here
        pass

    obs.info("Agent completed", agent_id="my-agent")

Advanced Usage

from observability import Observability, Tracer, MetricsRecorder, Logger

# Use components separately
tracer = Tracer("astra", "prod")
metrics = MetricsRecorder("astra")
logger = Logger("astra", "prod", log_level="INFO")

# Manual span management
@tracer.trace_span("custom.operation", {"component": "data_processor"})
async def process_data():
    tracer.add_event("processing.started")
    # ... processing logic ...
    tracer.set_attribute("items.processed", 100)

Architecture

Components

  1. Observability: Main facade providing unified access to all observability features
  2. Tracer: OpenTelemetry-based distributed tracing with span management
  3. MetricsRecorder: Prometheus-compatible metrics collection with cost tracking
  4. Logger: Loguru-based structured logging with trace correlation

Integration Points

  • Framework Layer: Agents use @obs.trace_agent_run() decorators
  • Model Clients: Automatic token/cost tracking via obs.record_model_usage()
  • Tool Registry: Tool calls traced with @obs.trace_tool_call()
  • Session Management: Context propagation via session_id, request_id

Configuration

Environment Variables

# Service identification
ASTRA_SERVICE_NAME=astra
ASTRA_ENVIRONMENT=dev

# Logging
ASTRA_LOG_LEVEL=INFO
ASTRA_LOG_FILE=/var/log/astra.json

# Tracing (future OTLP support)
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
OTEL_SERVICE_NAME=astra

Programmatic Configuration

obs = Observability.init(
    service_name="astra",
    environment="prod",
    log_level="WARNING",
    enable_json_logs=True,
    log_file="/var/log/astra.json"
)

Output Examples

Trace Output (Console)

{
  "name": "astra.framework.agent.run",
  "context": {
    "trace_id": "a1b2c3d4e5f6...",
    "span_id": "1a2b3c4d...",
    "parent_id": null
  },
  "start_time": "2024-01-15T10:30:00.123Z",
  "end_time": "2024-01-15T10:30:01.456Z",
  "duration": 1333,
  "status": "OK",
  "attributes": {
    "agent_id": "research-agent",
    "session_id": "sess-123",
    "environment": "dev"
  }
}

Metrics Output (Prometheus)

# HELP astra_agent_runs_total Total number of agent runs
# TYPE astra_agent_runs_total counter
astra_agent_runs_total{agent_id="research-agent",status="success",environment="dev"} 1

# HELP astra_model_cost_usd_total Total cost in USD for model usage
# TYPE astra_model_cost_usd_total counter
astra_model_cost_usd_total{model_name="gpt-4",provider="openai",environment="dev"} 0.0045

Log Output (JSON)

{
  "timestamp": "2024-01-15T10:30:00.123Z",
  "level": "info",
  "message": "Agent execution started",
  "service": "astra",
  "environment": "dev",
  "trace_id": "a1b2c3d4e5f6...",
  "span_id": "1a2b3c4d...",
  "extra": {
    "agent_id": "research-agent",
    "session_id": "sess-123",
    "event_type": "agent_start"
  }
}

Performance Characteristics

  • Trace overhead: < 5ms per span (batched export)
  • Metrics overhead: < 2% CPU (in-memory counters)
  • Log overhead: < 1ms per log (async I/O)
  • Memory usage: ~10MB baseline + ~1KB per active span
  • Async-friendly: Non-blocking I/O operations

Dependencies

  • opentelemetry-api>=1.38.0: Tracing API
  • opentelemetry-sdk>=1.38.0: Tracing implementation
  • prometheus-client>=0.20.0: Metrics collection
  • loguru>=0.7.0: Structured logging

Future Enhancements

  • OTLP Export: Switch from console to OTLP for production
  • Sampling: Configurable trace sampling rates
  • Custom Exporters: ClickHouse, Jaeger, custom backends
  • Dashboards: Grafana dashboards for metrics visualization
  • Alerting: Prometheus alerting rules for error rates/latency

Examples

See example_usage.py for comprehensive usage examples including:

  • Full agent run with tracing, metrics, and logging
  • Manual span management
  • Metrics-only usage
  • Error handling and exception recording

Testing

# Run the example
python example_usage.py

# Install dependencies first
pip install opentelemetry-api opentelemetry-sdk prometheus-client loguru

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

astra_observability-0.1.0.tar.gz (12.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

astra_observability-0.1.0-py3-none-any.whl (15.3 kB view details)

Uploaded Python 3

File details

Details for the file astra_observability-0.1.0.tar.gz.

File metadata

  • Download URL: astra_observability-0.1.0.tar.gz
  • Upload date:
  • Size: 12.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.2

File hashes

Hashes for astra_observability-0.1.0.tar.gz
Algorithm Hash digest
SHA256 3c87dfc622062fa3b998028a50f40f4fd5a0a19d880cc9658fbc414a12fbf710
MD5 16e390fc6cb4d9cebca31f614b9884c1
BLAKE2b-256 0581893a50fd0cb021bb5b326295dbff1f8f3ecaffd658dfad66921dcd4cb155

See more details on using hashes here.

File details

Details for the file astra_observability-0.1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for astra_observability-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 dd502e71a30b95a4ec02cd01151f5a9bd772bcd8407674bb9d5463bba2ca5aa8
MD5 f2f9d2244db825bc0b8e2d30878acab8
BLAKE2b-256 8fe6533f8f01287779cc5a0d89e4387434790e3e509b9f0d94f857ddda36366d

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page