
nirixa

AI Observability & Cost Intelligence — track token costs, latency, and hallucination risk for every LLM call, with zero friction.

pip install nirixa

Quick Start

import nirixa
from openai import OpenAI

nirixa.init(api_key="nirixa-your-key")
client = OpenAI()

ai = nirixa.wrap(client, feature="/api/chat")

response = ai.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
nirixa.flush()

Four Ways to Integrate

1. wrap() — Transparent client proxy (recommended)

Wrap a provider client once and use it exactly like the original. Model, provider, prompt, and request params are auto-extracted from every call.

from nirixa import NirixaClient
from openai import OpenAI

nirixa = NirixaClient(api_key="nirixa-your-key")
openai  = OpenAI()

ai = nirixa.wrap(openai, feature="/api/chat", user=user_id)  # user_id: your app's end-user identifier

response = ai.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}]
)

Works with any supported provider:

import anthropic

claude = nirixa.wrap(anthropic.Anthropic(), feature="/api/analyze")
response = claude.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Summarize this..."}]
)

2. track() — Explicit per-call wrapping

prompt = "Summarize this document..."
response = nirixa.track(
    feature="/api/summarize",
    user="user-123",
    prompt=prompt,
    prompt_version="v2-concise",   # optional: A/B test prompt versions
    fn=lambda: openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}]
    )
)

3. observe() — Decorator style

import nirixa
from openai import OpenAI, AsyncOpenAI

nirixa.init(api_key="nirixa-your-key")
openai = OpenAI()
async_openai = AsyncOpenAI()  # async client, so the call below can be awaited

@nirixa.observe(feature="/api/chat")
def call_llm(messages, model="gpt-4o"):
    return openai.chat.completions.create(model=model, messages=messages)

# Async supported too
@nirixa.observe(feature="/api/chat", prompt_arg="messages", model_arg="model")
async def call_llm_async(messages, model="gpt-4o"):
    return await async_openai.chat.completions.create(model=model, messages=messages)
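
Decorated functions are called as usual; for example:

reply = call_llm([{"role": "user", "content": "Hello!"}])
print(reply.choices[0].message.content)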

4. Auto-patch — Zero code changes

from nirixa import NirixaClient
from nirixa.middleware import patch_openai, patch_all

nirixa = NirixaClient(api_key="nirixa-your-key")

patch_openai(nirixa, feature="/api/chat")  # patch a specific provider
patch_all(nirixa)                          # or patch everything installed
# [nirixa] Patched 4 providers: OpenAI, Anthropic, Groq, Gemini
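
After patching, existing call sites need no changes; a plain client call like the sketch below is tracked automatically:

from openai import OpenAI

client = OpenAI()  # unmodified client; the patch intercepts its calls
client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}]
)
nirixa.flush()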

Module-level API

import nirixa

nirixa.init(api_key="nirixa-your-key")

response = nirixa.track(feature="/api/chat", fn=lambda: openai.chat.completions.create(...))
ai       = nirixa.wrap(openai_client, feature="/api/chat")

nirixa.flush()  # always call before script exit

Agent Tracing

Group multi-step agent runs into a single observable trace — with aggregated cost, token, and latency totals, and a waterfall view in the dashboard.

import asyncio

import nirixa
from openai import OpenAI

nirixa.init(api_key="nirixa-your-key")
client = OpenAI()
ai = nirixa.wrap(client, feature="agent/classify")

async def run_agent():
    async with nirixa.agent("research-agent") as agent:
        with agent.step("classify"):
            r1 = ai.chat.completions.create(
                model="gpt-4o-mini",
                messages=[{"role": "user", "content": "Classify this query: ..."}]
            )

        with agent.step("answer"):
            r2 = ai.chat.completions.create(
                model="gpt-4o",
                messages=[{"role": "user", "content": "Answer based on: ..."}]
            )

        # Track a non-LLM tool call inside the same trace
        # (db is a placeholder for your own data-access handle)
        result = nirixa.tool("db_lookup", lambda: db.find(query=r1.choices[0].message.content))

asyncio.run(run_agent())
nirixa.flush()

Every track() call inside an agent() block automatically inherits the trace_id. The root agent span is written last with aggregated totals. View traces at Dashboard → Agents.

Also works as a decorator:

@nirixa.agent("summarize-pipeline")
def run_pipeline(doc):
    ...
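
For example, a filled-in pipeline might look like this (a sketch reusing the wrapped client ai from above):

@nirixa.agent("summarize-pipeline")
def run_pipeline(doc):
    # Each tracked call made inside the function inherits the pipeline's trace_id
    return ai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Summarize: {doc}"}]
    )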

Prompt Version Tracking

A/B test prompt versions and compare cost, latency, and hallucination score across versions in the dashboard.

# Tag calls with a version string
response = nirixa.track(
    feature="/api/chat",
    prompt_version="v3-concise",
    fn=lambda: openai.chat.completions.create(...)
)
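
One simple way to A/B test is to assign a version at random per request. A minimal sketch; CONCISE_PROMPT and DETAILED_PROMPT are hypothetical placeholders for your own variants:

import random

CONCISE_PROMPT = "Answer in one sentence: {q}"   # hypothetical variant
DETAILED_PROMPT = "Answer step by step: {q}"     # hypothetical variant

version, template = random.choice([
    ("v3-concise", CONCISE_PROMPT),
    ("v3-detailed", DETAILED_PROMPT),
])
prompt = template.format(q="What is observability?")

response = nirixa.track(
    feature="/api/chat",
    prompt=prompt,
    prompt_version=version,
    fn=lambda: openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}]
    )
)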

View version performance at Dashboard → Prompts.


Request Replay

Re-run any logged call locally with a different model to compare cost and output. No credentials are stored; replay runs on your machine using the provider API key already in your environment.

# First make a tracked call — request params are stored automatically
ai = nirixa.wrap(OpenAI(), feature="/api/chat")
ai.chat.completions.create(model="gpt-4o", messages=[...])
nirixa.flush()

# Grab the call_id from the dashboard or logs endpoint, then replay
result = nirixa.replay("call-id-here")
print(result["response_text"])
print(f"Cost delta: ${result['cost_delta']:.6f}")

# Swap to a cheaper model
result = nirixa.replay("call-id-here", model_override="gpt-4o-mini")
print(f"Saved ${-result['cost_delta']:.6f}")

replay() returns:

Key              Type          Description
response_text    str | None    The new response
original_cost    float         Cost of the original call
replay_cost      float         Cost of the replay
cost_delta       float         replay_cost - original_cost (negative = savings)
replay_call_id   str           New call_id logged for this replay

Supported providers for replay: OpenAI, Anthropic, Google Gemini, Groq.
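
Because replay() reports cost_delta, you can sweep several candidate models against one logged call to estimate savings. A sketch (the candidate list is illustrative):

for candidate in ["gpt-4o-mini", "gpt-3.5-turbo"]:
    r = nirixa.replay("call-id-here", model_override=candidate)
    print(f"{candidate}: cost_delta ${r['cost_delta']:.6f}")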


LLM-as-Judge

Get a factual grounding score from a second LLM after every call. Requires capture_response=True.

nirixa = NirixaClient(
    api_key="nirixa-your-key",
    capture_response=True,
    judge_enabled=True,
)

Results appear in the log detail drawer under LLM-as-Judge. The judge model can be changed at Dashboard → Alerts → LLM-as-Judge.


Configuration

nirixa = NirixaClient(
    api_key="nirixa-your-key",          # Required
    host="https://api.nirixa.in",       # Default
    score_hallucinations=True,          # Heuristic hallucination scoring (LOW/MEDIUM/HIGH)
    capture_response=False,             # Store prompt_text + response_text (needed for judge)
    judge_enabled=False,                # Fire LLM-as-Judge after every call (requires capture_response)
    async_ingest=True,                  # Non-blocking — zero added latency
    debug=False,                        # Log each tracked call to console
)

Supported Providers

Provider        Auto-detected via          Patch function
OpenAI          choices + usage            patch_openai
Anthropic       content + usage            patch_anthropic
Groq            OpenAI-compatible shape    patch_groq
Google Gemini   usage_metadata             patch_gemini
Mistral         OpenAI-compatible shape    patch_mistral
Together AI     OpenAI-compatible shape    patch_together
Ollama          prompt_eval_count          patch_ollama
AWS Bedrock     ResponseMetadata
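
Each patch function follows the pattern shown in the auto-patch section; a sketch, assuming patch_anthropic accepts the same optional feature kwarg as patch_openai:

from nirixa.middleware import patch_anthropic

patch_anthropic(nirixa, feature="/api/analyze")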

What Gets Tracked

Metric              Description
Token cost          Per-call USD cost by feature and model
Latency             p50 / p95 / p99 response times
Hallucination risk  LOW / MEDIUM / HIGH heuristic scoring
Prompt drift        Output variance over time
Error rate          Failed calls by feature
Prompt version      Per-version cost, latency, and hallucination score
Agent traces        Grouped runs with waterfall view
Request params      Full provider kwargs stored for replay

flush() — Before script exit

nirixa = NirixaClient(api_key="nirixa-your-key")
# ... your code ...
nirixa.flush()
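
If your process can exit along several paths, one option is to register flush() with the standard library's atexit hook so buffered events are always sent. A sketch; nirixa itself does not require this:

import atexit

from nirixa import NirixaClient

nirixa = NirixaClient(api_key="nirixa-your-key")
atexit.register(nirixa.flush)  # runs at interpreter shutdown, including sys.exit()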

Install with provider extras

pip install "nirixa[openai]"
pip install "nirixa[anthropic]"
pip install "nirixa[gemini]"
pip install "nirixa[all]"

निरीक्षा — Observe everything.
