Mukunda Katta

Username    mukundaraokatta

81 projects

llm-retry-py

Exponential backoff with full jitter for LLM API calls. Sync + async. Built-in retryable-code presets for Anthropic, OpenAI, Bedrock, Gemini. Zero runtime deps.
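
As a rough illustration of the technique named above (not the package's actual API), exponential backoff with full jitter can be sketched as follows. The function names `full_jitter_delay` and `retry_call`, their parameters, and the default values are all hypothetical.

```python
import random
import time


def full_jitter_delay(attempt, base=0.5, cap=30.0, rng=random.random):
    """Delay for a 0-indexed attempt: uniform in [0, min(cap, base * 2**attempt))."""
    return rng() * min(cap, base * (2 ** attempt))


def retry_call(fn, is_retryable, max_attempts=5, base=0.5, cap=30.0, sleep=time.sleep):
    """Call fn(), retrying on errors that is_retryable(exc) accepts.

    Hypothetical helper: the last failure, or any non-retryable one, is re-raised.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            if attempt == max_attempts - 1 or not is_retryable(exc):
                raise
            sleep(full_jitter_delay(attempt, base, cap))
```

Drawing the whole delay at random, rather than adding a small random offset to a fixed exponential delay, spreads concurrent clients across the full window so that they are less likely to retry in lockstep.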

llm-message-hash-py

Stable canonical sha256 hash of LLM request/message structures. Recursive key-sorted JSON canonicalization with per-provider presets that drop noise fields. For cache keys and idempotency. Zero runtime deps.
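
A minimal sketch of the idea described above, using only the standard library: serialize the structure as key-sorted compact JSON, then hash it with sha256. The function name `canonical_hash` and the `drop_keys` parameter are hypothetical stand-ins for the package's per-provider presets, whose real contents are not shown on this page.

```python
import hashlib
import json


def canonical_hash(obj, drop_keys=frozenset()):
    """Return a sha256 hex digest that ignores dict key order and dropped keys."""

    def strip(value):
        # Recursively remove noise fields (hypothetical: ids, timestamps, ...).
        if isinstance(value, dict):
            return {k: strip(v) for k, v in value.items() if k not in drop_keys}
        if isinstance(value, list):
            return [strip(v) for v in value]
        return value

    # sort_keys applies to nested dicts too; compact separators remove
    # whitespace differences between otherwise equal payloads.
    payload = json.dumps(
        strip(obj), sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

List order is left untouched in this sketch, since message order is normally significant in an LLM request.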

agentidemp-py

Idempotency keys for LLM agent retries. Deterministic content-derived keys (sha256-hex, UUIDv5, scoped). Pairs with cachebench miss-aware retry. Zero runtime deps.

tool-output-truncate-py

Truncate LLM tool output with head/tail/middle/middle_lines strategies. UTF-8 safe, zero runtime deps.
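
One way a "middle" strategy could work is sketched below, under the assumption that the budget is counted in characters. The name `truncate_middle`, its parameters, and the marker text are hypothetical and not taken from the package.

```python
def truncate_middle(text, max_chars, marker="\n...[truncated]...\n"):
    """Keep the head and tail of text and replace the middle with a marker.

    Slicing a Python str works on code points, so the result always
    encodes to valid UTF-8 (no multi-byte sequence is split).
    """
    if len(text) <= max_chars:
        return text
    if max_chars <= len(marker):
        # Budget too small to fit the marker: fall back to a plain head cut.
        return text[:max_chars]
    keep = max_chars - len(marker)
    head = (keep + 1) // 2  # the head gets the extra character on odd budgets
    tail = keep - head
    return text[:head] + marker + text[len(text) - tail:]
```

Keeping both ends tends to suit tool output such as logs, where the command echo sits at the top and the error summary at the bottom.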

birddog

Audited Bright Data egress for AI scraping agents. Domain allowlist, per-domain rate caps, JSONL audit, Streamlit dashboard, optional Web Unlocker proxy, and optional on-close Merkle attestation via mantle-agent-attest.

geminilens

Drop-in observability for Gemini agents: traces, cost, drift, egress allowlist. Ships exporters for Arize Phoenix, Splunk HEC, Elastic, GitLab Observability, MongoDB Atlas, Dynatrace, TrueFoundry.

mantle-agent-attest

Verifiable agent-run attestations on the Mantle EVM L2. Hash an agent's JSONL audit log into a Merkle root, sign it, post it on-chain, and prove a single run later.

recruitertriage

Triage recruiter outreach with a small (<1B) language model. Built for the HuggingFace Build Small Hackathon.

token-budget-py

Thread-safe shared token + USD budget for concurrent LLM tasks. Raises BudgetExceeded on push past cap. Reserve/commit two-phase API for fan-out workloads. Zero runtime deps.
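
The reserve/commit pattern mentioned above might look roughly like the sketch below. It tracks tokens only (the USD side is omitted), and the `TokenBudget` class, its method names, and the locally defined `BudgetExceeded` exception are illustrative assumptions rather than the package's real interface.

```python
import threading


class BudgetExceeded(Exception):
    """Local stand-in for the exception named in the description."""


class TokenBudget:
    """Hypothetical shared budget: reserve up front, commit actual usage later."""

    def __init__(self, cap):
        self._cap = cap
        self._used = 0
        self._reserved = 0
        self._lock = threading.Lock()

    def reserve(self, tokens):
        # Phase 1: claim headroom before starting a task.
        with self._lock:
            if self._used + self._reserved + tokens > self._cap:
                raise BudgetExceeded(f"cannot reserve {tokens} tokens")
            self._reserved += tokens
            return tokens

    def commit(self, reserved, actual):
        # Phase 2: release the reservation and record what was really spent.
        with self._lock:
            self._reserved -= reserved
            self._used += actual

    @property
    def remaining(self):
        with self._lock:
            return self._cap - self._used - self._reserved
```

Reserving before fan-out means that concurrent tasks cannot collectively overshoot the cap, and committing the actual usage afterwards returns any unused headroom to the pool.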

ragvitals

5-dimensional production drift detection for RAG systems.

ragdrift-py

5-dimensional drift detection for production RAG systems.

snipsplit

Token-aware text chunker for RAG ingestion. Sentence-respecting, overlap-friendly.

maskprompt

Sub-millisecond PII redaction for prompts before they reach an LLM.

bedrockstack

Low-level Python ergonomics for AWS Bedrock + Anthropic Claude: retries, cost ledger, streaming-error normalization.

toklab

Fast bulk tokenizer + token counter for OpenAI BPE encodings.

embedcache

Content-addressed local embedding cache. Skip duplicate embedding API calls.

agenttap

Wire-level prompt introspection for LLM SDK calls. See exactly what was sent, with credentials redacted by default. Anthropic, OpenAI, any httpx-based client.

llmfleet

Fleet-level batch dispatcher for LLM APIs. Pool requests across coroutines, route to provider Batch APIs, save 50% on cost without rewriting your agent loops.

bedrockcache

Audit and fix Anthropic prompt caching on AWS Bedrock through any abstraction stack.

cachebench

Prompt-cache observability for LLM APIs. Per-call hit ratios, cost saved, regression alerts, miss-aware retry. Anthropic + OpenAI + Bedrock.

bedrock-ops

Production-grade boto3 toolkit for AWS Bedrock: typed retry, per-model timeouts, capability lookup, full token usage with cache fields, PII-safe Guardrails.

embspec

Embedding pipeline ops + drift detection for production RAG: index manifests, version assertions, neighbor-stability eval, Drift-Adapter for in-place model migrations.

bedrock-kit

Small, opinionated AWS Bedrock client wrapper: adaptive throttle, cache-aware cost tracking, and structured-output parse-and-repair. Single-cloud, single-purpose.

driftvane

Compose drift detectors (embedding, retrieval, response, latency) into one report. Library-only, no server, no UI.

prompt-injection-shield-cli

CLI wrapper for prompt-injection-shield-py: scan a file or stdin for prompt-injection patterns.

ml-intern-lab

Tiny reproducible ML experiment runner.

agent-skills-playbook

Validate and render portable AI agent skills.

browser-research-agent

Research agent that turns sources into cited Markdown briefs.

personal-agent-harness

A tiny local-first personal AI agent harness.

llm-response-schema-lite-py

Tiny schema validator for structured LLM responses. Python port of @mukundakatta/llm-response-schema-lite.

prompt-version-diff-py

Diff prompt templates and flag risky instruction changes. Python port of @mukundakatta/prompt-version-diff.

prompt-token-trim-py

Trim prompt messages to fit a token budget while preserving priority. Python port of @mukundakatta/prompt-token-trim.

context-window-packer-py

Pack context chunks into a budget by relevance and priority. Python port of @mukundakatta/context-window-packer.

designlint-py

HTML/CSS accessibility and design linter: contrast, touch targets, headings, form labels, leaked secrets. Stdlib-only Python port of @mukundakatta/designlint.

context-forge-py

Context engineering toolkit for ranking, packing, and risk-scanning RAG context. Python port of @mukundakatta/context-forge.

context-drift-detector-py

Detect topic drift between user intent, retrieved context, and AI answers. Python port of @mukundakatta/context-drift-detector.

retrieval-acl-filter-py

Enforce document ACLs after retrieval and before prompting. Python port of @mukundakatta/retrieval-acl-filter.

rag-staleness-auditor-py

Find stale RAG chunks by age, version, and freshness requirements. Python port of @mukundakatta/rag-staleness-auditor.

skillint-py

Lint Claude Code SKILL.md files for frontmatter, required fields, descriptions, and hardcoded secrets. Stdlib-only Python port of @mukundakatta/skillint.

mcpcheck-py

Lint MCP config files for Claude Desktop, Claude Code, Cursor, Cline, Windsurf, and Zed. Stdlib-only Python port of @mukundakatta/mcpcheck.

kavach-py

Small, inspectable threat-scoring library for AI-app security monitoring. Zero-dep Python port of @mukundakatta/kavach.

consent-redaction-log-py

Record consent-aware redactions for privacy review trails. Zero-dep Python port of @mukundakatta/consent-redaction-log.

jailbreak-corpus-mini-py

Small local jailbreak and prompt-injection fixture set for tests. Python port of @mukundakatta/jailbreak-corpus-mini.

tool-result-taint-py

Track untrusted tool output before it enters prompts or actions. Python port of @mukundakatta/tool-result-taint.

ai-supply-chain-manifest-py

Build and validate lightweight AI model / data / tool manifests. Python port of @mukundakatta/ai-supply-chain-manifest.

model-router-policy-py

Policy-based model routing by capability, cost, latency, and privacy. Python port of @mukundakatta/model-router-policy.

model-fallback-planner-py

Plan model fallback chains from capability, cost, and health data. Python port of @mukundakatta/model-fallback-planner.

llm-trace-sampler-py

Sample LLM traces by risk, errors, latency, and deterministic ids. Python port of @mukundakatta/llm-trace-sampler.

eval-dataset-smith-py

Generate balanced AI eval fixtures from source examples, bugs, docs, and policies. Python port of @mukundakatta/eval-dataset-smith.

tool-permission-gate-py

Policy-check agent tool calls before execution. Python port of @mukundakatta/tool-permission-gate.

tool-call-contracts-py

Validate LLM tool-call payloads with small JSON-like contracts. Python port of @mukundakatta/tool-call-contracts.

agent-trajectory-replay-py

Replay and diff AI agent event trajectories for debugging regressions. Python port of @mukundakatta/agent-trajectory-replay.

agent-regression-lens-py

Detect regressions between baseline and current AI agent runs. Python port of @mukundakatta/agent-regression-lens.

agent-loop-breaker-py

Detect repeated agent steps and stop runaway loops. Python port of @mukundakatta/agent-loop-breaker.

mk-agentkit

The agent reliability stack in one install: agentfit + agentguard + agentsnap + agentvet + agentcast (Python ports).

embedding-dedupe

Deduplicate near-identical embedding records by cosine similarity. Pure Python, zero runtime deps. Python port of @mukundakatta/embedding-dedupe.

vector-poison-score

Score (query, document) pairs for vector/RAG poisoning signals: vector-text mismatch, instruction-like payloads, NaN, suspiciously round numbers. Python port of @mukundakatta/vector-poison-score.

rag-quality-kit

Heuristic quality metrics for RAG retrieval and grounded answers. Python port of @mukundakatta/rag-quality-kit.

llm-cost-guard-py

Estimate LLM request cost and enforce per-request or per-session budgets. Python port of @mukundakatta/llm-cost-guard.

eval-flake-detector

Detect flaky LLM eval cases across repeated runs. Pass-rate + standard-deviation per case, with per-case severity. Python port of @mukundakatta/eval-flake-detector.

semantic-cache-key

Stable semantic cache keys for LLM requests. Invariant to whitespace, casing, and key ordering; sensitive to model swaps, tool list, and retrieval context. Python port of @mukundakatta/semantic-cache-key.

system-prompt-leak-scan

Detect system prompt leakage in LLM model outputs via known patterns, configured-prompt substring matching, and unique fingerprint phrases. Python port of @mukundakatta/system-prompt-leak-scan.

hallucination-risk-meter

Estimate hallucination risk in LLM answers from uncertainty language, unsupported specifics, citations, and context coverage. Python port of @mukundakatta/hallucination-risk-meter.

citation-integrity-check

Verify answer citations refer to supplied source ids and that cited sources actually support the claims. Python port of @mukundakatta/citation-integrity-check.

llm-output-sanitizer-py

Sanitize LLM outputs before HTML, SQL, shell, or markdown sinks. Python port of @mukundakatta/llm-output-sanitizer.

prompt-injection-shield-py

Scan retrieved text for prompt-injection risk before adding it to model context. Python port of @mukundakatta/prompt-injection-shield.

pii-sentry-py

Detect and redact PII and secret-like values before logging or sending text to AI providers. Python port of @mukundakatta/pii-sentry.

partial-json-stream

Streaming JSON parser that yields partial valid trees as tokens arrive. For LLM tool calls, structured outputs, and partial recovery.

agentguard-firewall

Network egress firewall for AI agents. Declarative allow/deny list of hosts your agent tools may reach. Python port of @mukundakatta/agentguard.

agentcast-py

Structured output for any LLM call. Validate-and-retry loop for JSON responses; BYO LLM and validator. Python port of @mukundakatta/agentcast.

agentvet-py

Validate LLM-generated tool args before execution. Wraps tool functions with arg validation, raises ToolArgError with LLM-friendly retry hint. Python port of @mukundakatta/agentvet.

agentsnap-py

Snapshot tests for AI agents. Record an agent's tool-call trace, diff against a baseline, fail CI on regressions. Python port of @mukundakatta/agentsnap.

agentfit-py

Fit your messages into the LLM context window. Token-aware truncation with multiple strategies, pluggable tokenizers. Python port of @mukundakatta/agentfit.

agent-run-diff

Compare baseline vs current agent runs and surface regressions as structured reasons: success loss, new errors, failed tool calls, output drift, step/latency/cost bloat.

llm-usage-report

Parse LLM API response logs (Anthropic, OpenAI, Google) and generate token / cost reports. Supports a --alert-at budget alarm that exits non-zero when total cost exceeds a threshold. No framework adoption required.

ai-eval-forge

Zero-dependency eval harness for LLM and agent regression testing. Scores outputs with exact, contains, regex, JSON, citation, and token-F1 checks. Compares two runs to flag regressions.

codex-skill-kit

Scaffold and validate Codex skills from the command line.

claude-hooks-check

Linter for Claude Code hooks configuration (the 'hooks' block of settings.json). Validates event names, matcher shape, command entries, and flags dangerous commands or hardcoded secrets.

claude-commands-check

Linter for Claude Code slash-command files (.claude/commands/*.md). Validates YAML frontmatter, allowed-tools shape, description quality, and flags hardcoded secrets.

mcp-config-check

Linter for MCP (Model Context Protocol) config files used by Claude Desktop, Cursor, Cline, Windsurf, and Zed. CLI + library API.

claude-skill-check

Linter for Claude Code SKILL.md files. Validates YAML frontmatter, required fields, description length, and common secret patterns.
