24 projects
graphsift
graphsift: intelligent code context selection for LLMs. AST dependency graph, BM25 + graph-ranked relevance, token-budget-aware selection, 14 languages, decorator and dynamic-import edges, 80-150x token reduction. Beats code-review-graph (F1 0.85 vs 0.54).
tokenpruner
Slash LLM input tokens by 70-80% — compress prompts, code, and conversations for Claude, GPT-4, and any LLM without losing meaning
scope3track
Carbon and Scope 3 emissions tracking — GHG Protocol, emission hotspot analysis, Net Zero roadmap generation, SBTi alignment, CSRD-ready reporting
royaltyguard
Creator royalty tracking and streaming fraud detection — bot streams, zero-rate payouts, DSP reconciliation, earnings forecasting, fraud pattern library
inventra
Multi-channel inventory sync for eCommerce — real-time conflict resolution, reorder point calculation, ABC/XYZ inventory analysis, demand forecasting, oversell prevention
cyberscorecard
SMB cybersecurity governance scorecard — CIS Controls v8, Zero Trust scoring, IR playbook generation, threat intelligence feed, attack surface mapping, compliance gap analysis
returnguard
Returns fraud detection for retail and eCommerce: wardrobing and serial-returner detection, refund anomaly detection, behavioral fingerprinting, policy simulation
llm-injection-guard
Drop-in prompt injection defense for LLM apps and AI agents — detect, sanitize, block, and audit injection attacks in real time. Includes multi-turn session scanning, allow-lists, rate-abuse detection, multi-layer scanner, FastAPI and Flask middleware.
promptci
Prompt versioning with CI/CD regression gates — version, test, diff, and deploy prompts with quality gates, schema evolution, PII scrubbing, and full observability
trajscore
Production-grade agentic trajectory evaluation — score multi-step AI agent runs on goal completion, tool accuracy, step efficiency, reasoning coherence, loop detection, and faithfulness
llm-token-optimizer
Token cost control and auto-optimization for LLM apps — compress prompts, estimate costs, enforce budgets, route to cheap models, and cut LLM spend by up to 60%
llm-watchdog
Production-grade silent failure detection for LLM applications — hallucination alerts, PII leak detection, semantic drift, topic guard, and real-time observability
llm-extractor
Extract structured, validated JSON from any LLM — OpenAI, Anthropic, Gemini — with batch extraction, caching, per-field confidence scoring, schema evolution, multi-schema extraction, output transforms, partial extraction, extraction diff, pipeline extraction, and smart auto-retry.
agentguard-llm
Production-grade fault tolerance for AI agents — circuit breakers, LLM-aware retry, idempotency, loop detection, fallback chains, async support, health monitoring, and budget enforcement for LangChain, AutoGen, CrewAI, and any LLM pipeline
llmgrader
Open-source LLM evaluation framework — 50+ metrics for RAG, agents, safety, async eval, regression tracking, custom benchmarks, and exportable reports
pandasv2
pandas drop-in replacement: JSON serialization, DataFrame pipeline, diff tracking, column validation, streaming export, caching, FastAPI/Flask/Django integration
numpy2
Pure-Python NumPy drop-in: full NumPy API + JSON serialization, array compression, pipeline transforms, schema validation, zero dependencies
providercontract
Cross-provider schema contract testing for LLMs. Define once, validate everywhere — OpenAI, Anthropic, Mistral, LiteLLM and any JSON-returning model.
semanticheck
pytest-native semantic assertions for LLM and generative AI applications. No servers. No SaaS. Works with OpenAI, Anthropic, LiteLLM and any LLM client.
promptfiles
LLM prompts as versioned YAML files — git-trackable, renderable, and diffable. Works with OpenAI, Anthropic, LiteLLM, and any LLM client.
genassert
pytest-native semantic testing for LLM and generative AI applications. No servers. No SaaS. Works with OpenAI, Anthropic, LiteLLM and any LLM client.
pandas-numpy-lib
A library that combines pandas and NumPy functionality.
custom-magics
Custom Magic for Jupyter AI
create-testing-pypi-maheshmakwana787
Streaming video data via networks