Save 30% on AI agent costs. One line of code. No accuracy loss.
AgentSave: Save 30% on AI agent costs. One line of code.
The first AI agent efficiency platform. Drop-in Python supervisor + real-time cost dashboard + inference router. Zero accuracy loss.
flowchart LR
SDK["pip install agentsave"]
WRAP["supervise(agent)"]
SUP["Supervisor\nContext Filter + Early Exit\n+ Budget Gate"]
TEL["TelemetryClient\n(opt-in, zero PII)"]
API["Dashboard Backend\nFastAPI + SQLite"]
UI["agentsave-ui\nNext.js Dashboard"]
IR["InferRoute\nDocker Sidecar"]
VLLM["vLLM / SGLang\nCluster"]
SDK --> WRAP
WRAP --> SUP
SUP -->|async fire-and-forget| TEL
TEL --> API
API --> UI
SUP -->|"~30% token reduction"| WRAP
IR -->|"~68% TTFT reduction"| VLLM
API -.->|Enterprise tier| IR
The Problem
- Every LLM agent wastes 30–50% of tokens on irrelevant tool outputs, inflating costs with no accuracy gain
- Agents over-iterate past diminishing returns, burning tokens on iterations that add nothing
- Developers have zero visibility into which agents, models, and frameworks are costing them the most
The Solution
SDK Layer
Run pip install agentsave, then wrap any agent with supervise(agent). The supervisor filters irrelevant context, exits early on diminishing returns, and enforces a budget gate, delivering ~30% token reduction with zero changes to your agent's internals.
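To make the early-exit and budget-gate behaviour concrete, here is a minimal sketch of the idea. It is not AgentSave's implementation: `StepResult`, `run_supervised`, and the `min_gain` and `patience` parameters are hypothetical names chosen for illustration, and the progress score is assumed to come from the caller.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StepResult:
    """One agent iteration: tokens spent and a progress score (hypothetical type)."""
    tokens: int
    score: float


def run_supervised(
    step: Callable[[int], StepResult],
    max_steps: int = 10,
    token_budget: int = 10_000,
    min_gain: float = 0.01,
    patience: int = 2,
) -> List[StepResult]:
    """Run `step` until the early-exit rule or the budget gate stops the loop."""
    history: List[StepResult] = []
    spent = 0
    best = float("-inf")
    stalled = 0
    for i in range(max_steps):
        result = step(i)
        history.append(result)
        spent += result.tokens
        # Early exit: count consecutive iterations that improve on the
        # best score so far by less than `min_gain`.
        if result.score - best < min_gain:
            stalled += 1
        else:
            stalled = 0
        best = max(best, result.score)
        if stalled >= patience:
            break
        # Budget gate: do not start another iteration once the budget is spent.
        if spent >= token_budget:
            break
    return history
```

With scores of 0.2, 0.5, 0.505, 0.506, the loop stops after the fourth iteration because the last two gains fall below `min_gain`.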
Dashboard Layer
Real-time cost tracking across every run, with a per-framework breakdown, an hourly activity heatmap, and an interactive cost projector to forecast monthly savings.
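The arithmetic behind a monthly projection can be sketched as follows. This is an illustrative calculation, not the dashboard's code: the function name, its parameters, and the 30-day month are assumptions, and the 30% default simply mirrors the headline figure above.

```python
def project_monthly_savings(
    runs_per_day: float,
    tokens_per_run: float,
    price_per_million_tokens: float,
    reduction: float = 0.30,
    days_per_month: int = 30,
) -> dict:
    """Estimate baseline cost, savings, and projected cost for one month."""
    monthly_tokens = runs_per_day * days_per_month * tokens_per_run
    baseline = monthly_tokens / 1_000_000 * price_per_million_tokens
    savings = baseline * reduction
    return {
        "baseline": round(baseline, 2),
        "savings": round(savings, 2),
        "projected": round(baseline - savings, 2),
    }
```

For example, 1,000 runs per day at 20,000 tokens each and a price of 3.00 per million tokens gives a baseline of 1,800.00 per month, of which a 30% reduction saves 540.00.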
InferRoute Layer
PPD (append-prefill decode) routing for multi-turn agent workloads, delivering ~68% Turn 2+ TTFT reduction. Available on the Enterprise tier as a Docker sidecar in front of your vLLM / SGLang cluster.
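Routing by turn first requires telling a Turn 1 request from a Turn 2+ request. One plausible way to do that, assuming OpenAI-style chat requests with a `messages` list, is to check for a prior assistant reply. The function below is a hypothetical sketch and says nothing about how InferRoute's own classifier or its PPD scoring actually work.

```python
from typing import Dict, List


def classify_turn(messages: List[Dict[str, str]]) -> str:
    """Return "turn_1" if no assistant reply exists yet, else "turn_2_plus"."""
    has_prior_reply = any(m.get("role") == "assistant" for m in messages)
    return "turn_2_plus" if has_prior_reply else "turn_1"
```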
In Action
- Overview dashboard: real-time savings stats with animated counters
- Analytics: token reduction trend over time (area/line/bar toggle)
- Agent Runs: full run history with framework badges and reduction %
- Cost Projector: interactive sliders to project monthly savings
- Live Activity Feed: real-time agent run stream
- Hourly Heatmap: GitHub-style activity grid
- Command Palette: instant navigation and actions (⌘K)
- Billing: Free / Pro / Enterprise tiers
Quick Start
- Install the SDK:
pip install git+https://github.com/aks-builds/agentsave.git
- Wrap your agent:
from agentsave import supervise
agent = supervise(your_agent)
- Run your agent normally; token savings happen automatically
- Connect dashboard:
agentsave login
- View savings:
agentsave status
- (Optional) Start dashboard backend:
cd agentsave-dashboard && uvicorn agentsave_dashboard.main:app
- (Enterprise) Deploy InferRoute:
docker run -p 8080:8080 agentsave/inferroute:latest
Installation
SDK (all agent frameworks):
v0.1.0 is not yet on PyPI. Install directly from GitHub until the first release is tagged and published:
# Install from GitHub (current)
pip install git+https://github.com/aks-builds/agentsave.git
# With specific framework support:
pip install "git+https://github.com/aks-builds/agentsave.git#egg=agentsave[langchain]"
pip install "git+https://github.com/aks-builds/agentsave.git#egg=agentsave[all]"
Once v0.1.0 is released to PyPI (trigger with Actions → release → Run workflow):
pip install agentsave
# With specific framework support:
pip install "agentsave[langchain]" # LangChain + LangGraph
pip install "agentsave[autogen]" # AutoGen
pip install "agentsave[crewai]" # CrewAI
pip install "agentsave[smolagents]" # Smolagents
pip install "agentsave[all]" # All frameworks
Dashboard Backend:
git clone https://github.com/aks-builds/agentsave-dashboard
cd agentsave-dashboard
pip install -e ".[dev]"
uvicorn agentsave_dashboard.main:app --port 8000
Dashboard UI:
git clone https://github.com/aks-builds/agentsave-ui
cd agentsave-ui
npm install
npm run dev # http://localhost:3000
InferRoute (Enterprise, requires Docker):
docker run -d -p 8080:8080 \
-e BACKEND_URL=http://your-vllm:8000 \
-e BACKEND_TYPE=vllm \
-e AGENTSAVE_TOKEN=$ENTERPRISE_TOKEN \
agentsave/inferroute:latest
Architecture
- Drop-in, zero-modification: supervise(agent) wraps any agent framework without touching internals
- LLM-free context filter: TF-IDF cosine similarity, with no extra API calls and <1ms overhead per observation
- ICLR 2026 research-backed: 29.68% token reduction on GAIA benchmark (arXiv:2510.26585)
- Five framework adapters: LangChain, LangGraph, AutoGen, CrewAI, Smolagents โ all tested
- InferRoute PPD routing: ~68% Turn 2+ TTFT reduction via append-prefill decode routing (ICML 2026, arXiv:2603.13358)
- Opt-in telemetry: zero PII; only run_id, framework, model, token counts, success flag
- Self-hostable: dashboard backend is MIT-licensed FastAPI + SQLite; InferRoute is a Dockerfile drop-in
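The LLM-free context filter listed above relies on TF-IDF cosine similarity. The sketch below shows that technique in plain Python. It is an illustration under stated assumptions, not the package's code: the function names, the smoothed IDF formula, and the 0.1 threshold are all hypothetical choices.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs: List[str]) -> List[Dict[str, float]]:
    """Build one sparse TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF, so a term present in every document keeps a non-zero weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def filter_observations(task: str, observations: List[str], threshold: float = 0.1) -> List[str]:
    """Keep only observations whose similarity to the task meets the threshold."""
    vectors = tfidf_vectors([task] + observations)
    task_vec, obs_vecs = vectors[0], vectors[1:]
    return [o for o, v in zip(observations, obs_vecs) if cosine(task_vec, v) >= threshold]
```

An observation that shares no terms with the task scores 0.0 and is dropped, which is the cheap case this kind of filter handles without another model call.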
Roadmap
v0.2:
- JavaScript/TypeScript SDK for Node.js agent frameworks
- Real-time WebSocket events for the live feed
- Team workspaces with RBAC
v0.3:
- OpenAI Responses API adapter
- Anthropic tool_use adapter
- Cost anomaly alerts (email + webhook when a run exceeds threshold)
Tracked as GitHub Issues.
Project Structure
agentsave/                   # SDK (this repo)
├── agentsave/               # Python package
│   ├── core/                # context filter, early exit, budget gate, supervisor
│   ├── adapters/            # LangChain, LangGraph, AutoGen, CrewAI, Smolagents
│   ├── telemetry/           # opt-in async telemetry client
│   └── cli/                 # agentsave login/status/config
└── tests/                   # 60 unit tests
agentsave-dashboard/         # FastAPI + SQLite backend
├── agentsave_dashboard/
│   ├── routers/             # /api/events, /api/metrics, /api/tokens, /api/billing
│   └── services/            # metrics aggregation, Stripe billing
└── tests/                   # 47 tests
agentsave-ui/                # Next.js 16 dashboard
├── app/
│   ├── components/          # StatCard, charts, RunsTable, ActivityFeed, CommandPalette
│   └── (routes)/            # /, /analytics, /runs, /frameworks, /cost, /settings
└── tests/e2e/               # 30 Playwright tests
agentsave-inferroute/        # Enterprise inference router
├── inferroute/
│   ├── classifier.py        # Turn 1 vs Turn 2+ detection
│   ├── router.py            # PPD scoring function
│   └── adapters/            # vLLM + SGLang
└── tests/                   # 59 tests
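The telemetry package is described as opt-in and limited to run_id, framework, model, token counts, and a success flag. A payload with exactly those fields could be modelled as below. This is a hypothetical sketch: `RunEvent`, `build_payload`, and the `tokens_before` / `tokens_after` field names are assumptions, not the SDK's actual schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class RunEvent:
    """Only the non-PII fields the README lists; no prompts or outputs."""
    run_id: str
    framework: str
    model: str
    tokens_before: int  # hypothetical name for the unfiltered token count
    tokens_after: int   # hypothetical name for the filtered token count
    success: bool


def build_payload(event: RunEvent, enabled: bool = False) -> Optional[str]:
    """Serialize the event only when telemetry is explicitly enabled (opt-in)."""
    if not enabled:
        return None
    return json.dumps(asdict(event), sort_keys=True)
```

Defaulting `enabled` to False keeps the sketch opt-in: nothing is serialized unless the caller asks for it.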
Contributing
See CONTRIBUTING.md for setup instructions, code style, and the PR checklist.
License
MIT © 2026 Aditya Kumar Singh