Save 30% on AI agent costs. One line of code. No accuracy loss.

AgentSave — Save 30% on AI agent costs. One line of code.

The first AI agent efficiency platform. Drop-in Python supervisor + real-time cost dashboard + inference router. Zero accuracy loss.

AgentSave Dashboard Overview

flowchart LR
    SDK["💡 pip install agentsave"]
    WRAP["supervise(agent)"]
    SUP["Supervisor\nContext Filter + Early Exit\n+ Budget Gate"]
    TEL["TelemetryClient\n(opt-in, zero PII)"]
    API["Dashboard Backend\nFastAPI + SQLite"]
    UI["agentsave-ui\nNext.js Dashboard"]
    IR["InferRoute\nDocker Sidecar"]
    VLLM["vLLM / SGLang\nCluster"]

    SDK --> WRAP
    WRAP --> SUP
    SUP -->|async fire-and-forget| TEL
    TEL --> API
    API --> UI
    SUP -->|"~30% token reduction"| WRAP
    IR -->|"~68% TTFT reduction"| VLLM
    API -.->|Enterprise tier| IR

🔥 The Problem

Every LLM agent wastes 30–50% of tokens on irrelevant tool outputs — inflating costs with no accuracy gain
Agents over-iterate past diminishing returns, burning tokens on iterations that add nothing
Developers have zero visibility into which agents, models, and frameworks are costing them the most

⚡ The Solution

SDK Layer

Install with pip install agentsave, then wrap any agent with supervise(agent). The supervisor filters irrelevant context, exits early when further iterations show diminishing returns, and enforces a budget gate, delivering ~30% token reduction with no changes to your agent's internals.
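
The early-exit and budget-gate ideas can be illustrated with a small, self-contained loop. This is only a sketch under stated assumptions, not AgentSave's implementation: the names run_supervised and RunReport, the step callable, and the min_gain / patience parameters are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical step signature: given the iteration index, run one agent
# iteration and return (progress_score, tokens_used_this_iteration).
Step = Callable[[int], Tuple[float, int]]


@dataclass
class RunReport:
    iterations: int
    tokens_used: int
    stop_reason: str  # "budget", "early_exit", or "max_iterations"


def run_supervised(
    step: Step,
    token_budget: int,
    min_gain: float = 0.01,
    patience: int = 2,
    max_iterations: int = 20,
) -> RunReport:
    """Run `step` repeatedly, stopping on budget exhaustion or stalled progress."""
    tokens = 0
    best = float("-inf")
    stale = 0  # consecutive iterations whose gain was below min_gain
    for i in range(max_iterations):
        score, used = step(i)
        tokens += used
        # Budget gate: checked after each iteration, so the final
        # iteration may overshoot the budget slightly.
        if tokens >= token_budget:
            return RunReport(i + 1, tokens, "budget")
        # Early exit: stop once `patience` iterations in a row add
        # less than `min_gain` over the best score seen so far.
        if score - best < min_gain:
            stale += 1
            if stale >= patience:
                return RunReport(i + 1, tokens, "early_exit")
        else:
            stale = 0
        best = max(best, score)
    return RunReport(max_iterations, tokens, "max_iterations")
```

In this sketch the caller supplies the progress score. A real supervisor would have to derive that signal from the agent's own trajectory, which the text above does not specify.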

Dashboard Layer

Real-time cost tracking across every run, with a per-framework breakdown, an hourly activity heatmap, and an interactive cost projector to forecast monthly savings.
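
The arithmetic behind a monthly savings projection is simple enough to sketch. The function below is a hypothetical illustration, not the dashboard's code. It assumes a single flat price per million tokens and uses the ~30% headline reduction as its default; real pricing usually differs between input and output tokens.

```python
def project_monthly_savings(
    runs_per_day: int,
    avg_tokens_per_run: int,
    usd_per_million_tokens: float,
    reduction: float = 0.30,  # assumed default, taken from the ~30% headline figure
    days: int = 30,
) -> dict:
    """Estimate baseline cost, savings, and projected cost over `days` days."""
    baseline_tokens = runs_per_day * avg_tokens_per_run * days
    baseline_usd = baseline_tokens / 1_000_000 * usd_per_million_tokens
    saved_usd = baseline_usd * reduction
    return {
        "baseline_usd": round(baseline_usd, 2),
        "saved_usd": round(saved_usd, 2),
        "projected_usd": round(baseline_usd - saved_usd, 2),
    }
```

For example, 1,000 runs per day at 20,000 tokens each and $3 per million tokens gives a baseline of $1,800 per month, of which a 30% reduction saves $540.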

InferRoute Layer

PPD (append-prefill decode) routing for multi-turn agent workloads, delivering ~68% Turn 2+ TTFT reduction. Available on the Enterprise tier as a Docker sidecar in front of your vLLM / SGLang cluster.
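
Routing of this kind depends on telling a first turn apart from a follow-up turn. The sketch below shows one plausible way to make that distinction for OpenAI-style chat requests. It is an assumption for illustration only: the function name classify_turn and the rule "any prior assistant message means Turn 2+" are hypothetical, and the sketch does not implement the PPD scoring itself.

```python
from typing import Dict, List


def classify_turn(messages: List[Dict[str, str]]) -> str:
    """Label a chat request as 'turn_1' or 'turn_2_plus'.

    Assumes OpenAI-style messages, each a dict with a "role" key.
    A request whose history already contains an assistant message
    is treated as a follow-up turn.
    """
    prior_assistant = sum(1 for m in messages if m.get("role") == "assistant")
    return "turn_1" if prior_assistant == 0 else "turn_2_plus"
```

A router could use this label to decide whether a request is likely to share a long prefix with an earlier one, which is where a Turn 2+ TTFT gain would come from.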

🎬 In Action

Overview dashboard — real-time savings stats with animated counters
Analytics — token reduction trend over time (area/line/bar toggle)
Agent Runs — full run history with framework badges and reduction %
Cost Projector — interactive sliders to project monthly savings
Live Activity Feed — real-time agent run stream
Hourly Heatmap — GitHub-style activity grid
Command Palette — instant navigation and actions (⌘K)
Billing — Free / Pro / Enterprise tiers

🚀 Quick Start

Install the SDK: pip install git+https://github.com/aks-builds/agentsave.git

Wrap your agent:

from agentsave import supervise
agent = supervise(your_agent)

Run your agent normally — token savings happen automatically
Connect dashboard: agentsave login
View savings: agentsave status
(Optional) Start dashboard backend: cd agentsave-dashboard && uvicorn agentsave_dashboard.main:app
(Enterprise) Deploy InferRoute: docker run -p 8080:8080 agentsave/inferroute:latest

📦 Installation

SDK (all agent frameworks):

To install the latest source directly from GitHub:

# Install from GitHub (current)
pip install git+https://github.com/aks-builds/agentsave.git

# With specific framework support:
pip install "agentsave[langchain] @ git+https://github.com/aks-builds/agentsave.git"
pip install "agentsave[all] @ git+https://github.com/aks-builds/agentsave.git"

To install the v0.1.0 release from PyPI:

pip install agentsave

# With specific framework support:
pip install "agentsave[langchain]"     # LangChain + LangGraph
pip install "agentsave[autogen]"       # AutoGen
pip install "agentsave[crewai]"        # CrewAI
pip install "agentsave[smolagents]"    # Smolagents
pip install "agentsave[all]"           # All frameworks

Dashboard Backend:

git clone https://github.com/aks-builds/agentsave-dashboard
cd agentsave-dashboard
pip install -e ".[dev]"
uvicorn agentsave_dashboard.main:app --port 8000

Dashboard UI:

git clone https://github.com/aks-builds/agentsave-ui
cd agentsave-ui
npm install
npm run dev   # http://localhost:3000

InferRoute (Enterprise, requires Docker):

docker run -d -p 8080:8080 \
  -e BACKEND_URL=http://your-vllm:8000 \
  -e BACKEND_TYPE=vllm \
  -e AGENTSAVE_TOKEN=$ENTERPRISE_TOKEN \
  agentsave/inferroute:latest

🏗 Architecture

Drop-in, zero-modification: supervise(agent) wraps any agent framework without touching internals
LLM-free context filter: TF-IDF cosine similarity — no extra API calls, <1ms overhead per observation
ICLR 2026 research-backed: 29.68% token reduction on GAIA benchmark (arXiv:2510.26585)
Five framework adapters: LangChain, LangGraph, AutoGen, CrewAI, Smolagents — all tested
InferRoute PPD routing: ~68% Turn 2+ TTFT reduction via append-prefill decode routing (ICML 2026, arXiv:2603.13358)
Opt-in telemetry: zero PII — only run_id, framework, model, token counts, success flag
Self-hostable: dashboard backend is MIT-licensed FastAPI + SQLite; InferRoute is a Dockerfile drop-in
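
To make the LLM-free context filter concrete, here is a minimal standard-library sketch of TF-IDF cosine similarity used to drop tool observations unrelated to the task. It is an illustration under stated assumptions, not AgentSave's code: the tokenizer, the smoothed IDF formula, the 0.1 threshold, and the function names are all hypothetical choices.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    # Lowercase alphanumeric tokens; a deliberately simple tokenizer.
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs: List[str]) -> List[Dict[str, float]]:
    """Build sparse TF-IDF vectors, with IDF computed over `docs` only."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: number of docs containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed IDF, log((1 + n) / (1 + df)) + 1, keeps shared terms nonzero.
        vectors.append({
            t: (c / len(toks)) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, c in tf.items()
        })
    return vectors


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_observations(
    task: str, observations: List[str], threshold: float = 0.1
) -> List[str]:
    """Keep observations whose similarity to the task meets the threshold."""
    vecs = tfidf_vectors([task] + observations)
    task_vec, obs_vecs = vecs[0], vecs[1:]
    return [
        obs for obs, vec in zip(observations, obs_vecs)
        if cosine(task_vec, vec) >= threshold
    ]
```

Because everything here is counting and arithmetic over short strings, no model call is involved. An observation that shares no terms with the task scores exactly 0 and is dropped at any positive threshold.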

🗺 Roadmap

v0.2:

JavaScript/TypeScript SDK for Node.js agent frameworks
Real-time WebSocket events for the live feed
Team workspaces with RBAC

v0.3:

OpenAI Responses API adapter
Anthropic tool_use adapter
Cost anomaly alerts (email + webhook when a run exceeds threshold)

Tracked as GitHub Issues.

📁 Project Structure

agentsave/              ← SDK (this repo)
├── agentsave/          ← Python package
│   ├── core/           ← context filter, early exit, budget gate, supervisor
│   ├── adapters/       ← LangChain, LangGraph, AutoGen, CrewAI, Smolagents
│   ├── telemetry/      ← opt-in async telemetry client
│   └── cli/            ← agentsave login/status/config
└── tests/              ← 60 unit tests

agentsave-dashboard/    ← FastAPI + SQLite backend
├── agentsave_dashboard/
│   ├── routers/        ← /api/events, /api/metrics, /api/tokens, /api/billing
│   └── services/       ← metrics aggregation, Stripe billing
└── tests/              ← 47 tests

agentsave-ui/           ← Next.js 16 dashboard
├── app/
│   ├── components/     ← StatCard, charts, RunsTable, ActivityFeed, CommandPalette
│   └── (routes)/       ← /, /analytics, /runs, /frameworks, /cost, /settings
└── tests/e2e/          ← 30 Playwright tests

agentsave-inferroute/   ← Enterprise inference router
├── inferroute/
│   ├── classifier.py   ← Turn 1 vs Turn 2+ detection
│   ├── router.py       ← PPD scoring function
│   └── adapters/       ← vLLM + SGLang
└── tests/              ← 59 tests

🤝 Contributing

See CONTRIBUTING.md for setup instructions, code style, and the PR checklist.

📄 License

This version

0.1.0

Jun 23, 2026

Download files

Source Distribution

agentsave-0.1.0.tar.gz (904.5 kB)

Uploaded Jun 23, 2026 Source

Built Distribution

agentsave-0.1.0-py3-none-any.whl (18.6 kB)

Uploaded Jun 23, 2026 Python 3

File details

Details for the file agentsave-0.1.0.tar.gz.

File metadata

Download URL: agentsave-0.1.0.tar.gz
Upload date: Jun 23, 2026
Size: 904.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for agentsave-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`2066d0f77bc26b7a8c695ebd56b9dc43140cd20cc7cd3fdd5187fc0ac9612a73`
MD5	`292c57d876567a534f0f00f8cc8a9cc9`
BLAKE2b-256	`96955a00c12af42831909997ce3b68c62df7f8511732a05c2e764b26b6bbe68d`

File details

Details for the file agentsave-0.1.0-py3-none-any.whl.

File metadata

Download URL: agentsave-0.1.0-py3-none-any.whl
Upload date: Jun 23, 2026
Size: 18.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for agentsave-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`3684d9d8b4b7580d518b433d3957549ed6d3ff671538b60fd21dba76c9e7365f`
MD5	`c8f342c96dc5e336743c8c92adab78b8`
BLAKE2b-256	`247165a2f7e29443823f092742d2452ccab26679385a04718f31f6cc7604a07e`
