AI Infrastructure OS: same engine, four facades (CLI, REST, web, MCP). Cut AI spend by 80-95% by routing each task to the right model.

OmniAgent

You are overspending on AI.

OmniAgent routes every AI task to the most efficient model automatically. Local first. Cloud only when it pays off. 80–97% savings on your AI bill.

Try the AI Cost Calculator · See the 60-second demo · Star on GitHub

The math most devs don't realize

"I just need AI to review my code, write docstrings, and rename things." — every developer with a $500/month Cursor + Claude bill

Here's what you actually need (benchmarked on real hardware, Jun 2026):

| Task | All-Claude reality | OmniAgent | Savings |
|---|---|---|---:|
| Review a function for bugs | Claude · $0.30 | qwen2.5-coder:7b (local) · $0.00 | 100% |
| Write a Google-style docstring | Claude · $0.28 | qwen2.5-coder:7b (local) · $0.00 | 100% |
| Rename a variable | Claude · $0.15 | qwen2.5-coder:7b (local) · $0.00 | 100% |
| Explain TCP vs UDP | Claude · $0.10 | qwen2.5-coder:7b (local) · $0.00 | 100% |
| Classify a bug ticket | Claude · $0.08 | qwen2.5-coder:7b (local) · $0.00 | 100% |

Fleet benchmark · MSI desktop (GTX 1650, 4GB VRAM, 8 threads) · 5 tasks · 506 tokens: $0.00 total cloud spend.

OmniAgent uses Claude when Claude is the right tool. It just doesn't use Claude when Qwen can do the same job at 1% of the cost.

The 60-second demo

Real benchmark run on MSI desktop (GTX 1650, 4GB VRAM, 8 threads):

→ msi-node: qwen2.5-coder:7b | $0.00 | 30,123ms  (review function)
→ msi-node: qwen2.5-coder:7b | $0.00 | 50,475ms  (write docstring)
→ msi-node: qwen2.5-coder:7b | $0.00 | 10,181ms  (rename variable)
→ msi-node: qwen2.5-coder:7b | $0.00 | 15,321ms  (explain TCP/UDP)
→ msi-node: qwen2.5-coder:7b | $0.00 |  8,649ms  (classify ticket)

Total: 5 tasks · 506 tokens · $0.00 cloud spend · avg 22.95s/task
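
The average in the total line can be reproduced from the five per-task latencies. A minimal Python check, using only the numbers printed in the run above (the variable names are illustrative and not part of OmniAgent):

```python
# Per-task latencies in milliseconds, copied from the benchmark run above.
latencies_ms = {
    "review function": 30_123,
    "write docstring": 50_475,
    "rename variable": 10_181,
    "explain TCP/UDP": 15_321,
    "classify ticket": 8_649,
}

total_ms = sum(latencies_ms.values())
avg_seconds = total_ms / len(latencies_ms) / 1000

# Prints: total: 114.75s, average: 22.95s/task
print(f"total: {total_ms / 1000:.2f}s, average: {avg_seconds:.2f}s/task")
```

The spread matters as much as the average: on this hardware the docstring task took about 50 seconds, so the zero cloud cost is paid for in latency.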

Every task ran on local GPU. Zero cloud cost. That's what "AI Infrastructure OS" means.

Weekly AI Intelligence — `omniagent post-mortem`

"You don't need a budget. You need to see what you spent."

v0.2.0 adds the killer first-run experience: a persistent cost ledger + a Weekly AI Intelligence report.

Every `omniagent agent-route` and `omniagent fleet route` call is now logged to `~/.omniagent/postmortem/ledger.db`. Then run:

omniagent post-mortem                  # last 7 days
omniagent post-mortem --period month   # last 30 days
omniagent post-mortem --period all     # all time
omniagent post-mortem --json | jq      # pipe to tools
omniagent post-mortem -o weekly.md     # save the report
omniagent post-mortem --demo           # inject sample data and see the report
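
To consume the report from a script, one option is to shell out to the CLI and parse the `--json` output. The sketch below is a rough illustration: the helper names are hypothetical, it assumes `--json` can be combined with `--period`, and it makes no assumption about the JSON fields because the schema is not documented here.

```python
import json
import subprocess
from typing import Any, List, Optional


def postmortem_argv(period: Optional[str] = None) -> List[str]:
    """Build the command line for `omniagent post-mortem --json`.

    `period` is passed to --period ("month" or "all" in the examples
    above); None keeps the default window of the last 7 days.
    Combining --json with --period is an assumption.
    """
    argv = ["omniagent", "post-mortem", "--json"]
    if period is not None:
        argv += ["--period", period]
    return argv


def load_postmortem(period: Optional[str] = None) -> Any:
    """Run the CLI and return the parsed JSON report (schema not assumed)."""
    result = subprocess.run(
        postmortem_argv(period),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)


# Example (requires omniagent on PATH):
#   report = load_postmortem("month")
#   print(json.dumps(report, indent=2))
```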

Sample output (with --demo):

# 🧠 Weekly AI Intelligence

_Generated 2026-06-03 · Period: last 7 days_

## 💰 Top-line numbers

| Metric | Value |
|---|---:|
| Tasks run | 10 |
| Tokens (in + out) | 23,770 |
| **Total cost** | **$0.3046** |
| ↳ Local | $0.0000 |
| ↳ Cloud | $0.3046 |
| All-Claude-Sonnet equivalent | $0.1777 |
| **Savings vs all-Claude** | **-71.4%** ⚠️ |

> 💡 You spent $0.3046 on cloud models.
> $0.2894 of that (≈95%) probably could have been local.

## ⚡ Top optimization opportunities

**Total potential savings: $0.0991**

### 1. claude-sonnet-4 → qwen2.5-coder:7b
- Calls: 6 · Tokens: 12,570
- Actual cost: $0.0936 · Could have been: $0.0000
- Savings: **$0.0936** · Risk: low

### 2. gpt-4o → qwen2.5-coder:7b
- Calls: 1 · Tokens: 1,000
- Actual cost: $0.0055 · Could have been: $0.0000
- Savings: **$0.0055** · Risk: low

### 3. claude-opus-4-5 → no alt
- Calls: 1 · Tokens: 5,300
- Actual cost: $0.2055 · Could have been: $0.2055
- Savings: $0.0000 · Risk: high (frontier reasoning)

The "risk" field is honest. Frontier models (Opus, o1) get risk=high and local_alternative=null — you really did need that model. Trivial and simple tasks get risk=low and a concrete local alternative. No misleading savings claims.
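
The headline percentage and the potential-savings total in the sample report follow from simple arithmetic. A small Python check, assuming savings are computed relative to the all-Claude-Sonnet baseline (the formula is inferred from the numbers shown, not taken from OmniAgent's source):

```python
from decimal import Decimal

actual_cost = Decimal("0.3046")    # "Total cost" row
baseline_cost = Decimal("0.1777")  # "All-Claude-Sonnet equivalent" row

# Savings relative to the baseline; negative means the baseline was cheaper.
savings_pct = (baseline_cost - actual_cost) / baseline_cost * 100

# Per-opportunity savings from the three entries in the report.
opportunity_savings = [Decimal("0.0936"), Decimal("0.0055"), Decimal("0.0000")]
total_potential = sum(opportunity_savings, Decimal("0"))

print(f"savings vs all-Claude: {savings_pct:.1f}%")       # -71.4%
print(f"total potential savings: ${total_potential}")     # $0.0991
```

The high-risk Opus entry contributes nothing to the total because it has no local alternative.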

Same data is available from the web: GET /api/postmortem?period=week.
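
A minimal sketch of calling that endpoint from Python with the standard library. It assumes `omniagent web` is running on its default address and that the endpoint returns JSON. Only `period=week` is confirmed above, and the function names are illustrative.

```python
import json
from typing import Any
from urllib.parse import urlencode
from urllib.request import urlopen

# Default address of `omniagent web`, as given in this README.
BASE_URL = "http://localhost:8765"


def postmortem_url(period: str = "week", base_url: str = BASE_URL) -> str:
    """Return the URL for GET /api/postmortem with the given period."""
    return f"{base_url}/api/postmortem?{urlencode({'period': period})}"


def fetch_postmortem(period: str = "week", timeout: float = 10.0) -> Any:
    """Fetch and decode the report; requires the local server to be running."""
    with urlopen(postmortem_url(period), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (needs the local server):
#   report = fetch_postmortem("week")
#   print(json.dumps(report, indent=2))
```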

Why OmniAgent exists

The AI industry is in an efficiency crisis:

- 73% of prompts sent to frontier models could be handled by smaller local models
- Developers burn $500–$1000/month on Cursor + Claude + GPT with no visibility into what each line costs
- Agents hallucinate APIs, break production code, leak secrets, and forget to commit, and you find out at 2 AM
- Massive energy waste: frontier-model inference is energy-hungry, and much of it is spent on tasks a small local model could handle
- Lock-in: one IDE, one provider, one pricing tier
- No coordination between local hardware, cloud APIs, VPS nodes, and the idle GPUs sitting in garages and offices worldwide

The models will keep changing. The hardware will keep evolving. The only permanent problem is: how do you orchestrate all this intelligence efficiently, securely, and cheaply?

That's what OmniAgent solves.

What it is (and what it isn't)

OmniAgent is not a model. Not an agent. Not a chatbot.

OmniAgent is the operating system that coordinates the entire AI ecosystem — models, agents, hardware, costs, and security — so you stop wasting compute, money, and trust.

Think of it as:

Linux doesn't create every app, but everything runs on it.
Kubernetes doesn't build every container, but it orchestrates them all.
Steam doesn't develop every game, but it hosts them.

OmniAgent doesn't compete with OpenAI, Anthropic, DeepSeek, or your favorite open-source model. It makes all of them work together intelligently.

The 4 façades: one engine, four ways to use it

| Façade | Audience | What you get |
|---|---|---|
| CLI (`omniagent route "task"`) | Developers, power users | Full control, scriptable, fits in any pipeline |
| Web app (`omniagent web`) | Everyone, especially non-devs | 5-tab dashboard at `http://localhost:8765`: visualize routing, inspect hardware, optimize spend |
| YAML agents (`*.yaml` in `~/.omniagent/agents/`) | Agent authors, teams | Declarative, shareable, version-controlled; see docs/agents.md |
| MCP tools (via any MCP client) | Tool integrators | 6 tools: route, classify, decide, audit, deploy, optimize |

Same Python engine. Four ways to use it. You pick the one that fits your workflow.

The 90/9/1 design

90% of users never touch the CLI. They open http://localhost:8765, type a task, see the routing, hit Run it ▶.
9% of users open the Optimize tab, see what they're overspending on, and one-click install a cheaper agent.
1% of users write their own YAML agents, publish them, share them.

The dashboard is the product. The YAML is the protocol. The CLI is the power tool.

How it works (under the hood)

1. Task arrives: text in the CLI, the web, or via MCP
2. TaskClassifier: 10 categories, 5 complexities, detects vision / function-calling
3. AgentRegistry: finds the right agent (project > user > builtin, YAML-defined)
4. SmartRouter: picks the right model given the agent's constraints + your budget
5. AdaptiveRouter: combines all of the above into a single RoutingDecision
6. LLM call: local first, cloud only if budget + quality demand it
7. CostTracker: logs the spend, feeds back into the next routing decision
8. Guardian++: pre / during / post audit on every action (secret scan, command sandbox, commit verification)

414 unit tests + 13 integration tests validate every step.
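
The classify, lookup, and route stages can be pictured with a toy sketch. Everything below is hypothetical: the category names, the 1-5 complexity scale, the prices, and the function signatures are invented for illustration and are not OmniAgent's API. Only the stage names, the model names, and the project > user > builtin precedence come from this README. CostTracker and Guardian++ are left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Model:
    name: str
    local: bool
    cost_per_task: float  # illustrative flat price in USD (made up)
    max_complexity: int   # highest complexity (1-5) this model is trusted with


@dataclass
class RoutingDecision:
    category: str
    complexity: int
    agent: str
    model: Optional[str]  # None when nothing fits the budget
    estimated_cost: float


# Hypothetical catalogue: names from this README, prices and limits invented.
MODELS: List[Model] = [
    Model("qwen2.5-coder:7b", local=True, cost_per_task=0.0, max_complexity=2),
    Model("claude-sonnet-4", local=False, cost_per_task=0.05, max_complexity=4),
    Model("claude-opus-4-5", local=False, cost_per_task=0.20, max_complexity=5),
]

# Agent lookup order: project overrides user, user overrides builtin.
AGENT_SCOPES = ("project", "user", "builtin")


def classify(task: str) -> Tuple[str, int]:
    """Toy stand-in for TaskClassifier: keyword rules, not a real classifier."""
    text = task.lower()
    if "design" in text or "architect" in text:
        return "architecture", 4
    if "review" in text:
        return "code-review", 2
    return "general", 1


def find_agent(category: str, registry: Dict[str, Dict[str, str]]) -> str:
    """Toy stand-in for AgentRegistry: first scope defining the category wins."""
    for scope in AGENT_SCOPES:
        if category in registry.get(scope, {}):
            return registry[scope][category]
    return "default"


def pick_model(complexity: int, budget: float) -> Optional[Model]:
    """Toy stand-in for SmartRouter: capable, within budget, local first."""
    candidates = [
        m for m in MODELS
        if m.max_complexity >= complexity and m.cost_per_task <= budget
    ]
    # Local models sort ahead of cloud ones, then cheaper before pricier.
    candidates.sort(key=lambda m: (not m.local, m.cost_per_task))
    return candidates[0] if candidates else None


def route(task: str, budget: float,
          registry: Dict[str, Dict[str, str]]) -> RoutingDecision:
    """Toy stand-in for AdaptiveRouter: combine the stages into one decision."""
    category, complexity = classify(task)
    agent = find_agent(category, registry)
    model = pick_model(complexity, budget)
    return RoutingDecision(
        category=category,
        complexity=complexity,
        agent=agent,
        model=model.name if model else None,
        estimated_cost=model.cost_per_task if model else 0.0,
    )


registry = {
    "user": {"code-review": "my-reviewer"},
    "builtin": {"code-review": "reviewer", "architecture": "architect"},
}
print(route("review this code for security", budget=0.10, registry=registry))
print(route("design a cache", budget=0.10, registry=registry))
```

With these made-up numbers, the review task resolves to the user-level agent and the local model, while the design task exceeds the local model's complexity limit and goes to the cheapest cloud model that fits the budget.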

Quickstart (60 seconds)

git clone https://github.com/landrover1984/omniagent
cd omniagent
pip install -e .
omniagent web
# open http://localhost:8765

Or use the CLI directly:

omniagent agent-route "review this code for security" --budget 0.10
omniagent agent-list                     # see all available agents
omniagent agent-install ./my-agent.yaml  # add your own
omniagent optimize                       # find cheaper routes
omniagent post-mortem                    # weekly AI intelligence
omniagent agent-decide "design a cache"  # see the routing (no LLM call)

Zero API keys needed to start. Local models via Ollama work out of the box.

What ships today

| Layer | Status | Tests |
|---|---|---|
| Agent Protocol (YAML agents) | Shipped | 18 |
| Task Classifier (10 categories) | Shipped | 20 |
| AdaptiveRouter (the brain) | Shipped | 8 |
| 5-tab Web UI + Post-Mortem API | Shipped | 16 endpoints |
| Cost Optimizer (the killer feature) | Shipped | 3 |
| Post-Mortem (Weekly AI Intelligence) | Shipped v0.2.0 | 47 |
| Anti-Hallucination Audit (Guardian++) | Shipped | 23 |
| Hybrid Deploy (local / VPS / AWS) | Shipped | 28 |
| MCP Server (6 tools) | Shipped | 18 |
| Private Fleet (multi-node) | Shipped v0.1.4 | 10 |
| CLI commands | 25+ | 70+ |
| Total | | 414 passing, 2 skipped |

Roadmap

| Phase | Theme | Status |
|---|---|---|
| v0.1.x | AI Infrastructure OS: routing, cost, optimize, local-first | Shipped |
| v0.2.0 | Weekly AI Intelligence: persistent cost ledger, post-mortem reports, savings opportunities | Shipped |
| v0.2.x | Agent Generator: code → custom YAML agents | Next |
| v0.3.x | AI Firewall: privacy, PII detection, compliance mode | Planned |
| v0.4.x | Visual Dashboard: real-time cost graphs, agent analytics, team view | Planned |
| v0.5.x | Distributed Compute: idle GPU federation, opt-in mesh | Deferred |
| v0.6.x | Marketplace + Incentives: community YAMLs, reputation, rewards | Deferred |

We are not building another "AI wrapper". We are building the coordination layer that the entire AI ecosystem needs.

Distributed compute and marketplace are real, but they're not the wedge. The wedge is: stop overspending on AI. Get that right first.

License

MIT — 100% open source, forever. No paid tier, no "enterprise edition", no bait-and-switch.

The models will change. The hardware will change. The coordination layer is permanent.

Star on GitHub · Try the Cost Calculator · Write your first agent

Release history

- 0.2.1 (Jun 3, 2026)
- 0.2.0 (Jun 3, 2026), this version
- 0.1.4 (Jun 2, 2026)
Download files

Source Distribution

omniagent_fleet-0.2.0.tar.gz (136.0 kB), uploaded Jun 3, 2026, Source

Built Distribution

omniagent_fleet-0.2.0-py3-none-any.whl (141.6 kB), uploaded Jun 3, 2026, Python 3

File details

Details for the file omniagent_fleet-0.2.0.tar.gz.

File metadata

Download URL: omniagent_fleet-0.2.0.tar.gz
Upload date: Jun 3, 2026
Size: 136.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.11

File hashes

Hashes for omniagent_fleet-0.2.0.tar.gz

| Algorithm | Hash digest |
|---|---|
| SHA256 | `5fcd867fc9cdb32938b6d33c9772ec331be9aba4fee8f602d0d7551a567c7ae1` |
| MD5 | `df545d8bfed8e368dafb67142ee57f75` |
| BLAKE2b-256 | `d77aaff1f42056dd0b3465c33353eb3f4310c91267776e643a6a1d024da2a5a3` |
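
The published digest can be checked after downloading. A minimal sketch using Python's standard `hashlib`. The helper names are illustrative, and the file path is assumed to be wherever you saved the download.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for omniagent_fleet-0.2.0.tar.gz.
EXPECTED_SHA256 = "5fcd867fc9cdb32938b6d33c9772ec331be9aba4fee8f602d0d7551a567c7ae1"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Compare a downloaded file against the published digest."""
    return sha256_of(path) == expected.lower()


# Example:
#   matches_published_hash(Path("omniagent_fleet-0.2.0.tar.gz"))
```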

File details

Details for the file omniagent_fleet-0.2.0-py3-none-any.whl.

File metadata

Download URL: omniagent_fleet-0.2.0-py3-none-any.whl
Upload date: Jun 3, 2026
Size: 141.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.11

File hashes

Hashes for omniagent_fleet-0.2.0-py3-none-any.whl

| Algorithm | Hash digest |
|---|---|
| SHA256 | `3f1fa0dab3153e91f29deeba4c5e24607f241dad978d662473f47ac0eb6796e1` |
| MD5 | `27d26799471b3edc3dfec02a39901042` |
| BLAKE2b-256 | `eaf2f125eb7d11cf056fea0dcdadcb4574195046a1703f41931f9c10e9980282` |
