
AI Infrastructure OS: same engine, four facades (CLI, REST, web, MCP). Cut 80-95% of AI spend by routing each task to the right model.


OmniAgent

You are overspending on AI.

OmniAgent routes every AI task to the most efficient model automatically. Local first. Cloud only when it pays off. 80–97% savings on your AI bill.

Try the AI Cost Calculator · See the 60-second demo · Star on GitHub

MIT License · Python 3.11+ · Tests · Zero Telemetry · Local First · MIT 100%


The math most devs don't realize

"I just need AI to review my code, write docstrings, and rename things." — every developer with a $500/month Cursor + Claude bill

Here's what you actually need (benchmarked on real hardware, Jun 2026):

| Task | All-Claude reality | OmniAgent | Savings |
|---|---|---|---|
| Review a function for bugs | Claude · $0.30 | qwen2.5-coder:7b (local) · $0.00 | 100% |
| Write a Google-style docstring | Claude · $0.28 | qwen2.5-coder:7b (local) · $0.00 | 100% |
| Rename a variable | Claude · $0.15 | qwen2.5-coder:7b (local) · $0.00 | 100% |
| Explain TCP vs UDP | Claude · $0.10 | qwen2.5-coder:7b (local) · $0.00 | 100% |
| Classify a bug ticket | Claude · $0.08 | qwen2.5-coder:7b (local) · $0.00 | 100% |

Fleet benchmark · MSI desktop (GTX 1650, 4GB VRAM, 8 threads) · 5 tasks · 506 tokens: $0.00 total cloud spend.

OmniAgent uses Claude when Claude is the right tool. It just doesn't use Claude when Qwen can do the same job at 1% of the cost.
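The savings column can be re-derived from the per-task prices in the table. The snippet below is a minimal sketch of that arithmetic, not part of OmniAgent: the `TASKS` layout and the `savings_summary` name are made up for illustration, and the $0.00 figures count cloud spend only (local electricity and hardware are not priced in).

```python
# Per-task costs in USD, copied from the comparison table above.
# TASKS and savings_summary are illustrative names, not OmniAgent's API.
TASKS = {
    "Review a function for bugs": (0.30, 0.00),
    "Write a Google-style docstring": (0.28, 0.00),
    "Rename a variable": (0.15, 0.00),
    "Explain TCP vs UDP": (0.10, 0.00),
    "Classify a bug ticket": (0.08, 0.00),
}


def savings_summary(tasks):
    """Return (baseline_total, routed_total, savings_percent)."""
    baseline = sum(cloud for cloud, _ in tasks.values())
    routed = sum(local for _, local in tasks.values())
    # Guard against an empty or zero-cost baseline.
    pct = 0.0 if baseline == 0 else 100.0 * (baseline - routed) / baseline
    return round(baseline, 2), round(routed, 2), round(pct, 1)


if __name__ == "__main__":
    baseline, routed, pct = savings_summary(TASKS)
    print(f"All-Claude: ${baseline:.2f} | OmniAgent: ${routed:.2f} | saved {pct:.0f}%")
```

Swapping in your own per-task prices gives a quick estimate for a different workload.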


The 60-second demo

Real benchmark run on MSI desktop (GTX 1650, 4GB VRAM, 8 threads):

→ msi-node: qwen2.5-coder:7b | $0.00 | 30,123ms  (review function)
→ msi-node: qwen2.5-coder:7b | $0.00 | 50,475ms  (write docstring)
→ msi-node: qwen2.5-coder:7b | $0.00 | 10,181ms  (rename variable)
→ msi-node: qwen2.5-coder:7b | $0.00 | 15,321ms  (explain TCP/UDP)
→ msi-node: qwen2.5-coder:7b | $0.00 |  8,649ms  (classify ticket)

Total: 5 tasks · 506 tokens · $0.00 cloud spend · avg 22.95s/task
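To check the totals line against the run output, the log can be parsed and summarized in a few lines. This is a sketch that assumes the log format is exactly what is printed above; the regular expression and the function names are illustrative and not something OmniAgent ships.

```python
import re

# The five log lines from the benchmark run above, copied verbatim.
LOG = """\
→ msi-node: qwen2.5-coder:7b | $0.00 | 30,123ms  (review function)
→ msi-node: qwen2.5-coder:7b | $0.00 | 50,475ms  (write docstring)
→ msi-node: qwen2.5-coder:7b | $0.00 | 10,181ms  (rename variable)
→ msi-node: qwen2.5-coder:7b | $0.00 | 15,321ms  (explain TCP/UDP)
→ msi-node: qwen2.5-coder:7b | $0.00 |  8,649ms  (classify ticket)
"""

# Assumed line shape: "→ node: model | $cost | N,NNNms  (task)".
LINE_RE = re.compile(
    r"→\s*(?P<node>[\w-]+):\s*(?P<model>\S+)\s*\|\s*\$(?P<cost>[\d.]+)"
    r"\s*\|\s*(?P<ms>[\d,]+)ms\s*\((?P<task>[^)]*)\)"
)


def parse_run(log):
    """Turn each log line into a dict with typed cost and latency."""
    return [
        {
            "node": m["node"],
            "model": m["model"],
            "cost": float(m["cost"]),
            "ms": int(m["ms"].replace(",", "")),
            "task": m["task"],
        }
        for m in LINE_RE.finditer(log)
    ]


def summarize(rows):
    """Aggregate task count, cloud spend, and mean latency in seconds."""
    total_ms = sum(r["ms"] for r in rows)
    return {
        "tasks": len(rows),
        "cloud_spend": sum(r["cost"] for r in rows),
        "avg_seconds": round(total_ms / len(rows) / 1000, 2),
    }


if __name__ == "__main__":
    print(summarize(parse_run(LOG)))
```

The five latencies sum to 114,749 ms, which gives the 22.95 s average quoted in the totals line.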

Every task ran on local GPU. Zero cloud cost. That's what "AI Infrastructure OS" means.


Why OmniAgent exists

The AI industry is in an efficiency crisis:

  • 73% of prompts sent to frontier models could be handled by smaller local models
  • Developers burn $500–$1000/month on Cursor + Claude + GPT with no visibility into what each line costs
  • Agents hallucinate APIs, break production code, leak secrets, forget to commit — and you find out at 2 AM
  • Massive energy waste: sending trivial prompts to frontier models burns far more compute than the task requires
  • Lock-in: one IDE, one provider, one pricing tier
  • No coordination between local hardware, cloud APIs, VPS nodes, and the billions of idle GPUs sitting in garages and offices worldwide

The models will keep changing. The hardware will keep evolving. The only permanent problem is: how do you orchestrate all this intelligence efficiently, securely, and cheaply?

That's what OmniAgent solves.


What it is (and what it isn't)

OmniAgent is not a model. Not an agent. Not a chatbot.

OmniAgent is the operating system that coordinates the entire AI ecosystem — models, agents, hardware, costs, and security — so you stop wasting compute, money, and trust.

Think of it as:

  • Linux doesn't create every app, but everything runs on it.
  • Kubernetes doesn't build every container, but it orchestrates them all.
  • Steam doesn't develop every game, but it hosts them.

OmniAgent doesn't compete with OpenAI, Anthropic, DeepSeek, or your favorite open-source model. It makes all of them work together intelligently.


The 4 façades: one engine, four ways to use it

| Façade | Audience | What you get |
|---|---|---|
| CLI (omniagent route "task") | Developers, power users | Full control, scriptable, fits in any pipeline |
| Web app (omniagent web) | Everyone, especially non-devs | 5-tab dashboard on http://localhost:8765: visualize routing, hardware, optimize |
| YAML agents (*.yaml in ~/.omniagent/agents/) | Agent authors, teams | Declarative, shareable, version-controlled; see docs/agents.md |
| MCP tools (via any MCP client) | Tool integrators | 6 tools: route, classify, decide, audit, deploy, optimize |

Same Python engine. Four ways to use it. You pick the one that fits your workflow.


The 90/9/1 design

  • 90% of users never touch the CLI. They open http://localhost:8765, type a task, see the routing, hit Run it ▶.
  • 9% of users open the Optimize tab, see what they're overspending on, and one-click install a cheaper agent.
  • 1% of users write their own YAML agents, publish them, share them.

The dashboard is the product. The YAML is the protocol. The CLI is the power tool.


How it works (under the hood)

  1. Task arrives — text in the CLI, the web, or via MCP
  2. TaskClassifier — 10 categories, 5 complexities, detects vision / function-calling
  3. AgentRegistry — finds the right agent (project > user > builtin, YAML-defined)
  4. SmartRouter — picks the right model given the agent's constraints + your budget
  5. AdaptiveRouter — combines all of the above into a single RoutingDecision
  6. LLM call — local first, cloud only if budget + quality demand it
  7. CostTracker — logs the spend, feeds back into the next routing decision
  8. Guardian++ — pre / during / post audit on every action (secret scan, command sandbox, commit verification)
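
Steps 2 to 5 can be pictured as a small pipeline that turns a task string into one routing decision. The sketch below is a toy illustration under stated assumptions: the keyword classifier, the registry contents, the `cloud-frontier` model, and every price and threshold are hypothetical stand-ins, and only the ideas (project > user > builtin lookup, local first, cloud when budget and complexity call for it) come from the list above. OmniAgent's real classes will differ.

```python
from dataclasses import dataclass

# Every name, price, and rule here is a hypothetical stand-in for illustration;
# none of it is taken from OmniAgent's source code.


@dataclass(frozen=True)
class Model:
    name: str
    local: bool
    est_cost: float      # assumed USD per task
    max_complexity: int  # highest complexity (1-5) this model is trusted with


@dataclass(frozen=True)
class RoutingDecision:
    category: str
    complexity: int
    agent: str
    model: str
    est_cost: float


MODELS = [
    Model("qwen2.5-coder:7b", local=True, est_cost=0.00, max_complexity=3),
    Model("cloud-frontier", local=False, est_cost=0.30, max_complexity=5),
]

# Agent scopes, searched in the order described above: project > user > builtin.
REGISTRY = {
    "project": {},
    "user": {"code": "my-reviewer"},
    "builtin": {"code": "code-reviewer", "general": "generalist"},
}


def classify(task: str) -> tuple[str, int]:
    """Toy keyword classifier returning (category, complexity 1-5)."""
    text = task.lower()
    is_code = any(w in text for w in ("code", "function", "bug", "rename"))
    is_hard = any(w in text for w in ("design", "architecture"))
    return ("code" if is_code else "general", 5 if is_hard else 2)


def find_agent(category: str) -> str:
    """Return the first agent registered for the category, by scope priority."""
    for scope in ("project", "user", "builtin"):
        if category in REGISTRY[scope]:
            return REGISTRY[scope][category]
    return REGISTRY["builtin"]["general"]


def pick_model(complexity: int, budget: float) -> Model:
    """Local first; cloud only when the task needs it and the budget allows."""
    capable = [
        m for m in MODELS
        if m.max_complexity >= complexity and m.est_cost <= budget
    ]
    if not capable:
        # Nothing capable fits the budget: fall back to the strongest local model.
        capable = [max((m for m in MODELS if m.local), key=lambda m: m.max_complexity)]
    # Prefer local over cloud, then the cheapest option.
    return min(capable, key=lambda m: (not m.local, m.est_cost))


def route(task: str, budget: float) -> RoutingDecision:
    """Combine classification, agent lookup, and model choice into one decision."""
    category, complexity = classify(task)
    agent = find_agent(category)
    model = pick_model(complexity, budget)
    return RoutingDecision(category, complexity, agent, model.name, model.est_cost)


if __name__ == "__main__":
    print(route("review this code for security", budget=0.10))
    print(route("design a cache", budget=0.50))
```

Keeping the decision as a plain immutable record means the same result can be shown by a CLI, a dashboard, or a tool call without re-running the routing logic.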

364 unit tests + 7 integration tests validate every step.
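
The secret scan mentioned in step 8 is the easiest audit to picture in code. Below is a minimal sketch of a line-by-line regex scan, assuming three illustrative patterns; it is not Guardian++'s implementation, and a production scanner would cover many more credential formats and add entropy checks.

```python
import re

# Illustrative patterns only (assumptions, not Guardian++'s rule set).
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
    "hardcoded_credential": re.compile(
        r"\b(?:api[_-]?key|secret|token|password)\b\s*[:=]\s*['\"][^'\"]{8,}['\"]",
        re.IGNORECASE,
    ),
}


def scan_for_secrets(text):
    """Return (line_number, pattern_label) for every suspicious line."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label))
    return findings


if __name__ == "__main__":
    sample = 'x = 1\napi_key = "abcd1234efgh"\nkey = "AKIAIOSFODNN7EXAMPLE"\n'
    for lineno, label in scan_for_secrets(sample):
        print(f"line {lineno}: possible {label}")
```

Running a check like this before an agent writes or commits a file is one way to stop a leaked key early; blocking versus warning is a policy choice left to the caller.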


Quickstart (60 seconds)

git clone https://github.com/landrover1984/omniagent
cd omniagent
pip install -e .
omniagent web
# open http://localhost:8765

Or use the CLI directly:

omniagent agent-route "review this code for security" --budget 0.10
omniagent agent-list                     # see all available agents
omniagent agent-install ./my-agent.yaml  # add your own
omniagent optimize                       # find cheaper routes
omniagent cost-report                    # what you've spent
omniagent agent-decide "design a cache"  # see the routing (no LLM call)

Zero API keys needed to start. Local models via Ollama work out of the box.
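
If routing falls through to the cloud unexpectedly, a first thing to check is whether Ollama is running and has the model pulled. The sketch below queries Ollama's own HTTP API (`GET /api/tags` on its default port 11434) with the standard library; it does not use any OmniAgent API, and the helper names are illustrative. The model name is the one used in the benchmark above.

```python
import json
import urllib.request

OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"  # Ollama's default local endpoint


def installed_models(payload: dict) -> list[str]:
    """Extract model names from an Ollama /api/tags response body."""
    return [m["name"] for m in payload.get("models", [])]


def has_model(payload: dict, wanted: str) -> bool:
    """True if the wanted model tag is present in the response."""
    return wanted in installed_models(payload)


def fetch_tags(url: str = OLLAMA_TAGS_URL, timeout: float = 3.0) -> dict:
    """Fetch the list of locally available models from a running Ollama server."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    wanted = "qwen2.5-coder:7b"
    try:
        payload = fetch_tags()
    except OSError as exc:  # covers connection refused and timeouts
        print(f"Ollama not reachable: {exc}")
    else:
        if has_model(payload, wanted):
            print(f"{wanted} is available locally")
        else:
            print(f"{wanted} not found; try: ollama pull {wanted}")
```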


What ships today

| Layer | Status | Tests |
|---|---|---|
| Agent Protocol (YAML agents) | Shipped | 18 |
| Task Classifier (10 categories) | Shipped | 20 |
| AdaptiveRouter (the brain) | Shipped | 8 |
| 5-tab Web UI | Shipped | 13 endpoints |
| Cost Optimizer (the killer feature) | Shipped | 3 |
| Anti-Hallucination Audit (Guardian++) | Shipped | 23 |
| Hybrid Deploy (local / VPS / AWS) | Shipped | 28 |
| MCP Server (6 tools) | Shipped | 18 |
| CLI commands | 20+ | 50+ |
| Total | | 364 passing, 2 skipped |

Roadmap

| Phase | Theme | Status |
|---|---|---|
| v0.1.x | AI Infrastructure OS: routing, cost, optimize, local-first | Shipped |
| v0.2.x | Optimization Layer: replay mode, "Claude unnecessary" detector, savings reports | Next |
| v0.3.x | Visual Dashboard: real-time cost graphs, agent analytics, team view | Planned |
| v0.4.x | Distributed Compute: idle GPU federation, opt-in mesh | Deferred |
| v0.5.x | Marketplace + Incentives: community YAMLs, reputation, rewards | Deferred |

We are not building another "AI wrapper". We are building the coordination layer that the entire AI ecosystem needs.

Distributed compute and marketplace are real, but they're not the wedge. The wedge is: stop overspending on AI. Get that right first.


License

MIT — 100% open source, forever. No paid tier, no "enterprise edition", no bait-and-switch.


The models will change. The hardware will change. The coordination layer is permanent.

Star on GitHub · Try the Cost Calculator · Write your first agent
