Your Personal AI Cloud — intelligent proxy, router, and cache for LLMs

LLMHosts.com

Your hardware. Real AI infrastructure. From anywhere.

LLMHosts turns your local GPU into production AI infrastructure with intelligent routing, verified caching, and global access. One command (llmhosts serve) auto-detects your hardware, loads models, and exposes an OpenAI-compatible API. The SaaS platform at llmhosts.com provides cost tracking, plan management, and team features.

Two ways to use it:

  • Self-hosted CLI — pip install llmhosts and run it on your own hardware (FSL open-core; Rust inference crates are Apache-2.0 — see LICENSE-APACHE)
  • SaaS Platform — Sign up at llmhosts.com for cloud cost tracking, API key management, and team features

Licensing

LLMHosts uses an open-core model under the Functional Source License 1.1 (FSL-1.1-Apache-2.0). All components are FSL-licensed; the table below reflects open-core intent: components that are free for personal and non-competing use versus those whose use is restricted when it competes with our hosted service.

| Component | Intent | Converts to Apache 2.0 |
| --- | --- | --- |
| Local inference proxy & router | Open-core (non-competing use free) | 2028-02-24 |
| CLI tool (llmhosts) | Open-core (non-competing use free) | 2028-02-24 |
| Auto-discovery | Open-core (non-competing use free) | 2028-02-24 |
| Cloud tunnel management | Proprietary (competing use restricted) | 2028-02-24 |
| SaaS platform & billing | Proprietary (competing use restricted) | 2028-02-24 |
| Fleet orchestration (Token) | Proprietary (competing use restricted) | 2028-02-24 |

After 2028-02-24, all components convert to Apache 2.0 with no restrictions.


SaaS Platform (llmhosts.com)

Track your AI spending, manage API keys, and get real-time savings projections.

Live at: https://llmhosts.com

Features:

  • 📊 Cost tracking across OpenAI, Anthropic, Google AI, AWS Bedrock, Azure
  • 🔑 API key management with plan-based limits
  • 📈 12-month spending projections with confidence scoring
  • 💰 Real-time savings estimates (illustrative; actual savings depend on workload and routing)
  • 🎯 Gamified achievements for cost milestones
  • 💳 Stripe-powered billing (Pro $29/mo, Team $99/mo, Enterprise $299/mo)
  • 👥 Team management (coming soon)

Quick Start:

  1. Sign up at llmhosts.com
  2. Add your first cost entry
  3. Generate an API key for the CLI proxy
  4. Connect your self-hosted LLMHosts proxy to track usage

Self-Hosted CLI

Run the intelligent proxy on your own hardware.

pip install llmhosts
llmhosts serve

Point any OpenAI-compatible tool at http://localhost:4000/v1. Your tools now use your local GPU. Cost: $0.
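
For example, here is a minimal sketch using the official openai Python SDK against the local proxy. The model name below is a placeholder; substitute one your proxy actually serves:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000/v1", api_key="anything")  # any key works in local mode
resp = client.chat.completions.create(
    model="local-model",  # placeholder: use a model loaded by your proxy
    messages=[{"role": "user", "content": "Hello from my own GPU"}],
)
print(resp.choices[0].message.content)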


Why LLMHosts?

  • Cloud bills add up — Route Cursor, Claude Code, and Aider to your local GPU instead. Same tools, zero API spend.
  • Your hardware, your control — All inference runs on your machine. No data leaves your network unless you choose.
  • Works anywhere — llmhosts tunnel uses the built-in LLMHosts Relay (WSS + yamux + Noise NK encryption). Tailscale/Cloudflare are not part of the product path (removed per ADR-009).

Competitive Landscape

We are not entering an existing market — we are creating one. The market: personal and small-team AI infrastructure.

| Player | What They Do | Why They Lose |
| --- | --- | --- |
| OpenAI / Anthropic | Cloud API | 100x more expensive for the same hardware quality |
| Ollama | Local model runner | No remote access, no routing, no SaaS, no batching |
| LM Studio | Local GUI | Nowhere near production-ready |
| LocalAI | Self-hosted API | Technical, no UX, no moat |
| Replicate / Together | Hosted inference | Still cloud cost, no local hardware |
| LLMHosts | Infrastructure layer | Proxy + Core engine + relay + SaaS — see repo for shipped vs roadmap |

Competitive Moat

Five compounding advantages that deepen with every user:

| Layer | Name | What It Is |
| --- | --- | --- |
| 1 | First-Mover Position | Building this market category before competition arrives |
| 2 | Data Flywheel | Routing telemetry trains better models → better product → more users |
| 3 | Simplicity Moat | Works for gamers, researchers, founders — not just DevOps engineers |
| 4 | Self-Healing Infrastructure | CI and tooling aim for self-healing; some flows still need operator attention (see issues) |
| 5 | Ecosystem Lock-In | Token AI, Hardware Atlas, savings history = high switching cost |

Features

| Area | Description |
| --- | --- |
| Proxy | OpenAI + Anthropic compatible API on port 4000. Drop-in for any client. |
| Router | Three-tier design (rules → kNN → classifier). Shipped: rules + wiring; partial / in progress: FAISS/ONNX distribution and full ML tiers (see AGENTS.md honest completion). |
| Cache | Tiered cache design (exact → namespace → semantic). Shipped: exact hash path; in progress: full semantic/vCache tiers per roadmap. See the sketch after this table. |
| Tunnel | llmhosts tunnel — self-hosted LLMHosts Relay only (Noise NK). No Tailscale/Cloudflare in the supported product path (ADR-009). |
| Dashboard | TUI (terminal) + web UI at /dashboard. Live request flow, cache stats, model health. |
| BYOK | Bring your own cloud keys. Fallback to OpenAI/Anthropic when local models can't handle a request. |
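
The exact tier can be pictured as hashing a canonicalized request. This is an illustrative sketch, not the shipped implementation; the real cache key likely covers more fields:

import hashlib
import json

def cache_key(payload: dict) -> str:
    # Canonicalize so semantically identical requests hash identically.
    # Illustrative only: the shipped exact-hash path may differ.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

key = cache_key({"model": "local-model", "messages": [{"role": "user", "content": "hi"}]})
print(key)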

Quick Start

Install

Three tiers, pick what you need:

pip install llmhosts                    # Core (~50MB) — proxy, router, dashboard
pip install "llmhosts[smart]"          # Smart (~150MB) — + ML router, semantic cache
pip install "llmhosts[full]"           # Full (~2GB) — + PyTorch, full intelligence

Docker:

docker run -p 4000:4000 llmhosts/llmhosts
# GPU: docker run --gpus all -p 4000:4000 llmhosts/llmhosts

Start the Proxy

llmhosts serve

Starts the proxy on http://localhost:4000, auto-detects your GPU, loads models via the built-in Core engine, loads BYOK keys, and launches the TUI dashboard. Web dashboard at http://localhost:4000/dashboard.
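
As a quick smoke test, you can list the loaded models. This assumes the proxy exposes the standard OpenAI-compatible GET /v1/models endpoint:

import requests

# List models served by the local proxy (assumed OpenAI-compatible endpoint).
resp = requests.get("http://localhost:4000/v1/models", timeout=5)
resp.raise_for_status()
for model in resp.json().get("data", []):
    print(model["id"])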

Access from Anywhere

The differentiator: make your home GPU reachable from your laptop, phone, or office.

llmhosts tunnel

Uses the LLMHosts Relay (Rust, included in the wheel): WSS + multiplexing + Noise NK end-to-end encryption. You run or connect to a relay endpoint you control — the relay cannot read payload traffic. Tailscale / Cloudflare Tunnel are not supported fallbacks in current product docs (ADR-009).

llmhosts tunnel           # Start / manage relay-based remote access
llmhosts tunnel status    # Check tunnel status
llmhosts tunnel stop      # Stop active tunnel
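
Once the tunnel is up, remote clients talk to the same API through the relay. A sketch, assuming a hypothetical relay endpoint (substitute whatever llmhosts tunnel reports):

from openai import OpenAI

client = OpenAI(
    base_url="https://your-relay.example.com/v1",  # hypothetical endpoint; use the URL your tunnel reports
    api_key="anything",
)
resp = client.chat.completions.create(
    model="local-model",  # placeholder: use a model loaded by your proxy
    messages=[{"role": "user", "content": "ping from the road"}],
)
print(resp.choices[0].message.content)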

Works With Everything

Every tool that speaks OpenAI format works. Just set the base URL:

export OPENAI_API_BASE=http://localhost:4000/v1
# Some tools use: export OPENAI_BASE_URL=http://localhost:4000/v1
export OPENAI_API_KEY=anything   # LLMHosts accepts any key for local mode

| Tool | How |
| --- | --- |
| Cursor | Settings > Models > Custom endpoint: http://localhost:4000/v1 |
| Claude Code | Set OPENAI_API_BASE or configure the base URL in settings |
| Aider | aider --api-base http://localhost:4000/v1 |
| Continue.dev | Add an OpenAI-compatible provider with base URL http://localhost:4000/v1 |
| Open WebUI | Set the OpenAI API URL to http://localhost:4000/v1 |
| Any OpenAI client | base_url="http://localhost:4000/v1" in client config |

Architecture

Request  →  Proxy (4000)  →  Router  →  vCache  →  Backend
                │              │          │
                │              ├─ Tier 1: Rules
                │              ├─ Tier 2: kNN (FAISS + embeddings — partial ship)
                │              └─ Tier 3: ModernBERT → Qwen-0.5B (tiers vary by install)
                │
                ├─ Cache: exact hash (shipped) → namespace / semantic (roadmap)
                │
                └─ Backend: Core Engine | Cloud API (BYOK)
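
The routing cascade reads top to bottom: cheap deterministic rules first, then similarity lookup, then a learned classifier. A minimal illustrative sketch with stub logic, not the shipped router:

from dataclasses import dataclass
from typing import Optional

@dataclass
class Route:
    backend: str  # "core" (local engine) or "cloud" (BYOK fallback)
    model: str

def tier1_rules(prompt: str) -> Optional[Route]:
    # Tier 1: deterministic rules (shipped). Short prompts stay local.
    if len(prompt) < 500:
        return Route("core", "local-model")  # placeholder model name
    return None

def tier2_knn(prompt: str) -> Optional[Route]:
    # Tier 2: kNN over prompt embeddings (partial ship). Stubbed here.
    return None

def tier3_classifier(prompt: str) -> Route:
    # Tier 3: learned classifier (varies by install tier). Stubbed fallback.
    return Route("cloud", "gpt-4o-mini")  # placeholder BYOK fallback

def route(prompt: str) -> Route:
    # First tier to return a Route wins.
    return tier1_rules(prompt) or tier2_knn(prompt) or tier3_classifier(prompt)

print(route("Explain yamux in one sentence."))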

Commands

| Command | Description |
| --- | --- |
| llmhosts serve | Start proxy + dashboard |
| llmhosts tunnel | Start secure tunnel (LLMHosts Relay + Noise NK; ADR-009) |
| llmhosts tunnel status | Show tunnel status |
| llmhosts tunnel stop | Stop active tunnel |
| llmhosts doctor | Verify setup and dependencies |
| llmhosts setup | Interactive first-run wizard |
| llmhosts keys add <provider> <key> | Add a BYOK API key |
| llmhosts keys list | List configured providers |
| llmhosts keys validate | Validate stored keys |
| llmhosts cache stats | Cache hit rates and size |
| llmhosts cache clear | Clear the cache |
| llmhosts suggest-models | Recommend models for your hardware |

Dashboard

  • TUI — Built-in terminal UI when you run llmhosts serve. Live request flow, backends, cache activity.
  • Web — Browser dashboard at http://localhost:4000/dashboard. Request history, cache stats, model health.

Configuration

  • TOML — ~/.config/llmhosts/config.toml or --config path/to/config.toml
  • Env — LLMHOSTS_* prefixed variables
  • CLI — --host, --port, --no-tui, --log-level
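
For illustration, a minimal config.toml might look like the following. The key names are assumptions that mirror the CLI flags above, not a documented schema:

# Hypothetical ~/.config/llmhosts/config.toml; key names are assumptions
# mirroring the CLI flags above, not a documented schema.
host = "127.0.0.1"
port = 4000
log-level = "info"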

Development

docker compose run --rm dev   # run the dev container service
pip install -e ".[dev]"       # editable install with dev extras
llmhosts --version            # sanity-check the install
pytest tests/ -v              # run the test suite

Contributing

PRs welcome. Open an issue first for large changes. Run pytest tests/ and ruff check . before submitting.


License

  • Distribution / Python package: FSL-1.1-Apache-2.0 — see Licensing for open-core intent.
  • Open-source inference crates (llmhosts_core, relay_core, router_core): Apache-2.0 (SPDX headers in source).
