Ollama for free cloud inference. Local OpenAI-compatible gateway routing across OpenRouter, Groq, NVIDIA NIM, Cloudflare Workers AI, HuggingFace, Cerebras, and your own Ollama with automatic failover.
Project description
FreeRide
Ollama for free cloud inference.
A local OpenAI-compatible gateway that routes across every free-tier provider you have a key for — OpenRouter, Groq, NVIDIA NIM, Cloudflare Workers AI, HuggingFace, Cerebras, and your own Ollama. Hits a rate limit, fails over. Your agent never knows.
Install
macOS / Linux:
curl -sSL https://api.free-ride.xyz/install.sh | sh
Windows (PowerShell):
powershell -ExecutionPolicy ByPass -c "irm https://api.free-ride.xyz/install.ps1 | iex"
Then:
freeride init # interactive — collects keys, writes ~/.freeride/.env
freeride serve # gateway listens on localhost:11343
Point any OpenAI-shaped client at http://localhost:11343/v1 with OPENAI_API_KEY=any. That's it.
The installer bootstraps uv if missing, then uv tool installs freeride-gateway. Binary lands at ~/.local/bin/freeride (Linux/macOS) or %USERPROFILE%\.local\bin\freeride.exe (Windows). Same shape as the bun.sh and astral.sh installers.
Or install manually
# uv (what the installer does)
uv tool install --prerelease=allow freeride-gateway
# pipx
pipx install --pip-args=--pre freeride-gateway
# pip + venv (the venv only — re-activate per shell)
python3 -m venv .venv && source .venv/bin/activate
pip install --pre freeride-gateway
# from source
git clone https://github.com/Shaivpidadi/FreeRideV3 && cd FreeRideV3
pip install -e .
PyPI distribution: freeride-gateway. CLI: freeride. Python ≥ 3.10.
Get keys (any one is enough; more = better failover)
| Provider | Where | Env var |
|---|---|---|
| OpenRouter | https://openrouter.ai/keys | OPENROUTER_API_KEY |
| Groq | https://console.groq.com/keys | GROQ_API_KEY |
| NVIDIA NIM | https://build.nvidia.com | NVIDIA_API_KEY |
| Cloudflare Workers AI | https://dash.cloudflare.com/profile/api-tokens | CLOUDFLARE_API_TOKEN + CLOUDFLARE_ACCOUNT_ID |
| HuggingFace | https://huggingface.co/settings/tokens | HF_TOKEN |
| Cerebras | https://cloud.cerebras.ai/platform | CEREBRAS_API_KEY |
| Ollama (local) | https://ollama.com/download | OLLAMA_BASE_URL=http://localhost:11434 |
Set whichever you have, then freeride serve. The gateway picks them up and rotates between them.
Or use the wizard: freeride init writes ~/.freeride/.env for you. The gateway auto-loads that file at startup — no manual source needed.
Wire your agent
The fastest way is a binder:
freeride bind aider # writes ~/.aider.conf.yml
freeride bind continue # writes ~/.continue/config.yaml
freeride bind hermes # writes ~/.hermes/config.yaml
freeride bind openclaw # writes ~/.openclaw/openclaw.json
Or set the OpenAI vars yourself:
export OPENAI_API_BASE=http://localhost:11343/v1
export OPENAI_API_KEY=any
Anything OpenAI-shaped works. Tested with the openai-python SDK, Aider, Continue, Hermes, OpenClaw.
Multi-key rotation
Got several free keys for the same provider? Pass them as a JSON array:
export OPENROUTER_API_KEY='["sk-or-v1-key1","sk-or-v1-key2","sk-or-v1-key3"]'
When key 1 hits 429 it goes on cooldown for 120s; key 2 takes the next request. Cooldowns persist across restarts (~/.freeride/cooldown.json).
How failover works
Per request, FreeRide walks (provider, key) pairs in order:
RATE_LIMITorAUTH→ mark this key cooling, try the next key.MODEL_NOT_FOUND→ skip this provider, try the next provider.- Anything 5xx-ish → next pair.
- First successful response → ship it; stamp
X-FreeRide-Providerheader (or_freeride_providerfield on JSON) so you can tell who actually served it.
Streaming uses buffer-first-chunk failover: hold the first SSE event until upstream confirms the stream is real. If it fails before the first chunk, retry. After the first chunk has shipped, mid-stream errors propagate (rare; documented).
Telemetry
On by default. Hourly POST to https://telemetry.free-ride.xyz/v1/beacon:
{
"installation_id": "random-uuid-v4",
"version": "0.3.0",
"os": "darwin",
"tokens_served": 412034,
"request_count": 187,
"providers_active": ["openrouter", "groq"],
"uptime_hours": 8
}
Prompts, completions, model IDs, API keys, hostnames, IPs — never sent. The Worker doesn't log cf-connecting-ip. The first time you run any freeride command a banner prints the exact payload.
freeride telemetry off # turn it off
freeride telemetry # show what would be sent
Embeddings
Same endpoint shape as OpenAI's /v1/embeddings. Failover across the
4 providers that support embeddings (Groq doesn't):
curl http://localhost:11343/v1/embeddings \
-H 'Content-Type: application/json' \
-d '{"model": "text-embedding-3-small", "input": "hello world"}'
The same X-FreeRide-Provider header tells you which provider served
the embedding. Same multi-key rotation, same per-provider failover.
See what FreeRide is doing
freeride watch
Tails live failover events from a running gateway. Every request, every provider attempt, every rate-limit, every retry. Useful for seeing failover happen in real time, debugging "is my agent actually using FreeRide", or just demoing.
[14:23:01.412] req_a3f8e2c1 ▶ request model=openrouter/free stream
[14:23:01.421] req_a3f8e2c1 → openrouter[k0] openrouter/free
[14:23:01.833] req_a3f8e2c1 ← openrouter[k0] 412ms RATE_LIMIT ✗ (retry-after 47s)
[14:23:01.835] req_a3f8e2c1 → groq[k0] openrouter/free
[14:23:02.153] req_a3f8e2c1 ← groq[k0] 318ms OK ✓ first-chunk
[14:23:02.154] req_a3f8e2c1 ■ complete via groq
Events are written to ~/.freeride/events.jsonl. Opt out with
FREERIDE_EVENTS=0 if you don't want them. File caps at 1 MiB with
single-backup rotation.
Commands
freeride serve start the gateway
freeride bind <agent> write gateway URL into agent config
freeride watch tail live failover events
freeride bench per-provider latency comparison (needs serve running)
freeride reload refresh provider registry from env vars (no restart)
freeride providers live provider health from a running gateway
freeride doctor diagnose common setup issues (env vars, PATH, port)
freeride upgrade bump installed package to latest PyPI release
freeride init interactive setup wizard — prompts for keys, writes ~/.freeride/.env
freeride keys show which provider keys are available vs cooling
freeride telemetry [on|off] manage telemetry
freeride list list available free models
freeride status show OpenClaw config + cache age (v2)
freeride auto auto-configure OpenClaw (v2)
freeride rotate swap primary if it fails (v2)
freeride-watcher background daemon that rotates on failure
freeride bench example output:
$ freeride bench
Benchmarking 5 providers, 3 requests each via http://localhost:11343/v1...
provider ok p50 p95 tok/s
─────────────────────────────────────────────────────
groq 3/3 142ms 287ms 98
cloudflare_wai 3/3 284ms 410ms 81
nvidia_nim 3/3 389ms 502ms 72
openrouter 3/3 412ms 721ms 63
huggingface 2/3 612ms 1840ms 41
Fastest: groq (142ms p50)
The v2 commands keep working for existing OpenClaw users.
Providers
| Provider | Status | Notes |
|---|---|---|
| OpenRouter | shipped | full surface — chat, streaming, tools, vision, structured outputs |
| NVIDIA NIM | shipped | curated free-model allowlist; NVIDIA_NIM_FREE_MODELS_OVERRIDE to expand |
| Groq | shipped | hardcoded allowlist (Llama 3.x, Gemma 2, Mixtral, DeepSeek-R1-distill); GROQ_FREE_MODELS_OVERRIDE to expand |
| Cloudflare Workers AI | shipped | curated allowlist of cheap-per-neuron chat models; needs CLOUDFLARE_ACCOUNT_ID |
| HuggingFace Inference | shipped | full HF router catalog; budget governs access ($0.10/mo Free, $2/mo PRO) |
| Cerebras | shipped | fastest Llama / Qwen inference; chat-only (no embeddings). CEREBRAS_FREE_MODELS_OVERRIDE to restrict catalog. |
| Ollama (local) | shipped | local-only; mix with remote providers in the same failover chain. Set OLLAMA_BASE_URL to opt in. |
Adding a sixth: implement freeride.core.provider.Provider (api_version=1) in freeride/providers/<name>.py, register it in the conformance suite, done. See CONTRIBUTING.md.
Agents
| Agent | freeride bind |
Hot reload |
|---|---|---|
| OpenClaw | yes | needs restart |
| Aider | yes (--scope home/cwd/git) |
needs restart |
| Continue | yes | yes |
| Hermes (NousResearch/hermes-agent) | yes | needs restart |
Or anything else: OPENAI_API_BASE=http://localhost:11343/v1 + OPENAI_API_KEY=any.
Claude Code skill
If you use Claude Code, install the FreeRide skill so Claude knows how to detect, wire, and troubleshoot the gateway:
/plugin install https://github.com/Shaivpidadi/FreeRideV3
After install, Claude auto-invokes the skill when you mention FreeRide,
have it running on localhost:11343, or ask about routing across
free-tier providers. See skills/README.md for
manual-install instructions.
Docs
docs/providers/SURVEY.md— Provider Protocol fit per provider (auth shape, free-tier semantics, error mapping)docs/providers/nvidia_nim.md— NVIDIA NIM specifics (free-model allowlist, 403=AUTH quirk)docs/agent-binders.md— per-agent bind reference (config locations, hot-reload behavior, edge cases)docs/hermes.md— Hermes identification + bind planCONTRIBUTING.md— adding a provider or binder
License
MIT.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file freeride_gateway-0.4.0a4.tar.gz.
File metadata
- Download URL: freeride_gateway-0.4.0a4.tar.gz
- Upload date:
- Size: 201.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
450f29a2b6917278283ae9fb796f3c3ffba19b66f04b13f5fc9f6c317e3029a7
|
|
| MD5 |
7da3aa76be73ad5b9aeb3d6893d0db2b
|
|
| BLAKE2b-256 |
91d8aa94b4f0b75f3eb16fc52f8a809fd75cd7c6a020c55675c5dc7581ab8142
|
Provenance
The following attestation bundles were made for freeride_gateway-0.4.0a4.tar.gz:
Publisher:
release.yml on Shaivpidadi/FreeRideV3
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
freeride_gateway-0.4.0a4.tar.gz -
Subject digest:
450f29a2b6917278283ae9fb796f3c3ffba19b66f04b13f5fc9f6c317e3029a7 - Sigstore transparency entry: 1494963302
- Sigstore integration time:
-
Permalink:
Shaivpidadi/FreeRideV3@68cc0d1e78b6f90f67ee61d16e89b1bc647687d5 -
Branch / Tag:
refs/tags/v0.4.0a4 - Owner: https://github.com/Shaivpidadi
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
release.yml@68cc0d1e78b6f90f67ee61d16e89b1bc647687d5 -
Trigger Event:
push
-
Statement type:
File details
Details for the file freeride_gateway-0.4.0a4-py3-none-any.whl.
File metadata
- Download URL: freeride_gateway-0.4.0a4-py3-none-any.whl
- Upload date:
- Size: 136.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
44ee227eff4ebc04691c359b243147cb1edb0c879f816eb4d9ce10797972187e
|
|
| MD5 |
68ed6614ab5fb8955615875e81c36792
|
|
| BLAKE2b-256 |
c20976a098607118c75fc2ddc7e222e2db17e525f57d6e6b76c19491c45f8613
|
Provenance
The following attestation bundles were made for freeride_gateway-0.4.0a4-py3-none-any.whl:
Publisher:
release.yml on Shaivpidadi/FreeRideV3
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
freeride_gateway-0.4.0a4-py3-none-any.whl -
Subject digest:
44ee227eff4ebc04691c359b243147cb1edb0c879f816eb4d9ce10797972187e - Sigstore transparency entry: 1494963396
- Sigstore integration time:
-
Permalink:
Shaivpidadi/FreeRideV3@68cc0d1e78b6f90f67ee61d16e89b1bc647687d5 -
Branch / Tag:
refs/tags/v0.4.0a4 - Owner: https://github.com/Shaivpidadi
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
release.yml@68cc0d1e78b6f90f67ee61d16e89b1bc647687d5 -
Trigger Event:
push
-
Statement type: