Ollama for free cloud inference. Local OpenAI-compatible gateway routing across OpenRouter, Groq, NVIDIA NIM, Cloudflare Workers AI, HuggingFace, Cerebras, and your own Ollama with automatic failover.
Project description
FreeRide
The OpenAI-compatible gateway for every free-tier provider.
One local endpoint that fans out across OpenRouter, Groq, NVIDIA NIM, HuggingFace, Cerebras, Cloudflare Workers AI, and your own Ollama. Hit a rate limit, fail over to the next provider. Your agent never knows.
Also wraps Claude Code, OpenAI Codex, and Google Gemini CLI — run all three without paying any of their vendors.
102M+ tokens served in 35 days. $0 spent. Routed through community free-tier keys via this gateway. Daily traffic: free-ride.xyz/models
curl -sSL https://api.free-ride.xyz/install.sh | sh
freeride run claude # or: freeride run codex / freeride run gemini
That's it. No accounts, no subscriptions, no FreeRide cloud. Local-first, BYO keys, your machine talks to providers directly.
Install
macOS / Linux:
curl -sSL https://api.free-ride.xyz/install.sh | sh
Windows (PowerShell):
powershell -ExecutionPolicy ByPass -c "irm https://api.free-ride.xyz/install.ps1 | iex"
The installer picks up uv, then pipx, then plain pip — whichever is on your system. Or install from source.
freeride init # interactive — collects keys, writes ~/.freeride/.env
freeride serve # gateway listens on localhost:11343
Get keys (any one is enough; more = better failover):
| Provider | Free tier | Get a key |
|---|---|---|
| OpenRouter | rotating free models | openrouter.ai/keys |
| Groq | daily token cap | console.groq.com/keys |
| NVIDIA NIM | credits per account | build.nvidia.com |
| HuggingFace | $0.10/mo Free, $2/mo PRO | huggingface.co/settings/tokens |
| Cerebras | RPM / TPM caps | cloud.cerebras.ai |
| Cloudflare Workers AI | 10K neurons/day | dash.cloudflare.com |
| Ollama (local) | no quota | install from ollama.com |
Run a coding agent — free
Three of the major coding CLIs ship a freeride run wrapper that works with no per-vendor key and no login. The gateway translates between each CLI's native wire protocol and our routing layer; you get the polished agent UX of each CLI, paid for entirely by free-tier providers.
Claude Code
freeride run claude
Inside the session, switch routing per request via /model:
| You type | What happens |
|---|---|
/model claude-opus-4-7 |
Your Pro/Max subscription answers (passthrough to api.anthropic.com) — only if claude login has run |
/model freeride/free |
Free providers answer; smart-router picks the model |
/model freeride/fast |
Free; prefers Groq (low TTFT) |
/model freeride/quality |
Free; prefers OpenRouter (widest catalog) |
/model freeride/coding |
Free; pinned to a code-tuned model that reliably emits tool_use blocks |
Full guide: docs/agents/claude-code.md.
OpenAI Codex
freeride run codex
Whatever model the CLI picks (gpt-5-codex, gpt-5, etc.) is routed to a free upstream provider. The gateway translates the Responses-API wire format (with full SSE event protocol — response.output_item.added → output_text.delta → output_item.done → response.completed) so the CLI parses everything natively.
Note: codex uses bubblewrap for shell-tool sandboxing; on systems without it, file/shell tool calls fail (the model still works). Full guide: docs/agents/codex.md.
Google Gemini CLI
freeride run gemini
Any gemini-* model name routes to a free upstream provider. Translator handles Google's {contents, tools, generationConfig} shape both directions. Full guide: docs/agents/gemini.md.
Any other agent / SDK
# Aider / Continue.dev / hermes / your-own-tool — anything that speaks OpenAI:
freeride bind aider
freeride bind continue
# or just point it at the gateway directly:
OPENAI_API_BASE=http://localhost:11343/v1
OPENAI_API_KEY=any-string-here
How failover works
Per-request the chain is (provider, key), sorted by recent health:
- Try the head pair.
RATE_LIMITorAUTHerror → mark the key as cooling, try the next key on the same provider.MODEL_NOT_FOUNDorQUOTA_EXHAUSTED→ skip to the next provider.- 5xx / TIMEOUT → next pair.
- First successful response — stamp
X-FreeRide-Provider+X-FreeRide-Request-Idheaders and ship.
If every pair fails, you get a structured 503 with a per-provider breakdown so debugging is one log line, not five round-trips. Mid-stream errors after the first chunk shipped are logged but don't break the client (we can't un-ship bytes).
Smart routing for model: "auto": the resolver scores every free model in the catalog by health × popularity (from the public models leaderboard) and picks the best one. Run freeride audit-models once after install to cache health probes locally so the first real request isn't a cold start.
Deeper: docs/architecture/failover.md.
Providers
| Provider | Surface | Notes |
|---|---|---|
| OpenRouter | chat, streaming, tools, vision, structured outputs, embeddings | full surface — the most-used provider in our routing |
| NVIDIA NIM | chat + embeddings | curated free-model allowlist; NVIDIA_NIM_FREE_MODELS_OVERRIDE to expand |
| Groq | chat | Llama 3.x, Gemma 2, Mixtral, DeepSeek-R1-distill; daily token cap |
| Cloudflare Workers AI | chat | cheap-per-neuron models; needs CLOUDFLARE_ACCOUNT_ID |
| HuggingFace Inference | chat + embeddings | full HF router catalog; budget governs access |
| Cerebras | chat | fastest Llama / Qwen inference; no embeddings |
| Ollama (local) | chat | local-only; can mix with remote in the same failover chain |
Adding a new provider: implement freeride.core.provider.Provider in freeride/providers/<name>.py, register it in the conformance suite. See CONTRIBUTING.md.
Multi-key rotation
Provide more than one key per provider with a numbered suffix:
OPENROUTER_API_KEY=sk-or-v1-aaa # primary
OPENROUTER_API_KEY_2=sk-or-v1-bbb
OPENROUTER_API_KEY_3=sk-or-v1-ccc
The router tries them in health order. A 429 on one key cools it for the next 60s and rotates to the sibling key — no provider switch needed. On startup freeride keys shows which keys are available vs cooling.
See what the gateway is doing
freeride doctor # static checks: keys, ports, /etc/hosts, common gotchas
freeride doctor --claude-code # the same + Claude-Code-specific probes
freeride audit-models # probe every free model on every key; cache the results
freeride bench # measure p50/p95/tok-s per provider
Tail live events:
tail -f ~/.freeride/events.jsonl
Each line is a JSON event: routing decisions, provider attempts, response statuses, mid-stream errors. Same schema the marketing site reads to render the live token counter and provider leaderboard.
Telemetry
A small beacon ships hourly with counts only: tokens served, request count, active providers, uptime hours, OS, version, and a per-install UUID. Never sent: prompts, completions, model IDs, API keys, hostname, IP.
freeride telemetry # audit what the next beacon would post
freeride telemetry off # opt out
The aggregate is what powers free-ride.xyz/models. Default on; explicit disclosure banner prints on first run.
Commands
freeride init interactive setup wizard — prompts for keys, writes ~/.freeride/.env
freeride serve start the gateway on :11343
freeride run <cli> wrap a CLI (claude / codex / gemini / anything) — points it at the gateway
freeride bind <agent> write the agent's config so it uses the gateway permanently
freeride doctor pre-flight checks: keys, ports, hosts file, common gotchas
freeride keys which provider keys are available vs cooling
freeride audit-models probe every free model; cache health locally
freeride bench measure p50/p95/tok-s per provider
freeride list list available free models
freeride telemetry manage the hourly aggregate beacon
Docs
- Agents
docs/agents/claude-code.md— Claude Code setup,/modelmodes, troubleshootingdocs/agents/codex.md— OpenAI Codex setup, bwrap notes, model selectiondocs/agents/gemini.md— Google Gemini CLI setup, auth flow, model selectiondocs/agents/binders.md— Aider, Continue, OpenClaw — per-agentfreeride bindreferencedocs/agents/hermes.md— NousResearch Hermes agent integration
- Providers
docs/providers/SURVEY.md— per-provider fit (auth, free-tier semantics, error mapping)docs/providers/nvidia_nim.md— NVIDIA NIM specifics
- Architecture
docs/architecture/failover.md— failover chain, cooldown, health trackingdocs/architecture/translators.md— how the Anthropic / Google / OpenAI-Responses translators work
- Other
CONTRIBUTING.md— adding a provider, a CLI wrapper, or a binderSECURITY.md— reporting vulnerabilities
License
MIT.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file freeride_gateway-0.4.0a20.tar.gz.
File metadata
- Download URL: freeride_gateway-0.4.0a20.tar.gz
- Upload date:
- Size: 361.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
9bc6bf00a4a0f51c1cd4e69c522f984420cbbca15dcaa7e55ec18c5be4a6207c
|
|
| MD5 |
37118de47a6167ed8e79dfea1f6edbaa
|
|
| BLAKE2b-256 |
908679a9963a0b9e911a9eecce774135b911b6bd8131ba5f745c5c8f3e740f42
|
Provenance
The following attestation bundles were made for freeride_gateway-0.4.0a20.tar.gz:
Publisher:
release.yml on Shaivpidadi/FreeRideV3
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
freeride_gateway-0.4.0a20.tar.gz -
Subject digest:
9bc6bf00a4a0f51c1cd4e69c522f984420cbbca15dcaa7e55ec18c5be4a6207c - Sigstore transparency entry: 1671664219
- Sigstore integration time:
-
Permalink:
Shaivpidadi/FreeRideV3@d94e40df50eb05c89cbec506c192601995411196 -
Branch / Tag:
refs/tags/v0.4.0a20 - Owner: https://github.com/Shaivpidadi
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
release.yml@d94e40df50eb05c89cbec506c192601995411196 -
Trigger Event:
push
-
Statement type:
File details
Details for the file freeride_gateway-0.4.0a20-py3-none-any.whl.
File metadata
- Download URL: freeride_gateway-0.4.0a20-py3-none-any.whl
- Upload date:
- Size: 206.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
69f5fbcaf0536908be3ed378617aa1f127a0652b51e9f05b8fe08d14536d8a86
|
|
| MD5 |
64cc31496a337d8cda3c867fba7d3454
|
|
| BLAKE2b-256 |
83bf9921c32c73d481239e594a26f6118d0ca2ce5c2344628467a08d5730c02a
|
Provenance
The following attestation bundles were made for freeride_gateway-0.4.0a20-py3-none-any.whl:
Publisher:
release.yml on Shaivpidadi/FreeRideV3
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
freeride_gateway-0.4.0a20-py3-none-any.whl -
Subject digest:
69f5fbcaf0536908be3ed378617aa1f127a0652b51e9f05b8fe08d14536d8a86 - Sigstore transparency entry: 1671664236
- Sigstore integration time:
-
Permalink:
Shaivpidadi/FreeRideV3@d94e40df50eb05c89cbec506c192601995411196 -
Branch / Tag:
refs/tags/v0.4.0a20 - Owner: https://github.com/Shaivpidadi
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
release.yml@d94e40df50eb05c89cbec506c192601995411196 -
Trigger Event:
push
-
Statement type: