# gemini-flux 🔥
> Give it your N Gemini API keys. It manages everything.
**Author:** Muhammad Ali (malikasana2810@gmail.com)
---
## Why This Exists
If you've ever used the Gemini API on the free tier, you've hit this wall:
```
429 RESOURCE_EXHAUSTED: You exceeded your current quota.
```
The frustrating part? Google lets you create multiple projects, each with its own independent API key and quota. So the question becomes:
> *"Why am I managing these keys manually when a smart system could do it for me?"*
That's exactly what **gemini-flux** solves.
Most key rotation tools are dumb: they round-robin every X seconds regardless of what's actually happening. gemini-flux is different. It knows how many tokens your request contains, calculates the exact cooldown needed per key, and sends each request at the **earliest mathematically possible moment**, with no unnecessary waiting and no wasted quota.
Built originally for a dubbing application that needed to send large translation requests (instructions + transcripts) continuously without hitting rate limits. Works for any use case.
---
## How It Works
### The Core Problem
Gemini free tier limit per project:
- **250,000 tokens per minute (TPM)**
If you send a 500,000-token request, that key needs **2 minutes** to recover:

```
cooldown = token_count / tokens_per_minute
         = 500,000 / 250,000
         = 2 minutes
```
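The cooldown rule is a one-liner. A minimal sketch in Python (the `TPM` constant matches the free-tier limit quoted above; function name is illustrative, not the library's API):

```python
TPM = 250_000  # free-tier tokens per minute, per key/project

def cooldown_minutes(token_count: int, tpm: int = TPM) -> float:
    """Minutes a key needs before it regains full capacity."""
    return token_count / tpm

# A 500,000-token request on a 250,000 TPM key:
print(cooldown_minutes(500_000))  # -> 2.0
```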
### Token-Aware Sliding Window Scheduling
Instead of blindly rotating every 30 seconds, gemini-flux:
1. **Counts tokens** (FREE via Google's API) before every request
2. **Calculates exact cooldown** needed per key based on actual token usage
3. **Maintains a sliding window** per key, tracking token usage over the last 60 seconds
4. **Picks the key with enough capacity RIGHT NOW**, so there is no unnecessary waiting
5. If no key is ready, waits **exactly** as long as needed for the soonest available key
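The steps above can be sketched as a toy scheduler. This is illustrative only, not the library's actual internals; names like `KeyWindow` and `pick_key` are invented for the sketch:

```python
import time
from collections import deque

TPM = 250_000   # tokens per minute per key
WINDOW = 60.0   # sliding window length in seconds

class KeyWindow:
    """Tracks token usage for one key over the last 60 seconds."""
    def __init__(self):
        self.events = deque()  # (timestamp, tokens)

    def _prune(self, now):
        # Drop usage that has slid out of the 60-second window.
        while self.events and now - self.events[0][0] >= WINDOW:
            self.events.popleft()

    def used(self, now):
        self._prune(now)
        return sum(tokens for _, tokens in self.events)

    def record(self, tokens, now=None):
        self.events.append((now if now is not None else time.time(), tokens))

def pick_key(windows, tokens_needed, now):
    """Return the index of a key with capacity right now, else None."""
    for i, w in enumerate(windows):
        if w.used(now) + tokens_needed <= TPM:
            return i
    return None  # caller would sleep until the soonest key frees up
```

When `pick_key` returns `None`, the real scheduler would compute exactly how long until the oldest window entry expires and sleep precisely that long.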
### Dynamic Interval Math
```
worst_case_interval = cooldown / n_keys
```

8 keys, 1,000,000-token request:

```
cooldown = 1,000,000 / 250,000 = 4 minutes = 240 seconds
interval = 240 / 8 = 30 seconds between requests
```

8 keys, 10,000-token request:

```
cooldown = 10,000 / 250,000 = 0.04 minutes = 2.4 seconds
interval = 2.4 / 8 = 0.3 seconds between requests (nearly instant!)
```
The system adapts automatically. Light requests โ fast. Heavy requests โ smart cooldown.
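Both worked examples follow from one function (a sketch; note the cooldown must be converted from minutes to seconds before dividing by the key count):

```python
def worst_case_interval(token_count: int, n_keys: int, tpm: int = 250_000) -> float:
    """Seconds between requests if every request costs `token_count` tokens."""
    cooldown_seconds = token_count / tpm * 60
    return cooldown_seconds / n_keys

print(worst_case_interval(1_000_000, 8))  # 240 s cooldown / 8 keys -> 30.0
print(worst_case_interval(10_000, 8))     # 2.4 s cooldown / 8 keys -> ~0.3
```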
### Model Exhaustion Chain
When a model's daily quota is hit on a key, gemini-flux automatically moves to the next model, not because the model failed, but because it's **exhausted for the day**:
- `gemini-2.5-pro`: smartest (100 RPD)
- `gemini-2.5-flash`: fast + smart (250 RPD), the main workhorse
- `gemini-2.5-flash-lite`: lightweight (1,000 RPD)
- `gemini-3.1-pro-preview`: newest pro generation
- `gemini-3-flash-preview`: newest flash generation
- `gemini-3.1-flash-lite-preview`: newest lite generation
### Smart Policy Fetcher
On startup, gemini-flux spends **1 request** asking Gemini about its own free-tier limits, then uses those numbers for all internal math. The result is cached for 7 days, so you spend one setup request per week, not one per run. If Google changes its limits tomorrow, gemini-flux picks that up automatically on the next refresh.
---
## Total Free Capacity (8 keys)
| Model | RPD per key | Daily total (× 8 keys) |
|-------|-------------|------------------------|
| gemini-2.5-pro | 100 | 800/day |
| gemini-2.5-flash | 250 | 2,000/day |
| gemini-2.5-flash-lite | 1,000 | 8,000/day |
| Preview models | varies | bonus |
| **Total** | | **10,800+/day** |
All completely free. No credit card needed.
---
## Quick Start
### Option 1: Direct Python (Kaggle, scripts, notebooks)

```bash
git clone https://github.com/malikasana/gemini-flux
cd gemini-flux
pip install -r requirements.txt
cp .env.example .env
```

Fill in your keys in `.env`, then:

```python
from core import GeminiFlux

flux = GeminiFlux(
    keys=["key1", "key2", "key3", "key4", "key5", "key6", "key7", "key8"],
    mode="both",
    log=True,
)

response = flux.generate("Translate this transcript to Spanish...")
print(response["response"])
```
### Option 2: Docker Microservice

```bash
docker build -t gemini-flux .
docker run -p 8000:8000 --env-file .env gemini-flux
```

Then from any app:

```python
from client.client import GeminiFluxClient

client = GeminiFluxClient(base_url="http://localhost:8000")
response = client.generate("Translate this transcript to Spanish...")
print(response["response"])
```
### Option 3: Kaggle Notebook

```python
!git clone https://github.com/malikasana/gemini-flux
!pip install google-genai pytz python-dotenv

import sys
sys.path.insert(0, "/kaggle/working/gemini-flux")

from core import GeminiFlux

flux = GeminiFlux(keys=["key1", "key2", ...])
response = flux.generate("your prompt here")
```
---
## Setup: Getting Your API Keys

### Step 1: Create Google Cloud Projects

1. Go to [console.cloud.google.com](https://console.cloud.google.com)
2. Create up to 10 projects; each gets independent quota
3. Name them anything, e.g. `project-1`, `project-2`

> **Pro tip:** Use multiple Google accounts to go beyond 10 projects. Each account allows up to 10 projects independently.
### Step 2: Get API Keys

For each project:

1. Go to **APIs & Services → Credentials**
2. Click **Create Credentials → API Key**
3. Copy the key
### Step 3: Add to .env

Copy `.env.example` to `.env` and fill in your keys:

```env
GEMINI_KEY_1=AIza...your_key_1
GEMINI_KEY_2=AIza...your_key_2
GEMINI_KEY_3=AIza...your_key_3
GEMINI_KEY_4=AIza...your_key_4
GEMINI_KEY_5=AIza...your_key_5
GEMINI_KEY_6=AIza...your_key_6
GEMINI_KEY_7=AIza...your_key_7
GEMINI_KEY_8=AIza...your_key_8
GEMINI_MODE=both
GEMINI_LOG=true
```
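Once `.env` is loaded (e.g. with `python-dotenv`), the numbered keys can be collected with a small helper like this (an illustrative sketch, not part of gemini-flux's API):

```python
import os

def load_keys(env=os.environ) -> list[str]:
    """Collect GEMINI_KEY_1, GEMINI_KEY_2, ... until the first gap."""
    keys, i = [], 1
    while env.get(f"GEMINI_KEY_{i}"):
        keys.append(env[f"GEMINI_KEY_{i}"])
        i += 1
    return keys
```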
---
## Usage

Minimum input (just works):

```python
response = flux.generate("your prompt here")
```

Full control:

```python
response = flux.generate(
    prompt="Translate this transcript to Urdu with natural dubbing tone...",
    images=["base64_image..."],
    files=["base64_pdf..."],
    mode="flash_only",
    preferred_key=3,
    max_tokens=2000,
    temperature=0.5,
    retry=True,
)
```
The response always includes:

```python
{
    "response": "Gemini's reply...",
    "key_used": 3,
    "model_used": "gemini-2.5-flash",
    "tokens_used": 45231,
    "wait_applied": 1.8,
    "retried": False,
}
```
### Runtime Controls

```python
flux.set_mode("flash_only")   # change mode anytime
flux.disable_key(3)           # disable key #3
flux.enable_key(3)            # re-enable key #3
flux.refresh_policy()         # force re-fetch of the Gemini policy
flux.status()                 # full key pool status
```

Mode options:

| Mode | Description |
|---|---|
| `both` | Full exhaustion chain, best to lite (default) |
| `pro_only` | Only Pro models |
| `flash_only` | Only Flash models |
| `flash_lite_only` | Only Flash-Lite models |
---
## HTTP API (Docker Mode)

| Endpoint | Method | Description |
|---|---|---|
| `/generate` | POST | Send prompt, get response |
| `/status` | GET | Key pool status and usage |
| `/refresh-policy` | POST | Force policy re-fetch |
| `/config` | POST | Change mode, enable/disable keys |
| `/health` | GET | Health check |

Example:

```bash
curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Translate to Spanish: Hello world"}'
```
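The same call from Python using only the standard library (a sketch; the `GeminiFluxClient` shown earlier wraps this same endpoint):

```python
import json
from urllib import request

def build_generate_request(prompt: str, base_url: str = "http://localhost:8000"):
    """Build the same POST request the curl example sends."""
    body = json.dumps({"prompt": prompt}).encode()
    return request.Request(
        f"{base_url}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )

def generate(prompt: str, base_url: str = "http://localhost:8000") -> dict:
    """Send the prompt to a running gemini-flux service and parse the reply."""
    with request.urlopen(build_generate_request(prompt, base_url)) as resp:
        return json.loads(resp.read())
```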
---
## Console Output

```
==================================================
  gemini-flux 🔥 Starting up with 8 keys
==================================================
[STARTUP] Checking 8 keys...
[KEY 1] ✅ Healthy
[KEY 2] ✅ Healthy
[KEY 3] ⚠️ Exhausted → will reset at midnight PT
[KEY 4] ❌ Invalid → removed from pool
[STARTUP] Pool ready: 6 healthy, 1 exhausted, 1 invalid
[MODELS] Exhaustion chain:
  1. gemini-2.5-pro
  2. gemini-2.5-flash
  3. gemini-2.5-flash-lite
  4. gemini-3.1-pro-preview
  5. gemini-3-flash-preview
  6. gemini-3.1-flash-lite-preview
[POLICY] Using cached policy (1.2 days old)
[STARTUP] Dynamic interval: 240s / 6 keys = 40.0s (worst case)
[STARTUP] ✅ gemini-flux ready! Mode: BOTH

[REQUEST] Incoming → 450,000 tokens detected
[SCHEDULER] Key #2 selected → sending via gemini-2.5-flash
[RESPONSE] ✅ Success via Key #2 (gemini-2.5-flash)
[KEY 2] gemini-2.5-flash: 1/250 requests used today
```
---
## Docker Environment Variables

| Variable | Default | Description |
|---|---|---|
| `GEMINI_KEY_1` ... `GEMINI_KEY_N` | required | Your API keys |
| `GEMINI_MODE` | `both` | Model mode |
| `GEMINI_LOG` | `true` | Console logging |
---
## Project Structure

```
gemini-flux/
├── core/
│   ├── __init__.py      # Public interface
│   ├── flux.py          # Main GeminiFlux class
│   ├── scheduler.py     # Token-aware sliding window brain
│   ├── key_pool.py      # Key validation and tracking
│   └── policy.py        # Smart policy fetcher
├── service/
│   └── main.py          # FastAPI microservice
├── client/
│   ├── __init__.py
│   └── client.py        # Lightweight HTTP client
├── .env.example         # Environment template
├── Dockerfile
├── requirements.txt
├── test.py
└── README.md
```
---
## Security

- Never commit your `.env` file; it's in `.gitignore` by default
- Use `.env.example` as a template; it contains no real keys
- Each key is validated on startup; invalid keys are removed immediately
---
## License

MIT License: free to use, modify, and distribute.

---
## Author

Muhammad Ali (malikasana2810@gmail.com)

*Built out of frustration with rate limits. Powered by math.*
---
## File Details (PyPI)

### gemini_flux-1.0.0.tar.gz (source distribution)

- Size: 16.6 kB
- Uploaded via: twine/6.2.0, CPython/3.14.3
- Trusted Publishing: No

| Algorithm | Hash digest |
|---|---|
| SHA256 | `ebf92dcfb6c7d049929dbf5e603a9610f32cacb9dbf2b432573146ee98b0a3e7` |
| MD5 | `706f77c145311be227b8fb3664f5256a` |
| BLAKE2b-256 | `ad32b1f4e5b9ff743a331308fdcf28ddf8b2e8ac2d18552206e744630bbe9973` |
### gemini_flux-1.0.0-py3-none-any.whl (built distribution)

- Size: 15.3 kB
- Tags: Python 3
- Uploaded via: twine/6.2.0, CPython/3.14.3
- Trusted Publishing: No

| Algorithm | Hash digest |
|---|---|
| SHA256 | `7f8e87436641cccfa7a95eb5438586a9a4808d49f8b822e2a8acd6e708378b48` |
| MD5 | `4dcae72e5e9e6ee23da0002f56fa828e` |
| BLAKE2b-256 | `b83b592c9eebc91e4e574999678a1d3e71c0765eae785397199002e7719ee59b` |