Smart Gemini API key manager with token-aware sliding window scheduling

# gemini-flux 🔥

> Give it your N Gemini API keys. It manages everything.

**Author:** Muhammad Ali, malikasana2810@gmail.com

---

## Why This Exists

If you've ever used the Gemini API on the free tier, you've hit this wall:

```
429 RESOURCE_EXHAUSTED - You exceeded your current quota.
```

The frustrating part? Google lets you create multiple projects, each with its own independent API key and quota. So the question becomes:

> *"Why am I managing these keys manually when a smart system could do it for me?"*

That's exactly what **gemini-flux** solves.

Most key rotation tools are dumb: they round-robin every X seconds regardless of what's actually happening. gemini-flux is different. It knows how many tokens your request contains, calculates the exact cooldown needed per key, and sends each request at the **earliest mathematically possible moment**, with no unnecessary waiting and no wasted quota.

Built originally for a dubbing application that needed to send large translation requests (instructions + transcripts) continuously without hitting rate limits. Works for any use case.

---

## How It Works

### The Core Problem
Gemini free tier limit per project:
- **250,000 tokens per minute (TPM)**

If you send a 500,000 token request, that key needs **2 minutes** to recover:

```
cooldown = token_count / tokens_per_minute
cooldown = 500,000 / 250,000 = 2 minutes
```

### Token-Aware Sliding Window Scheduling

Instead of blindly rotating every 30 seconds, gemini-flux:

1. **Counts tokens** (FREE via Google's API) before every request
2. **Calculates exact cooldown** needed per key based on actual token usage
3. **Maintains a sliding window** per key tracking token usage over the last 60 seconds
4. **Picks the key with enough capacity RIGHT NOW**, with no unnecessary waiting
5. If no key is ready → waits **exactly** as long as needed for the soonest available key
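
The steps above can be sketched as a small per-key sliding window. This is a minimal illustration under stated assumptions, not gemini-flux's actual scheduler: `KeyWindow` and `pick_key` are hypothetical names, token counts are passed in by the caller instead of being fetched from Google's API, and requests larger than the per-minute limit are not given the multi-minute cooldown described earlier.

```python
from collections import deque

TPM_LIMIT = 250_000      # free-tier tokens per minute quoted above
WINDOW_SECONDS = 60.0    # length of the sliding window


class KeyWindow:
    """Tokens one key has spent in the last 60 seconds (hypothetical helper)."""

    def __init__(self, limit=TPM_LIMIT):
        self.limit = limit
        self.events = deque()  # (timestamp, tokens) pairs, oldest first

    def _prune(self, now):
        # Drop usage that has slid out of the 60-second window.
        while self.events and now - self.events[0][0] >= WINDOW_SECONDS:
            self.events.popleft()

    def wait_needed(self, tokens, now):
        """Seconds until `tokens` more tokens fit under the limit (0.0 if they fit now)."""
        self._prune(now)
        used = sum(spent for _, spent in self.events)
        if used + tokens <= self.limit:
            return 0.0
        # Walk the oldest events until enough capacity would be released.
        for ts, spent in self.events:
            used -= spent
            if used + tokens <= self.limit:
                return ts + WINDOW_SECONDS - now
        # The request alone exceeds the limit: wait for an empty window.
        if self.events:
            return self.events[-1][0] + WINDOW_SECONDS - now
        return 0.0

    def record(self, tokens, now):
        """Register a request that was just sent with this key."""
        self.events.append((now, tokens))


def pick_key(windows, tokens, now):
    """Return (key_index, wait_seconds) for the key that can send soonest."""
    waits = [w.wait_needed(tokens, now) for w in windows]
    best = min(range(len(windows)), key=lambda i: waits[i])
    return best, waits[best]
```

A caller would run `pick_key(...)`, sleep for the returned wait if it is non-zero, send the request, and then call `record(...)` on the chosen window.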

### Dynamic Interval Math

```
worst_case_interval = cooldown / n_keys

8 keys, 1M token request:
  cooldown = 1,000,000 / 250,000 = 4 minutes = 240 seconds
  interval = 240 / 8 = 30 seconds between requests

8 keys, 10k token request:
  cooldown = 10,000 / 250,000 = 0.04 minutes = 2.4 seconds
  interval = 2.4 / 8 = 0.3 seconds (nearly instant!)
```

The system adapts automatically. Light requests → fast. Heavy requests → smart cooldown.
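
The interval arithmetic can be checked with a few lines of Python. The function names below are illustrative and not part of the gemini-flux API; the 250,000 TPM default is the free-tier figure quoted above.

```python
def cooldown_seconds(token_count, tokens_per_minute=250_000):
    """Seconds one key needs to recover after a request of `token_count` tokens."""
    return token_count * 60 / tokens_per_minute


def worst_case_interval(token_count, n_keys, tokens_per_minute=250_000):
    """Spacing between requests when every request costs `token_count` tokens."""
    return cooldown_seconds(token_count, tokens_per_minute) / n_keys


print(cooldown_seconds(1_000_000))        # 240.0
print(worst_case_interval(1_000_000, 8))  # 30.0
print(worst_case_interval(10_000, 8))     # 0.3
```

Dropping from 8 keys to 6 stretches the 1M-token interval to `240 / 6 = 40` seconds.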

### Model Exhaustion Chain

When a model's daily quota is hit on a key, gemini-flux automatically moves to the next model, not because it failed, but because it's **exhausted for the day**:

  1. gemini-2.5-pro → smartest (100 requests per day, RPD)
  2. gemini-2.5-flash → fast + smart (250 RPD) ← main workhorse
  3. gemini-2.5-flash-lite → lightweight (1000 RPD)
  4. gemini-3.1-pro-preview → newest pro generation
  5. gemini-3-flash-preview → newest flash generation
  6. gemini-3.1-flash-lite-preview → newest lite generation
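
One way to picture the chain is a lookup that returns the first model with daily quota left on a key. This is a sketch under assumptions: `next_model` and the `used_today` bookkeeping are hypothetical, and because this README does not state limits for the preview models, they are marked unknown and treated as available.

```python
# Model order and RPD limits copied from the chain above.
# None means the daily limit is not stated in this README.
CHAIN = [
    ("gemini-2.5-pro", 100),
    ("gemini-2.5-flash", 250),
    ("gemini-2.5-flash-lite", 1000),
    ("gemini-3.1-pro-preview", None),
    ("gemini-3-flash-preview", None),
    ("gemini-3.1-flash-lite-preview", None),
]


def next_model(used_today, chain=CHAIN):
    """Return the first model in the chain that still has daily quota on this key.

    `used_today` maps model name -> requests already sent today
    (hypothetical bookkeeping). Models with an unknown limit count as available.
    """
    for model, rpd in chain:
        if rpd is None or used_today.get(model, 0) < rpd:
            return model
    # Only reachable if every model in `chain` has a known, exhausted limit.
    return None


print(next_model({}))                       # gemini-2.5-pro
print(next_model({"gemini-2.5-pro": 100}))  # gemini-2.5-flash
```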

### Smart Policy Fetcher

On startup, gemini-flux uses **1 request** to ask Gemini about its own free tier limits, then uses those numbers for all internal math. The result is cached for 7 days, so you spend only 1 request on setup per week rather than one per run. If Google changes its limits tomorrow, gemini-flux picks up the change automatically on the next refresh.
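
The 7-day caching behaviour could look roughly like the sketch below. It is an assumption about the mechanism, not the code in `policy.py`: `load_policy`, the JSON cache layout, and the `fetch` callable are all hypothetical.

```python
import json
import time
from pathlib import Path

CACHE_TTL_SECONDS = 7 * 24 * 60 * 60  # 7 days, as described above


def load_policy(cache_path, fetch, now=None):
    """Return cached limits if they are younger than 7 days, else re-fetch.

    `fetch` is a hypothetical callable that spends one request and returns
    a JSON-serialisable dict of limits.
    """
    now = time.time() if now is None else now
    path = Path(cache_path)
    if path.exists():
        cached = json.loads(path.read_text())
        if now - cached["fetched_at"] < CACHE_TTL_SECONDS:
            return cached["limits"]  # cache hit: no request spent
    limits = fetch()  # cache missing or stale: spend one request
    path.write_text(json.dumps({"fetched_at": now, "limits": limits}))
    return limits
```

Forcing a refresh then amounts to deleting the cache file or calling `fetch` directly and rewriting it.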

---

## Total Free Capacity (8 keys)

| Model | RPD per key | x 8 keys | Daily total |
|-------|------------|----------|-------------|
| gemini-2.5-pro | 100 | x 8 | 800/day |
| gemini-2.5-flash | 250 | x 8 | 2,000/day |
| gemini-2.5-flash-lite | 1000 | x 8 | 8,000/day |
| Preview models | varies | x 8 | bonus! |
| **TOTAL** | | | **10,800+/day** |

All completely free. No credit card needed.
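
The daily totals scale linearly with the number of keys, so the table can be recomputed for any pool size. The helper below is illustrative only; it uses the three RPD figures from the table and leaves out the preview models, whose limits vary.

```python
# Requests per day (RPD) per key, taken from the table above.
RPD_PER_KEY = {
    "gemini-2.5-pro": 100,
    "gemini-2.5-flash": 250,
    "gemini-2.5-flash-lite": 1000,
}


def daily_capacity(n_keys, rpd_per_key=RPD_PER_KEY):
    """Return (per-model daily totals, grand total) for a pool of `n_keys` keys."""
    per_model = {model: rpd * n_keys for model, rpd in rpd_per_key.items()}
    return per_model, sum(per_model.values())


per_model, total = daily_capacity(8)
print(per_model["gemini-2.5-flash"])  # 2000
print(total)                          # 10800
```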

---

## Quick Start

### Option 1: Direct Python (Kaggle, scripts, notebooks)

```bash
git clone https://github.com/malikasana/gemini-flux
cd gemini-flux
pip install -r requirements.txt
cp .env.example .env
```

Fill in your keys in `.env`, then:

```python
from core import GeminiFlux

flux = GeminiFlux(
    keys=["key1", "key2", "key3", "key4", "key5", "key6", "key7", "key8"],
    mode="both",
    log=True
)

response = flux.generate("Translate this transcript to Spanish...")
print(response["response"])
```

### Option 2: Docker Microservice

```bash
docker build -t gemini-flux .
docker run -p 8000:8000 --env-file .env gemini-flux
```

Then from any app:

```python
from client.client import GeminiFluxClient

client = GeminiFluxClient(base_url="http://localhost:8000")
response = client.generate("Translate this transcript to Spanish...")
print(response["response"])
```

### Option 3: Kaggle Notebook

```python
!git clone https://github.com/malikasana/gemini-flux
!pip install google-genai pytz python-dotenv

import sys
sys.path.insert(0, "/kaggle/working/gemini-flux")
from core import GeminiFlux

flux = GeminiFlux(keys=["key1", "key2", ...])
response = flux.generate("your prompt here")
```

## Setup: Getting Your API Keys

### Step 1: Create Google Cloud Projects

1. Go to console.cloud.google.com
2. Create up to 10 projects; each gets an independent quota
3. Name them anything, e.g. project-1, project-2

Pro tip: Use multiple Google accounts to go beyond 10 projects. Each account allows up to 10 projects independently.

### Step 2: Get API Keys

For each project:

1. Go to APIs & Services → Credentials
2. Click Create Credentials → API Key
3. Copy the key

### Step 3: Add to .env

Copy `.env.example` to `.env` and fill in your keys:

```
GEMINI_KEY_1=AIza...your_key_1
GEMINI_KEY_2=AIza...your_key_2
GEMINI_KEY_3=AIza...your_key_3
GEMINI_KEY_4=AIza...your_key_4
GEMINI_KEY_5=AIza...your_key_5
GEMINI_KEY_6=AIza...your_key_6
GEMINI_KEY_7=AIza...your_key_7
GEMINI_KEY_8=AIza...your_key_8

GEMINI_MODE=both
GEMINI_LOG=true
```
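
If you build the key list yourself (for example in a notebook), the numbered variables can be collected as in the sketch below. `load_keys` is a hypothetical helper, not a documented gemini-flux function, and it assumes the variables are already in the process environment, for instance after `load_dotenv()` from python-dotenv.

```python
import os


def load_keys(env=None):
    """Collect GEMINI_KEY_1, GEMINI_KEY_2, ... in order, stopping at the first gap."""
    env = os.environ if env is None else env
    keys = []
    n = 1
    while env.get(f"GEMINI_KEY_{n}"):
        keys.append(env[f"GEMINI_KEY_{n}"])
        n += 1
    return keys


# Example with an explicit mapping instead of the real environment:
print(load_keys({"GEMINI_KEY_1": "key-a", "GEMINI_KEY_2": "key-b"}))  # ['key-a', 'key-b']
```

The resulting list can be passed as `keys=` to `GeminiFlux`, as in the Quick Start examples.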

## Usage

Minimum input (just works):

```python
response = flux.generate("your prompt here")
```

Full control:

```python
response = flux.generate(
    prompt="Translate this transcript to Urdu with natural dubbing tone...",
    images=["base64_image..."],
    files=["base64_pdf..."],
    mode="flash_only",
    preferred_key=3,
    max_tokens=2000,
    temperature=0.5,
    retry=True
)
```

Response always includes:

```python
{
    "response": "Gemini's reply...",
    "key_used": 3,
    "model_used": "gemini-2.5-flash",
    "tokens_used": 45231,
    "wait_applied": 1.8,
    "retried": False
}
```

## Runtime Controls

```python
flux.set_mode("flash_only")    # change mode anytime
flux.disable_key(3)            # disable key #3
flux.enable_key(3)             # re-enable key #3
flux.refresh_policy()          # force re-fetch Gemini policy
flux.status()                  # full key pool status
```

Mode options:

| Mode | Description |
|------|-------------|
| `both` | Full exhaustion chain, best to lite (default) |
| `pro_only` | Only Pro models |
| `flash_only` | Only Flash models |
| `flash_lite_only` | Only Flash-Lite models |

## HTTP API (Docker Mode)

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/generate` | POST | Send prompt, get response |
| `/status` | GET | Key pool status and usage |
| `/refresh-policy` | POST | Force policy re-fetch |
| `/config` | POST | Change mode, enable/disable keys |
| `/health` | GET | Health check |

Example:

```bash
curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Translate to Spanish: Hello world"}'
```
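
The same call can be made from Python with only the standard library. This is a sketch, not the bundled `GeminiFluxClient`: the helper names are hypothetical, and it assumes the service replies with JSON shaped like the response dictionary shown under Usage.

```python
import json
import urllib.request


def build_generate_request(prompt, base_url="http://localhost:8000"):
    """Build the POST /generate request that the curl example sends."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, base_url="http://localhost:8000", timeout=300):
    """Send the request and decode the reply (assumed to be JSON)."""
    request = build_generate_request(prompt, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the container running, `generate("Translate to Spanish: Hello world")` sends the same request as the curl command. The long default timeout is a guess, chosen because the scheduler may hold a request until a key is free.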

## Console Output

```
==================================================
  gemini-flux 🔥  Starting up with 8 keys
==================================================

[STARTUP] Checking 8 keys...
[KEY 1] ✅ Healthy
[KEY 2] ✅ Healthy
[KEY 3] ⚠️  Exhausted - will reset at midnight PT
[KEY 4] ❌ Invalid - removed from pool
[STARTUP] Pool ready: 6 healthy, 1 exhausted, 1 invalid

[MODELS] Exhaustion chain:
  1. gemini-2.5-pro
  2. gemini-2.5-flash
  3. gemini-2.5-flash-lite
  4. gemini-3.1-pro-preview
  5. gemini-3-flash-preview
  6. gemini-3.1-flash-lite-preview

[POLICY] Using cached policy (1.2 days old)
[STARTUP] Dynamic interval: 240s / 6 keys = 40.0s (worst case)
[STARTUP] ✅ gemini-flux ready! Mode: BOTH

[REQUEST] Incoming - 450,000 tokens detected
[SCHEDULER] Key #2 selected - sending via gemini-2.5-flash
[RESPONSE] ✅ Success via Key #2 (gemini-2.5-flash)
[KEY 2] gemini-2.5-flash: 1/250 requests used today
```

## Docker Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `GEMINI_KEY_1` ... `GEMINI_KEY_N` | required | Your API keys |
| `GEMINI_MODE` | `both` | Model mode |
| `GEMINI_LOG` | `true` | Console logging |

## Project Structure

```
gemini-flux/
├── core/
│   ├── __init__.py           # Public interface
│   ├── flux.py               # Main GeminiFlux class
│   ├── scheduler.py          # Token-aware sliding window brain
│   ├── key_pool.py           # Key validation and tracking
│   └── policy.py             # Smart policy fetcher
├── service/
│   └── main.py               # FastAPI microservice
├── client/
│   ├── __init__.py
│   └── client.py             # Lightweight HTTP client
├── .env.example              # Environment template
├── Dockerfile
├── requirements.txt
├── test.py
└── README.md
```

## Security

- Never commit your `.env` file; it's in `.gitignore` by default
- Use `.env.example` as a template; it contains no real keys
- Each key is validated on startup; invalid keys are removed immediately

## License

MIT License: free to use, modify, and distribute.


## Author

Muhammad Ali, malikasana2810@gmail.com


Built out of frustration with rate limits. Powered by math.
