gemini-flux

Smart Gemini API key manager with token-aware sliding window scheduling

# gemini-flux 🔥

> Give it your N Gemini API keys. It manages everything.

**Author:** Muhammad Ali — malikasana2810@gmail.com

---

## Why This Exists

If you've ever used the Gemini API on the free tier, you've hit this wall:

```
429 RESOURCE_EXHAUSTED — You exceeded your current quota.
```


The frustrating part? Google lets you create multiple projects, each with its own independent API key and quota. So the question becomes:

> *"Why am I managing these keys manually when a smart system could do it for me?"*

That's exactly what **gemini-flux** solves.

Most key rotation tools are dumb — they round-robin every X seconds regardless of what's actually happening. gemini-flux is different. It knows how many tokens your request contains, calculates the exact cooldown needed per key, and sends each request at the **earliest mathematically possible moment** — no unnecessary waiting, no wasted quota.

Built originally for a dubbing application that needed to send large translation requests (instructions + transcripts) continuously without hitting rate limits. Works for any use case.

---

## How It Works

### The Core Problem
Gemini free tier limit per project:
- **250,000 tokens per minute (TPM)**

If you send a 500,000 token request, that key needs **2 minutes** to recover:

```
cooldown = token_count / tokens_per_minute
cooldown = 500,000 / 250,000 = 2 minutes
```
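As a minimal sketch of the formula above (the function name and the seconds conversion are illustrative choices, not taken from the package source), the cooldown can be computed like this:

```python
def cooldown_seconds(token_count: int, tokens_per_minute: int = 250_000) -> float:
    """Seconds a key needs to recover after sending `token_count` tokens.

    Hypothetical helper: it applies cooldown = token_count / tokens_per_minute
    and converts the result from minutes to seconds.
    """
    if tokens_per_minute <= 0:
        raise ValueError("tokens_per_minute must be positive")
    return token_count / tokens_per_minute * 60.0


print(cooldown_seconds(500_000))  # 120.0 seconds, i.e. 2 minutes
```

Returning seconds rather than minutes keeps the value directly usable with `time.sleep`.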


### Token-Aware Sliding Window Scheduling

Instead of blindly rotating every 30 seconds, gemini-flux:

1. **Counts tokens** (FREE via Google's API) before every request
2. **Calculates exact cooldown** needed per key based on actual token usage
3. **Maintains a sliding window** per key tracking token usage over last 60 seconds
4. **Picks the key with enough capacity RIGHT NOW** — no unnecessary waiting
5. If no key is ready → waits **exactly** as long as needed for the soonest available key
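The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `scheduler.py`: the `KeyWindow` class and `pick_key` function are hypothetical names, the token count is assumed to be known already (step 1), and a request larger than the per-minute limit is assumed to be allowed once the window is empty.

```python
from __future__ import annotations

from collections import deque

WINDOW_SECONDS = 60.0  # sliding window length described above


class KeyWindow:
    """Token usage of one API key over the last 60 seconds (hypothetical helper)."""

    def __init__(self, tpm_limit: int = 250_000) -> None:
        self.tpm_limit = tpm_limit
        self.events = deque()  # (timestamp, tokens) pairs, oldest first

    def _prune(self, now: float) -> None:
        # Drop events that have slid out of the window.
        while self.events and now - self.events[0][0] >= WINDOW_SECONDS:
            self.events.popleft()

    def record(self, tokens: int, now: float) -> None:
        self.events.append((now, tokens))

    def wait_needed(self, tokens: int, now: float) -> float:
        """Seconds until a request of `tokens` fits under the limit (0.0 = now)."""
        self._prune(now)
        excess = sum(used for _, used in self.events) + tokens - self.tpm_limit
        if excess <= 0:
            return 0.0
        freed = 0
        wait = 0.0
        for sent_at, used in self.events:
            freed += used
            wait = sent_at + WINDOW_SECONDS - now
            if freed >= excess:
                break
        # A request larger than the limit itself falls through to the moment
        # the window is empty (an assumption of this sketch).
        return wait


def pick_key(windows: dict[int, KeyWindow], tokens: int, now: float) -> tuple[int, float]:
    """Return (key_id, wait_seconds) for the key that can send soonest."""
    return min(
        ((key_id, window.wait_needed(tokens, now)) for key_id, window in windows.items()),
        key=lambda pair: pair[1],
    )


pool = {1: KeyWindow(), 2: KeyWindow()}
pool[1].record(200_000, now=0.0)
print(pick_key(pool, 100_000, now=10.0))  # (2, 0.0): key 2 has capacity right now
```

Timestamps are passed in explicitly so the logic can be exercised without real waiting; a live scheduler would pass `time.monotonic()`.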

### Dynamic Interval Math

```
worst_case_interval = cooldown / n_keys

8 keys, 1M token request:
  cooldown = 1,000,000 / 250,000 = 4 minutes = 240 seconds
  interval = 240 / 8 = 30 seconds between requests

8 keys, 10k token request:
  cooldown = 10,000 / 250,000 = 0.04 minutes = 2.4 seconds
  interval = 2.4 / 8 = 0.3 seconds (nearly instant)
```


The system adapts automatically. Light requests → fast. Heavy requests → smart cooldown.
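The interval arithmetic can be checked with a few lines of Python. The function below is an illustrative sketch (its name and defaults are assumptions, not the package's API) that applies `worst_case_interval = cooldown / n_keys` with the cooldown expressed in seconds:

```python
def worst_case_interval(token_count: int, n_keys: int,
                        tokens_per_minute: int = 250_000) -> float:
    """Worst-case seconds between requests when every request has this size."""
    if n_keys <= 0:
        raise ValueError("n_keys must be positive")
    cooldown = token_count / tokens_per_minute * 60.0  # seconds per key
    return cooldown / n_keys


for tokens in (10_000, 1_000_000):
    print(f"{tokens:>9,} tokens, 8 keys -> {worst_case_interval(tokens, 8):.1f}s")
```

With the 250,000 TPM figure quoted above, this gives 0.3 s for the 10k request and 30.0 s for the 1M request.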

### Model Exhaustion Chain

When a model's daily quota is hit on a key, gemini-flux automatically moves to the next model — not because it failed, but because it's **exhausted for the day**:

```
gemini-2.5-pro                → smartest (100 RPD)
gemini-2.5-flash              → fast + smart (250 RPD) ← main workhorse
gemini-2.5-flash-lite         → lightweight (1000 RPD)
gemini-3.1-pro-preview        → newest pro generation
gemini-3-flash-preview        → newest flash generation
gemini-3.1-flash-lite-preview → newest lite generation
```
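One way to picture the chain is as an ordered list that is walked until a model with remaining daily quota turns up. The sketch below is an assumption about the shape of that logic, not the package's implementation; `next_available_model` is a hypothetical name, and the preview models are left out because no RPD figure is given for them here.

```python
from __future__ import annotations

# (model, requests per day) in chain order, using the figures listed above.
MODEL_CHAIN = [
    ("gemini-2.5-pro", 100),
    ("gemini-2.5-flash", 250),
    ("gemini-2.5-flash-lite", 1000),
]


def next_available_model(requests_today: dict[str, int]) -> str | None:
    """Return the first model in the chain whose daily quota is not used up.

    `requests_today` maps model name to requests already sent on one key today.
    Returns None when every model in the chain is exhausted for that key.
    """
    for model, rpd in MODEL_CHAIN:
        if requests_today.get(model, 0) < rpd:
            return model
    return None


print(next_available_model({"gemini-2.5-pro": 100}))  # gemini-2.5-flash
```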


### Smart Policy Fetcher

On startup, gemini-flux uses **1 request** to ask Gemini about its own free tier limits, then uses those numbers for all internal math. Result is cached for 7 days — so you only spend 1 request on setup per week, not every run. If Google changes their limits tomorrow → gemini-flux catches it automatically on next refresh.
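A 7-day cache like the one described can be sketched with a small JSON file. Everything named here is hypothetical (the `load_policy` function, the cache file layout, the `fetch` callback); the real logic lives in `policy.py` and may differ.

```python
import json
import time
from pathlib import Path
from typing import Callable, Optional

CACHE_TTL_SECONDS = 7 * 24 * 60 * 60  # 7 days, as described above


def load_policy(cache_path: Path, fetch: Callable[[], dict],
                now: Optional[float] = None) -> dict:
    """Return the cached policy if it is younger than 7 days, else re-fetch.

    `fetch` stands in for the single request that asks Gemini about its
    limits; it is only called when the cache is missing or stale.
    """
    now = time.time() if now is None else now
    if cache_path.exists():
        cached = json.loads(cache_path.read_text())
        if now - cached["fetched_at"] < CACHE_TTL_SECONDS:
            return cached["policy"]
    policy = fetch()
    cache_path.write_text(json.dumps({"fetched_at": now, "policy": policy}))
    return policy
```

Storing the fetch timestamp next to the policy is what makes a log line such as "cached policy (1.2 days old)" cheap to produce.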

---

## Total Free Capacity (8 keys)

| Model | RPD per key | x 8 keys | Daily total |
|-------|------------|----------|-------------|
| gemini-2.5-pro | 100 | x 8 | 800/day |
| gemini-2.5-flash | 250 | x 8 | 2,000/day |
| gemini-2.5-flash-lite | 1000 | x 8 | 8,000/day |
| Preview models | varies | x 8 | bonus! |
| **TOTAL** | | | **10,800+/day** |

All completely free. No credit card needed.
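The table's arithmetic is easy to reproduce for a different number of keys. This short sketch assumes the per-key RPD figures from the table and ignores the preview models, whose limits are listed as "varies":

```python
RPD_PER_KEY = {
    "gemini-2.5-pro": 100,
    "gemini-2.5-flash": 250,
    "gemini-2.5-flash-lite": 1000,
}


def daily_capacity(n_keys: int) -> dict:
    """Requests per day for each model across `n_keys` independent keys."""
    return {model: rpd * n_keys for model, rpd in RPD_PER_KEY.items()}


capacity = daily_capacity(8)
print(capacity)
print("total:", sum(capacity.values()))  # total: 10800
```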

---

## Quick Start

### Option 1: Direct Python (Kaggle, scripts, notebooks)

```bash
git clone https://github.com/malikasana/gemini-flux
cd gemini-flux
pip install -r requirements.txt
cp .env.example .env
```

Fill in your keys in `.env`, then:

```python
from core import GeminiFlux

flux = GeminiFlux(
    keys=["key1", "key2", "key3", "key4", "key5", "key6", "key7", "key8"],
    mode="both",
    log=True
)

response = flux.generate("Translate this transcript to Spanish...")
print(response["response"])
```

### Option 2: Docker Microservice

```bash
docker build -t gemini-flux .
docker run -p 8000:8000 --env-file .env gemini-flux
```

Then from any app:

```python
from client.client import GeminiFluxClient

client = GeminiFluxClient(base_url="http://localhost:8000")
response = client.generate("Translate this transcript to Spanish...")
print(response["response"])
```

### Option 3: Kaggle Notebook

```python
!git clone https://github.com/malikasana/gemini-flux
!pip install google-genai pytz python-dotenv

import sys
sys.path.insert(0, "/kaggle/working/gemini-flux")
from core import GeminiFlux

flux = GeminiFlux(keys=["key1", "key2", ...])
response = flux.generate("your prompt here")
```

## Setup: Getting Your API Keys

### Step 1: Create Google Cloud Projects

1. Go to console.cloud.google.com
2. Create up to 10 projects; each gets its own independent quota
3. Name them anything, e.g. `project-1`, `project-2`

**Pro tip:** Use multiple Google accounts to go beyond 10 projects. Each account allows up to 10 projects independently.

### Step 2: Get API Keys

For each project:

1. Go to **APIs & Services → Credentials**
2. Click **Create Credentials → API Key**
3. Copy the key

### Step 3: Add to .env

Copy `.env.example` to `.env` and fill in your keys:

```
GEMINI_KEY_1=AIza...your_key_1
GEMINI_KEY_2=AIza...your_key_2
GEMINI_KEY_3=AIza...your_key_3
GEMINI_KEY_4=AIza...your_key_4
GEMINI_KEY_5=AIza...your_key_5
GEMINI_KEY_6=AIza...your_key_6
GEMINI_KEY_7=AIza...your_key_7
GEMINI_KEY_8=AIza...your_key_8

GEMINI_MODE=both
GEMINI_LOG=true
```
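For scripts that read these variables themselves, a loader along the following lines would work. It is a sketch under assumptions: `load_config` is a hypothetical name, the keys are assumed to be numbered contiguously from 1, and the variables are assumed to be in the process environment already (for example via `--env-file` or `python-dotenv`).

```python
import os
from typing import Mapping, Optional


def load_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect GEMINI_KEY_1..N plus mode and log settings from the environment."""
    env = os.environ if env is None else env
    keys = []
    index = 1
    while True:
        value = env.get(f"GEMINI_KEY_{index}")
        if not value:
            break  # stop at the first missing or empty key
        keys.append(value)
        index += 1
    return {
        "keys": keys,
        "mode": env.get("GEMINI_MODE", "both"),
        "log": env.get("GEMINI_LOG", "true").strip().lower() == "true",
    }


config = load_config({"GEMINI_KEY_1": "AIza...1", "GEMINI_KEY_2": "AIza...2"})
print(len(config["keys"]), config["mode"], config["log"])  # 2 both True
```

The resulting `keys` list can be passed straight to `GeminiFlux(keys=...)`.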

## Usage

### Minimum input (just works):

```python
response = flux.generate("your prompt here")
```

### Full control:

```python
response = flux.generate(
    prompt="Translate this transcript to Urdu with natural dubbing tone...",
    images=["base64_image..."],
    files=["base64_pdf..."],
    mode="flash_only",
    preferred_key=3,
    max_tokens=2000,
    temperature=0.5,
    retry=True
)
```

### Response always includes:

```python
{
    "response": "Gemini's reply...",
    "key_used": 3,
    "model_used": "gemini-2.5-flash",
    "tokens_used": 45231,
    "wait_applied": 1.8,
    "retried": False
}
```
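Because the response is a plain dictionary with a fixed set of fields, callers can type it and log it in one line. The `FluxResponse` type and `summarize` helper below are illustrative names, not part of the package; treating `wait_applied` as seconds is an assumption.

```python
from typing import TypedDict


class FluxResponse(TypedDict):
    """Shape of the dictionary shown above (hypothetical type name)."""
    response: str
    key_used: int
    model_used: str
    tokens_used: int
    wait_applied: float
    retried: bool


def summarize(result: FluxResponse) -> str:
    """One-line summary of which key and model served a request."""
    retry_note = " (after retry)" if result["retried"] else ""
    return (
        f"key #{result['key_used']} via {result['model_used']}: "
        f"{result['tokens_used']:,} tokens, waited {result['wait_applied']:.1f}s{retry_note}"
    )


sample: FluxResponse = {
    "response": "Gemini's reply...",
    "key_used": 3,
    "model_used": "gemini-2.5-flash",
    "tokens_used": 45231,
    "wait_applied": 1.8,
    "retried": False,
}
print(summarize(sample))  # key #3 via gemini-2.5-flash: 45,231 tokens, waited 1.8s
```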

## Runtime Controls

```python
flux.set_mode("flash_only")    # change mode anytime
flux.disable_key(3)            # disable key #3
flux.enable_key(3)             # re-enable key #3
flux.refresh_policy()          # force re-fetch Gemini policy
flux.status()                  # full key pool status
```

### Mode options:

| Mode | Description |
|------|-------------|
| `both` | Full exhaustion chain, best to lite (default) |
| `pro_only` | Only Pro models |
| `flash_only` | Only Flash models |
| `flash_lite_only` | Only Flash-Lite models |

## HTTP API (Docker Mode)

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/generate` | POST | Send prompt, get response |
| `/status` | GET | Key pool status and usage |
| `/refresh-policy` | POST | Force policy re-fetch |
| `/config` | POST | Change mode, enable/disable keys |
| `/health` | GET | Health check |

### Example:

```bash
curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Translate to Spanish: Hello world"}'
```
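The same call can be made from Python with only the standard library. This is a sketch mirroring the curl command: the helper names are hypothetical, and it assumes the service replies with a JSON body (presumably the response dictionary described under Usage).

```python
import json
import urllib.request


def build_generate_request(prompt: str,
                           base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build the POST /generate request without sending it."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, base_url: str = "http://localhost:8000",
             timeout: float = 120.0) -> dict:
    """Send the request and decode the JSON reply (needs the service running)."""
    request = build_generate_request(prompt, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.loads(reply.read().decode("utf-8"))
```

With the Docker service up, `generate("Translate to Spanish: Hello world")` performs the same request as the curl example. The generous default timeout is a guess that allows for the scheduler's wait time on heavy requests.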

## Console Output

```
==================================================
  gemini-flux 🔥  Starting up with 8 keys
==================================================

[STARTUP] Checking 8 keys...
[KEY 1] ✅ Healthy
[KEY 2] ✅ Healthy
[KEY 3] ⚠️  Exhausted — will reset at midnight PT
[KEY 4] ❌ Invalid — removed from pool
[STARTUP] Pool ready: 6 healthy, 1 exhausted, 1 invalid

[MODELS] Exhaustion chain:
  1. gemini-2.5-pro
  2. gemini-2.5-flash
  3. gemini-2.5-flash-lite
  4. gemini-3.1-pro-preview
  5. gemini-3-flash-preview
  6. gemini-3.1-flash-lite-preview

[POLICY] Using cached policy (1.2 days old)
[STARTUP] Dynamic interval: 240s / 6 keys = 40.0s (worst case)
[STARTUP] ✅ gemini-flux ready! Mode: BOTH

[REQUEST] Incoming — 450,000 tokens detected
[SCHEDULER] Key #2 selected — sending via gemini-2.5-flash
[RESPONSE] ✅ Success via Key #2 (gemini-2.5-flash)
[KEY 2] gemini-2.5-flash: 1/250 requests used today
```

## Docker Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `GEMINI_KEY_1` ... `GEMINI_KEY_N` | required | Your API keys |
| `GEMINI_MODE` | `both` | Model mode |
| `GEMINI_LOG` | `true` | Console logging |

## Project Structure

```
gemini-flux/
├── core/
│   ├── __init__.py           # Public interface
│   ├── flux.py               # Main GeminiFlux class
│   ├── scheduler.py          # Token-aware sliding window brain
│   ├── key_pool.py           # Key validation and tracking
│   └── policy.py             # Smart policy fetcher
├── service/
│   └── main.py               # FastAPI microservice
├── client/
│   ├── __init__.py
│   └── client.py             # Lightweight HTTP client
├── .env.example              # Environment template
├── Dockerfile
├── requirements.txt
├── test.py
└── README.md
```

## Security

- Never commit your `.env` file; it is listed in `.gitignore` by default
- Use `.env.example` as a template; it contains no real keys
- Each key is validated on startup, and invalid keys are removed from the pool immediately

## License

MIT License: free to use, modify, and distribute.

## Author

Muhammad Ali (malikasana2810@gmail.com)

Built out of frustration with rate limits. Powered by math.

Version 1.0.0, released Apr 17, 2026.
