OS-inspired scheduler for concurrent LLM coding agents. Transparent API proxy with admission control, rate limit awareness, AIMD backpressure, token budgets, and priority scheduling.
HiveMind
When you spawn 10 agents, they shouldn't all stampede the API at once. HiveMind sits between the agents and the LLM provider as a transparent HTTP proxy, managing concurrency, rate limits, priority, and resource allocation — the way an OS kernel manages processes competing for CPU.
Quickstart
# Install
pip install hivemind-scheduler
# Start the proxy
hivemind proxy
# In another terminal, run your agents through it
ANTHROPIC_BASE_URL=http://127.0.0.1:8765 claude code
That's it. Your agents now go through HiveMind. Zero code changes.
The Problem
The original scenario: 11 parallel agents sharing one API key. Three died from ECONNRESET/502 errors, a classic case of connection exhaustion. The surviving 8 worked fine. If the agents had been staggered by 5 seconds, all 11 would have succeeded.
The problem isn't capacity — it's coordination.
How It Works
Agent → http://localhost:8765/v1/messages → HiveMind Proxy → https://api.anthropic.com

Inside the proxy, each request passes through:
- Admission control (condition variable)
- Rate limit tracking (provider-aware)
- AIMD backpressure + circuit breaker
- Token counting (budget enforcement)
- Transparent retry (429/502/ECONNRESET)
- SSE streaming pass-through
Agents don't know HiveMind exists. They make normal API calls. HiveMind sits in the middle.
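To make the admission-control step concrete, here is a minimal sketch of a concurrency gate built on a condition variable. It is an illustration only, not HiveMind's actual code: the class name `AdmissionGate` and its methods are hypothetical, and the real proxy may well use an async primitive rather than threads.

```python
import threading
from typing import Optional


class AdmissionGate:
    """Hypothetical gate: at most `max_in_flight` requests pass at once."""

    def __init__(self, max_in_flight: int) -> None:
        self._max = max_in_flight
        self._in_flight = 0
        self._cond = threading.Condition()

    def acquire(self, timeout: Optional[float] = None) -> bool:
        """Block until a slot is free; return False if the timeout expires."""
        with self._cond:
            admitted = self._cond.wait_for(
                lambda: self._in_flight < self._max, timeout=timeout
            )
            if admitted:
                self._in_flight += 1
            return admitted

    def release(self) -> None:
        """Free a slot and wake one waiting request."""
        with self._cond:
            self._in_flight -= 1
            self._cond.notify()

    @property
    def in_flight(self) -> int:
        with self._cond:
            return self._in_flight
```

A proxy handler would call `acquire()` before forwarding a request upstream and `release()` in a `finally` block once the response (or the end of the SSE stream) has been relayed.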
Results
Evaluated across 7 scenarios with 5–50 concurrent agents:
| Scenario | Without HiveMind | With HiveMind |
|---|---|---|
| 10 agents, 50 req/min | 100% failure | 0% failure |
| 11 agents, realistic errors | 73% failure | 0% failure |
| 20 agents, stress test | 100% failure | 10% failure |
| 50 agents, extreme | 100% failure | 0% failure |
Install
pip install hivemind-scheduler # Core
pip install hivemind-scheduler[all] # + tiktoken + redis
Or from source:
git clone https://github.com/jayluxferro/hivemind.git
cd hivemind
pip install -e ".[dev]"
Usage
Transparent Proxy (recommended)
# Start the proxy — auto-detects provider from URL
hivemind proxy --upstream https://api.anthropic.com
hivemind proxy --upstream https://api.openai.com/v1
hivemind proxy --upstream http://localhost:11434 # Ollama
# Point agents at it
export ANTHROPIC_BASE_URL=http://127.0.0.1:8765
export OPENAI_BASE_URL=http://127.0.0.1:8765/v1
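The reason no agent code changes are needed is that clients resolve their endpoint from these environment variables. The sketch below shows that resolution with the standard library only; the function names are hypothetical, and the fallback URLs are the public provider endpoints.

```python
import os
from typing import Mapping, Optional


def messages_endpoint(env: Optional[Mapping[str, str]] = None) -> str:
    """Anthropic-style: the base URL has no /v1, the client appends it."""
    env = os.environ if env is None else env
    base = env.get("ANTHROPIC_BASE_URL", "https://api.anthropic.com")
    return base.rstrip("/") + "/v1/messages"


def chat_endpoint(env: Optional[Mapping[str, str]] = None) -> str:
    """OpenAI-style: the base URL already includes /v1."""
    env = os.environ if env is None else env
    base = env.get("OPENAI_BASE_URL", "https://api.openai.com/v1")
    return base.rstrip("/") + "/chat/completions"
```

This also shows why the two exports differ: the Anthropic base URL omits `/v1`, while the OpenAI one includes it.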
MCP Server
hivemind serve
IDE Integration
Generate config for your IDE/tool:
hivemind setup claude-code
hivemind setup cursor
hivemind setup windsurf
hivemind setup codex
hivemind setup copilot
hivemind setup all # Show all configs
MCP Tools
| Tool | Description |
|---|---|
| hm.submit | Submit an agent task to the scheduler |
| hm.batch | Submit multiple tasks at once |
| hm.status | Check task/queue status |
| hm.priority | Adjust task priority (low/normal/high/critical) |
| hm.budget | Set/check token budgets (per-agent and global) |
| hm.metrics | Scheduler performance stats |
| hm.config | Tune scheduler parameters at runtime |
| hm.setup | Generate IDE/tool integration configs |
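As a rough illustration of how the four priority levels could order queued tasks, the sketch below uses a binary heap keyed on priority, then estimated size (shortest job first), then arrival order. The `TaskQueue` class, the `estimated_tokens` field, and the tie-breaking rule are assumptions for this example, not the tool's documented behavior; dependency tracking is left out.

```python
import heapq
import itertools
from typing import List, Tuple

# Lower rank is served first. Level names come from the hm.priority tool.
PRIORITY_RANK = {"critical": 0, "high": 1, "normal": 2, "low": 3}


class TaskQueue:
    """Hypothetical queue: priority first, then shortest job, then FIFO."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, int, str]] = []
        self._arrival = itertools.count()  # monotonically increasing tie-breaker

    def submit(self, task_id: str, priority: str = "normal",
               estimated_tokens: int = 0) -> None:
        entry = (PRIORITY_RANK[priority], estimated_tokens,
                 next(self._arrival), task_id)
        heapq.heappush(self._heap, entry)

    def pop(self) -> str:
        """Return the id of the next task to run."""
        return heapq.heappop(self._heap)[3]

    def __len__(self) -> int:
        return len(self._heap)
```

With this ordering a critical task always runs before a normal one regardless of size, and among tasks at the same level the smaller estimate goes first.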
Architecture
Five Scheduling Primitives
| # | Primitive | What it does | OS Analogy |
|---|---|---|---|
| 1 | Admission Control | Concurrency gate — max N requests in-flight | Process scheduler |
| 2 | Rate Limit Tracking | Parse x-ratelimit-* headers, pause proactively | I/O scheduling |
| 3 | AIMD Backpressure | Latency-based concurrency: low → increase, high → cut | TCP congestion control |
| 4 | Token Budgets | Per-agent + global ceilings, warn at 85%, checkpoint at 100% | OOM killer |
| 5 | Priority Queue + DAG | Shortest-job-first, dependency tracking, reprioritization | Nice levels + cgroups |
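The AIMD row borrows directly from TCP congestion control: add one slot while latency is healthy, multiply the limit down when it is not. A minimal sketch follows, assuming a fixed latency threshold and a halving factor; the class name, parameter names, and default values are all hypothetical, and the circuit breaker is not modeled.

```python
class AIMDController:
    """Hypothetical additive-increase / multiplicative-decrease limiter."""

    def __init__(self, initial: int = 5, min_limit: int = 1,
                 max_limit: int = 50, latency_threshold_s: float = 2.0,
                 decrease_factor: float = 0.5) -> None:
        self.limit = initial
        self.min_limit = min_limit
        self.max_limit = max_limit
        self.latency_threshold_s = latency_threshold_s
        self.decrease_factor = decrease_factor

    def observe(self, latency_s: float) -> int:
        """Update the concurrency limit from one response latency sample."""
        if latency_s > self.latency_threshold_s:
            # Multiplicative decrease: back off quickly under pressure.
            self.limit = max(self.min_limit,
                             int(self.limit * self.decrease_factor))
        else:
            # Additive increase: probe for spare capacity one slot at a time.
            self.limit = min(self.max_limit, self.limit + 1)
        return self.limit
```

The asymmetry is the point of the design: recovery is slow and cautious, while backing off is fast, so a burst of slow responses cuts concurrency within a few samples.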
Provider Support
Auto-detected from upstream URL:
| Provider | Rate Limit Headers | Default Concurrency | Streaming |
|---|---|---|---|
| Anthropic | Yes | 5 | Yes |
| OpenAI | Yes | 10 | Yes |
| Azure OpenAI | Yes | 10 | Yes |
| Google (Gemini) | - | 8 | Yes |
| Ollama (local) | - | 2 (GPU) | Yes |
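One way provider auto-detection could work is simple hostname matching on the upstream URL. The sketch below takes its default concurrency values from the table above; the matching rules, provider keys, and function name are assumptions made for illustration, not HiveMind's actual detection logic.

```python
from urllib.parse import urlparse

# Defaults copied from the provider table; the keys are hypothetical names.
DEFAULT_CONCURRENCY = {
    "anthropic": 5,
    "openai": 10,
    "azure_openai": 10,
    "google": 8,
    "ollama": 2,
}


def detect_provider(upstream: str) -> str:
    """Guess the provider from the upstream URL (assumed matching rules)."""
    parsed = urlparse(upstream)
    host = (parsed.hostname or "").lower()
    if host.endswith("anthropic.com"):
        return "anthropic"
    if host.endswith("openai.azure.com"):
        return "azure_openai"
    if host.endswith("openai.com"):
        return "openai"
    if host.endswith("googleapis.com"):
        return "google"
    if parsed.port == 11434:  # Ollama's default local port
        return "ollama"
    return "unknown"
```

An unrecognized upstream returns `"unknown"` here; what the real proxy does in that case is not stated in this document.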
Optional Features
pip install hivemind-scheduler[tokenizer] # tiktoken for accurate token counting
pip install hivemind-scheduler[distributed] # Redis for multi-machine coordination
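Token counts, whether estimated or measured with tiktoken, feed the token budgets listed among the scheduling primitives (per-agent and global ceilings, warn at 85%, checkpoint at 100%). The sketch below shows one way such a check could be expressed; the `TokenBudget` class and its return values are hypothetical, and only the two thresholds come from this document.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class TokenBudget:
    """Hypothetical tracker for per-agent and global token ceilings."""

    per_agent_limit: int
    global_limit: int
    warn_fraction: float = 0.85  # threshold taken from the primitives table
    used: Dict[str, int] = field(default_factory=dict)

    def record(self, agent_id: str, tokens: int) -> str:
        """Add usage and return 'ok', 'warn', or 'checkpoint'."""
        self.used[agent_id] = self.used.get(agent_id, 0) + tokens
        agent_ratio = self.used[agent_id] / self.per_agent_limit
        global_ratio = sum(self.used.values()) / self.global_limit
        # Whichever ceiling is closer to being hit decides the outcome.
        ratio = max(agent_ratio, global_ratio)
        if ratio >= 1.0:
            return "checkpoint"
        if ratio >= self.warn_fraction:
            return "warn"
        return "ok"
```

Note that an agent well under its own ceiling can still trigger a checkpoint when the global ceiling is reached.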
Evaluation
Run benchmarks against a mock API (no real API credits needed):
python -m evaluation.run_benchmark --quick # 5 agents, 30 seconds
python -m evaluation.run_benchmark --replay # 11-agent original scenario
python -m evaluation.run_benchmark --ablation # Test each primitive individually
python -m evaluation.run_benchmark # Full suite (all scenarios)
Testing
pip install -e ".[dev]"
python -m pytest tests/ -v
174 tests covering all scheduler primitives (admission control, backpressure with circuit breaker, rate limiting with provider profiles), proxy, streaming, providers, tokenizer, distributed backend, and MCP tools.
License
MIT — see LICENSE.