Skip to main content

OS-inspired scheduler for concurrent LLM coding agents. Transparent API proxy with admission control, rate limit awareness, AIMD backpressure, token budgets, and priority scheduling.

Project description

HiveMind

CI Python 3.11+ License: MIT

OS-inspired scheduler for concurrent LLM coding agents.

When you spawn 10 agents, they shouldn't all stampede the API at once. HiveMind sits between the agents and the LLM provider as a transparent HTTP proxy, managing concurrency, rate limits, priority, and resource allocation — the way an OS kernel manages processes competing for CPU.

Quickstart

# Install
pip install hivemind-scheduler

# Start the proxy
hivemind proxy

# In another terminal, run your agents through it
ANTHROPIC_BASE_URL=http://127.0.0.1:8765 claude code

That's it. Your agents now go through HiveMind. Zero code changes.

The Problem

11 parallel agents, one API key. 3 died from ECONNRESET/502 — classic connection exhaustion. The surviving 8 worked fine. If they'd been staggered by 5 seconds, all 11 would have succeeded.

The problem isn't capacity — it's coordination.

How It Works

Agent → http://localhost:8765/v1/messages → HiveMind Proxy → https://api.anthropic.com
                                                ↑
                                    Admission control (condition variable)
                                    Rate limit tracking (provider-aware)
                                    AIMD backpressure + circuit breaker
                                    Token counting (budget enforcement)
                                    Transparent retry (429/502/ECONNRESET)
                                    SSE streaming pass-through

Agents don't know HiveMind exists. They make normal API calls. HiveMind sits in the middle.

Results

Evaluated across 7 scenarios with 5–50 concurrent agents:

Scenario Without HiveMind With HiveMind
10 agents, 50 req/min 100% failure 0% failure
11 agents, realistic errors 73% failure 0% failure
20 agents, stress test 100% failure 10% failure
50 agents, extreme 100% failure 0% failure

Install

pip install hivemind-scheduler          # Core
pip install hivemind-scheduler[all]     # + tiktoken + redis

Or from source:

git clone https://github.com/jayluxferro/hivemind.git
cd hivemind
pip install -e ".[dev]"

Usage

Transparent Proxy (recommended)

# Start the proxy — auto-detects provider from URL
hivemind proxy --upstream https://api.anthropic.com
hivemind proxy --upstream https://api.openai.com/v1
hivemind proxy --upstream http://localhost:11434  # Ollama

# Point agents at it
export ANTHROPIC_BASE_URL=http://127.0.0.1:8765
export OPENAI_BASE_URL=http://127.0.0.1:8765/v1

MCP Server

hivemind serve

IDE Integration

Generate config for your IDE/tool:

hivemind setup claude-code
hivemind setup cursor
hivemind setup windsurf
hivemind setup codex
hivemind setup copilot
hivemind setup all         # Show all configs

MCP Tools

Tool Description
hm.submit Submit an agent task to the scheduler
hm.batch Submit multiple tasks at once
hm.status Check task/queue status
hm.priority Adjust task priority (low/normal/high/critical)
hm.budget Set/check token budgets (per-agent and global)
hm.metrics Scheduler performance stats
hm.config Tune scheduler parameters at runtime
hm.setup Generate IDE/tool integration configs

Architecture

Five Scheduling Primitives

# Primitive What it does OS Analogy
1 Admission Control Concurrency gate — max N requests in-flight Process scheduler
2 Rate Limit Tracking Parse x-ratelimit-* headers, pause proactively I/O scheduling
3 AIMD Backpressure Latency-based concurrency: low → increase, high → cut TCP congestion control
4 Token Budgets Per-agent + global ceilings, warn at 85%, checkpoint at 100% OOM killer
5 Priority Queue + DAG Shortest-job-first, dependency tracking, reprioritization Nice levels + cgroups

Provider Support

Auto-detected from upstream URL:

Provider Rate Limit Headers Default Concurrency Streaming
Anthropic Yes 5 Yes
OpenAI Yes 10 Yes
Azure OpenAI Yes 10 Yes
Google (Gemini) - 8 Yes
Ollama (local) - 2 (GPU) Yes

Optional Features

pip install hivemind-scheduler[tokenizer]     # tiktoken for accurate token counting
pip install hivemind-scheduler[distributed]   # Redis for multi-machine coordination

Evaluation

Run benchmarks against a mock API (no real API credits needed):

python -m evaluation.run_benchmark --quick     # 5 agents, 30 seconds
python -m evaluation.run_benchmark --replay    # 11-agent original scenario
python -m evaluation.run_benchmark --ablation  # Test each primitive individually
python -m evaluation.run_benchmark             # Full suite (all scenarios)

Testing

pip install -e ".[dev]"
python -m pytest tests/ -v

174 tests covering all scheduler primitives (admission control, backpressure with circuit breaker, rate limiting with provider profiles), proxy, streaming, providers, tokenizer, distributed backend, and MCP tools.

License

MIT — see LICENSE.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

hivemind_scheduler-0.1.0.tar.gz (782.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

hivemind_scheduler-0.1.0-py3-none-any.whl (57.7 kB view details)

Uploaded Python 3

File details

Details for the file hivemind_scheduler-0.1.0.tar.gz.

File metadata

  • Download URL: hivemind_scheduler-0.1.0.tar.gz
  • Upload date:
  • Size: 782.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.18

File hashes

Hashes for hivemind_scheduler-0.1.0.tar.gz
Algorithm Hash digest
SHA256 0b16268e762814656fb4a401603cf68aec3fef86c2d1874700cffd034669853f
MD5 308d51e225e45251481a98e96b764580
BLAKE2b-256 22a5f45db0a3c7d04723c33dbeb02c8a687f782567ce54fedfb9cc953d6dd4c1

See more details on using hashes here.

File details

Details for the file hivemind_scheduler-0.1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for hivemind_scheduler-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 ead773e91de7d7e35f17c324be11e7aedd875b37796ad1fb2943a52cca0af863
MD5 d9fd9bad350b1e8cc8961ce9d1c42908
BLAKE2b-256 0f52caaa335c66f34a599d197a1dbd9d2b80e683f0b681aacfd7f1a2a159d030

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page