OS-inspired scheduler for concurrent LLM coding agents. Transparent API proxy with admission control, rate limit awareness, AIMD backpressure, token budgets, and priority scheduling.

HiveMind

OS-inspired scheduler for concurrent LLM coding agents.

When you spawn 10 agents, they shouldn't all stampede the API at once. HiveMind sits between the agents and the LLM provider as a transparent HTTP proxy, managing concurrency, rate limits, priority, and resource allocation — the way an OS kernel manages processes competing for CPU.

Quickstart

# Install
pip install hivemind-scheduler

# Start the proxy (auto-detects provider from URL)
hivemind proxy                                          # Anthropic (default)
hivemind proxy --upstream https://api.openai.com        # OpenAI

# In another terminal, point your agents at it
ANTHROPIC_BASE_URL=http://127.0.0.1:8765 claude        # Claude Code
OPENAI_BASE_URL=http://127.0.0.1:8765/v1 cursor        # Cursor / Copilot / Codex

That's it. Your agents now go through HiveMind. Zero code changes.

The Problem

In one run, 11 parallel agents shared a single API key. Three died from ECONNRESET/502 errors, a classic sign of connection exhaustion, while the other eight finished normally. Had the launches been staggered by 5 seconds, all 11 would likely have succeeded.

The problem isn't capacity — it's coordination.

How It Works

Agent → http://localhost:8765/v1/... → HiveMind Proxy → Anthropic / OpenAI / Ollama / Azure
                                            ↑
                                Admission control (condition variable)
                                Rate limit tracking (provider-aware)
                                AIMD backpressure + circuit breaker
                                Token counting (budget enforcement)
                                Provider-specific retry (429/502/529)
                                SSE streaming pass-through

Agents don't know HiveMind exists. They make normal API calls. HiveMind sits in the middle.
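The "Admission control (condition variable)" step in the diagram can be pictured with a small gate built on Python's `threading.Condition`. This is only a sketch of the general technique, not HiveMind's actual code; the class name `AdmissionGate` and its methods are hypothetical.

```python
import threading
from typing import Optional


class AdmissionGate:
    """Hypothetical gate: at most `limit` requests in flight, the rest wait."""

    def __init__(self, limit: int) -> None:
        self._limit = limit
        self._in_flight = 0
        self._cond = threading.Condition()

    def acquire(self, timeout: Optional[float] = None) -> bool:
        """Block until a slot is free; return False if the wait times out."""
        with self._cond:
            admitted = self._cond.wait_for(
                lambda: self._in_flight < self._limit, timeout=timeout
            )
            if admitted:
                self._in_flight += 1
            return admitted

    def release(self) -> None:
        """Free a slot and wake one waiting request."""
        with self._cond:
            self._in_flight -= 1
            self._cond.notify()

    def resize(self, new_limit: int) -> None:
        """Change the limit at runtime, e.g. when backpressure adjusts it."""
        with self._cond:
            self._limit = max(1, new_limit)
            self._cond.notify_all()


gate = AdmissionGate(limit=2)
first = gate.acquire()
second = gate.acquire()
third = gate.acquire(timeout=0.05)   # both slots taken, so this times out
gate.release()
fourth = gate.acquire(timeout=0.05)  # one slot was freed, so this is admitted
```

A condition variable, unlike a fixed-size semaphore, lets the limit change while requests are waiting, which is why `resize` wakes all waiters.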

Results

Evaluated across 7 scenarios with 5–50 concurrent agents. Four of them are summarized below:

Scenario	Without HiveMind	With HiveMind
10 agents, 50 req/min	100% failure	0% failure
11 agents, realistic errors	73% failure	0% failure
20 agents, stress test	100% failure	10% failure
50 agents, extreme	100% failure	0% failure

Install

pip install hivemind-scheduler            # Core
pip install "hivemind-scheduler[all]"     # + tiktoken + redis (quoted so shells like zsh don't glob the brackets)

Or from source:

git clone https://github.com/jayluxferro/hivemind.git
cd hivemind
pip install -e ".[dev]"

Usage

Transparent Proxy (recommended)

# Start the proxy with one upstream (the provider is auto-detected from the URL)
hivemind proxy --upstream https://api.anthropic.com
hivemind proxy --upstream https://api.openai.com
hivemind proxy --upstream http://localhost:11434  # Ollama

# Point agents at it
export ANTHROPIC_BASE_URL=http://127.0.0.1:8765
export OPENAI_BASE_URL=http://127.0.0.1:8765/v1

MCP Server

hivemind serve

IDE Integration

Generate config for your IDE/tool:

hivemind setup claude-code
hivemind setup cursor
hivemind setup windsurf
hivemind setup codex
hivemind setup copilot
hivemind setup all         # Show all configs

CLI Reference

`hivemind proxy`

Flag	Default	Description
`--host`	`127.0.0.1`	Bind address
`--port`	`8765`	Bind port
`--upstream`	`https://api.anthropic.com`	Upstream API URL (provider auto-detected)
`--max-concurrency`	`5`	Max concurrent in-flight requests
`--min-concurrency`	`1`	Floor for AIMD backpressure
`--db`	`hivemind.db`	SQLite database path
`--max-retries`	`3`	Max transparent retries on 429/502
`--retry-base-delay`	`1.0`	Base retry delay (seconds)
`--retry-max-delay`	`30.0`	Max retry delay (seconds)
`--latency-target`	auto	Latency target in ms for AIMD (auto-detected from provider)
`--aimd-increase`	auto	AIMD additive increase (auto-detected from provider)
`--aimd-decrease`	auto	AIMD multiplicative decrease (auto-detected from provider)
`--total-budget`	unlimited	Global token budget
`--agent-budget`	unlimited	Default per-agent token budget
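To make the AIMD flags above concrete, here is a minimal additive-increase/multiplicative-decrease controller. It is a sketch under assumptions: the field names mirror the flags, but the numeric values for the latency target, increase, and decrease are hypothetical (the table only says they are auto-detected per provider), and the circuit breaker is not modeled.

```python
from dataclasses import dataclass


@dataclass
class AimdController:
    """Hypothetical AIMD loop driven by observed request latency."""

    limit: float = 4.0                 # current concurrency limit
    min_limit: float = 1.0             # --min-concurrency
    max_limit: float = 5.0             # --max-concurrency
    latency_target_ms: float = 2000.0  # --latency-target (value assumed)
    increase: float = 1.0              # --aimd-increase (value assumed)
    decrease: float = 0.5              # --aimd-decrease (value assumed)

    def observe(self, latency_ms: float, ok: bool = True) -> int:
        """Update the limit after one response and return it as an integer."""
        if ok and latency_ms <= self.latency_target_ms:
            # Healthy response: add a fixed step, never above the ceiling.
            self.limit = min(self.max_limit, self.limit + self.increase)
        else:
            # Slow or failed response: cut multiplicatively, never below the floor.
            self.limit = max(self.min_limit, self.limit * self.decrease)
        return int(self.limit)


controller = AimdController()
trace = [controller.observe(ms) for ms in (900, 5000, 6000, 800)]
```

With these assumed values the limit climbs to 5 on the fast response, halves twice under the two slow ones, then starts recovering one step at a time.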

`hivemind serve`

Flag	Default	Description
`--upstream`	`https://api.anthropic.com`	Upstream API URL
`--max-concurrency`	`5`	Max concurrent requests
`--db`	`hivemind.db`	Database path
`--total-budget`	unlimited	Global token budget
`--agent-budget`	unlimited	Default per-agent token budget
`--max-retries`	`3`	Max transparent retries
`--min-concurrency`	`1`	Floor for AIMD backpressure
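The retry flags (`--max-retries` here, plus `--retry-base-delay` and `--retry-max-delay` on `hivemind proxy`) suggest a capped backoff schedule. The sketch below assumes exponential doubling with optional jitter, which the tables do not state; the function names are hypothetical and only the defaults come from the tables.

```python
import random

RETRYABLE_STATUS = {429, 502}  # statuses named in the --max-retries description


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Delay before retry `attempt` (0-based): base * 2**attempt, capped at `cap`."""
    return min(cap, base * (2 ** attempt))


def jittered(delay: float, rng: random.Random) -> float:
    """Assumed 'full jitter': pick a delay uniformly between 0 and `delay`."""
    return rng.uniform(0.0, delay)


def should_retry(status: int, attempt: int, max_retries: int = 3) -> bool:
    """Retry only retryable statuses, and only while attempts remain."""
    return status in RETRYABLE_STATUS and attempt < max_retries


schedule = [backoff_delay(n) for n in range(6)]
```

Jitter matters for the same reason the proxy exists: without it, agents that failed together would retry together.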

MCP Tools

Tool	Description
`hm.submit`	Submit an agent task to the scheduler
`hm.batch`	Submit multiple tasks at once
`hm.status`	Check task/queue status
`hm.priority`	Adjust task priority (low/normal/high/critical)
`hm.budget`	Set/check token budgets (per-agent and global)
`hm.metrics`	Scheduler performance stats
`hm.config`	Tune scheduler parameters at runtime
`hm.setup`	Generate IDE/tool integration configs
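As an illustration of what `hm.submit` and `hm.priority` imply on the scheduler side, the following sketch orders tasks with a heap. It assumes the four priority levels listed above and breaks ties by estimated size and then arrival order; the `TaskQueue` class and the `est_tokens` parameter are hypothetical, not part of the MCP tool schema.

```python
import heapq
import itertools
from typing import List, Tuple

# Lower rank is served first; level names come from the hm.priority row.
PRIORITY_RANK = {"critical": 0, "high": 1, "normal": 2, "low": 3}


class TaskQueue:
    """Hypothetical queue: priority first, then smallest job, then arrival."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, int, str]] = []
        self._seq = itertools.count()  # arrival counter for stable ordering

    def submit(self, task_id: str, priority: str = "normal", est_tokens: int = 0) -> None:
        entry = (PRIORITY_RANK[priority], est_tokens, next(self._seq), task_id)
        heapq.heappush(self._heap, entry)

    def pop(self) -> str:
        """Remove and return the id of the next task to run."""
        return heapq.heappop(self._heap)[3]

    def __len__(self) -> int:
        return len(self._heap)


queue = TaskQueue()
queue.submit("refactor", "normal", est_tokens=8000)
queue.submit("lint-fix", "normal", est_tokens=500)
queue.submit("docs", "low", est_tokens=100)
queue.submit("hotfix", "critical", est_tokens=4000)
order = [queue.pop() for _ in range(len(queue))]
```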

Architecture

Five Scheduling Primitives

#	Primitive	What it does	OS Analogy
1	Admission Control	Concurrency gate — max N requests in-flight	Process scheduler
2	Rate Limit Tracking	Parse `x-ratelimit-*` headers, pause proactively	I/O scheduling
3	AIMD Backpressure	Latency-based concurrency: low → increase, high → cut	TCP congestion control
4	Token Budgets	Per-agent + global ceilings, warn at 85%, checkpoint at 100%	OOM killer
5	Priority Queue + DAG	Shortest-job-first, dependency tracking, reprioritization	Nice levels + cgroups
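The token budget row (per-agent and global ceilings, warn at 85%, checkpoint at 100%) can be sketched as a small accounting class. This is an assumption-laden illustration: `TokenBudget`, its fields, and the returned state strings are hypothetical, and what HiveMind does at the "checkpoint" stage is not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TokenBudget:
    """Hypothetical tracker for per-agent and global token ceilings."""

    total_limit: Optional[int] = None  # None means unlimited (--total-budget)
    agent_limit: Optional[int] = None  # None means unlimited (--agent-budget)
    warn_fraction: float = 0.85        # warn threshold from the table
    used_by_agent: Dict[str, int] = field(default_factory=dict)

    @property
    def total_used(self) -> int:
        return sum(self.used_by_agent.values())

    def record(self, agent: str, tokens: int) -> str:
        """Add usage for `agent` and return "ok", "warn", or "exhausted"."""
        self.used_by_agent[agent] = self.used_by_agent.get(agent, 0) + tokens
        worst = 0.0  # highest fraction used across the limits that apply
        if self.agent_limit:
            worst = max(worst, self.used_by_agent[agent] / self.agent_limit)
        if self.total_limit:
            worst = max(worst, self.total_used / self.total_limit)
        if worst >= 1.0:
            return "exhausted"
        if worst >= self.warn_fraction:
            return "warn"
        return "ok"


budget = TokenBudget(total_limit=10_000, agent_limit=1_000)
states = [budget.record("agent-1", n) for n in (500, 400, 100)]
```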

Provider Support

Auto-detected from upstream URL:

Provider	Rate Limit Headers	Default Concurrency	Streaming
Anthropic	Yes	5	Yes
OpenAI	Yes	10	Yes
Azure OpenAI	Yes	10	Yes
Google (Gemini)	-	8	Yes
Ollama (local)	-	2 (GPU)	Yes
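One plausible way to auto-detect a provider from the upstream URL is to match on the hostname, falling back to the Ollama port used earlier in this README. The matching rules below are guesses for illustration, not HiveMind's actual logic; only the default concurrency values are taken from the table.

```python
from urllib.parse import urlparse

# Default concurrency per provider, copied from the table above.
DEFAULT_CONCURRENCY = {
    "anthropic": 5,
    "openai": 10,
    "azure": 10,
    "google": 8,
    "ollama": 2,
}


def detect_provider(upstream: str) -> str:
    """Guess the provider from an upstream URL (hostname rules are assumed)."""
    parsed = urlparse(upstream)
    host = (parsed.hostname or "").lower()
    if host.endswith("anthropic.com"):
        return "anthropic"
    if host.endswith("openai.azure.com"):
        return "azure"
    if host.endswith("openai.com"):
        return "openai"
    if host.endswith("googleapis.com"):
        return "google"
    if parsed.port == 11434:  # Ollama port shown in the Usage section
        return "ollama"
    return "unknown"


detected = {
    url: detect_provider(url)
    for url in (
        "https://api.anthropic.com",
        "https://api.openai.com",
        "http://localhost:11434",
    )
}
```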

Optional Features

pip install "hivemind-scheduler[tokenizer]"     # tiktoken for accurate token counting
pip install "hivemind-scheduler[distributed]"   # Redis for multi-machine coordination

Evaluation

Run benchmarks against a mock API (no real API credits needed):

python -m evaluation.run_benchmark --quick     # 5 agents, 30 seconds
python -m evaluation.run_benchmark --replay    # 11-agent original scenario
python -m evaluation.run_benchmark --ablation  # Test each primitive individually
python -m evaluation.run_benchmark             # Full suite (all scenarios)

Testing

pip install -e ".[dev]"
python -m pytest tests/ -v

The suite contains 182 tests covering all scheduler primitives (admission control, backpressure with circuit breaker, rate limiting with provider profiles), the proxy, streaming, multi-provider integration (Anthropic + OpenAI), the tokenizer, the distributed backend, and the MCP tools.

License

MIT — see LICENSE.

Release history

0.2.0 (this version): Apr 18, 2026
0.1.0: Apr 18, 2026

Download files

Source Distribution: hivemind_scheduler-0.2.0.tar.gz (786.6 kB)

Upload date: Apr 18, 2026
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.18

Hashes for hivemind_scheduler-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`9ff705dde8889c52a45a90d91a7aa7b16ea2e26bb390daeed0fe057f8004b8dd`
MD5	`4438a1681f4002745dcc23f1408a8b45`
BLAKE2b-256	`59d72ec4851745924615e46a64d14c1f8f0e42b7193335ec0b30b605662f5e8e`

Built Distribution: hivemind_scheduler-0.2.0-py3-none-any.whl (59.2 kB)

Upload date: Apr 18, 2026
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.18

Hashes for hivemind_scheduler-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`554de9d0d2f8fc8267611151a2aeab2bc4f583e3a6fbb4f72ea90ab167090259`
MD5	`3e5af308f74e80940c3fbcab5946e5f6`
BLAKE2b-256	`93e839e650a63dfc5fa57eaf38631b9059a111e8bbb7e789bd4cdf1cd65c3084`
