Turnstone
Multi-node AI orchestration platform. Deploy tool-using AI agents across a cluster of servers, driven by message queues or interactive interfaces.
Named after the Ruddy Turnstone — a bird that flips rocks to expose what's hiding underneath.
What it does
Turnstone gives LLMs tools — shell, files, search, web, planning — and orchestrates multi-turn conversations where the model investigates, acts, and reports. Native deferred tool loading for Anthropic and OpenAI APIs reduces token overhead and improves tool selection accuracy when MCP servers expose many tools; local models (vLLM, llama.cpp) get a transparent client-side BM25 fallback. It runs as:
- Interactive sessions — terminal CLI or browser UI with parallel workstreams
- Queue-driven agents — trigger workstreams via message queue, stream progress, approve or auto-approve tool use
- Multi-node clusters — generic work load-balances across nodes, directed work routes to a specific server
- Cluster dashboard — real-time view of all nodes and workstreams, workstream creation with node targeting, reverse proxy for server UIs (only the console port needs network access)
- Cluster simulator — test the stack at scale (up to 1000 nodes) without an LLM backend
Quickstart
Interactive (terminal)
```bash
pip install turnstone
turnstone --base-url http://localhost:8000/v1
```
Interactive (browser)
```bash
turnstone-server --port 8080 --base-url http://localhost:8000/v1
```
Queue-driven (programmatic)
```bash
pip install turnstone[mq]
turnstone-bridge --server-url http://localhost:8080 --redis-host localhost
```
```python
from turnstone.mq import TurnstoneClient

with TurnstoneClient() as client:
    # Generic — any available node picks it up
    result = client.send_and_wait("Analyze the error logs", auto_approve=True)
    print(result.content)

    # Directed — must run on a specific server
    result = client.send_and_wait(
        "Check disk I/O on this server",
        target_node="server-12",
        auto_approve=True,
    )
```
Cluster dashboard
```bash
pip install turnstone[console]
turnstone-console --redis-host localhost --port 8090
```
Then open http://localhost:8090 for the cluster-wide dashboard. Create workstreams from the console and interact with any node's server UI through the built-in reverse proxy — no direct server port access required.
Docker
```bash
cp .env.example .env    # edit LLM_BASE_URL, OPENAI_API_KEY, etc.
docker compose up       # starts redis + server + bridge + console (SQLite)
```
For production with PostgreSQL:
```bash
# Requires POSTGRES_PASSWORD and DB_BACKEND=postgresql in .env (or exported)
docker compose --profile production up   # adds PostgreSQL, uses it as database
```
Console dashboard at http://localhost:8090. See docs/docker.md for configuration, scaling, and profiles.
Simulator
Test the multi-node stack at scale without an LLM backend:
```bash
docker compose --profile sim up redis console sim
```
Or standalone:
```bash
pip install turnstone[sim]
turnstone-sim --nodes 100 --scenario steady --duration 60 --mps 10
```
See docs/simulator.md for scenarios, CLI reference, and metrics.
All frontends connect to any OpenAI-compatible API (vLLM, NVIDIA NIM/NGC, llama.cpp, OpenAI, etc.) or Anthropic's native Messages API, and auto-detect the model.
Architecture
Diagrams
Detailed UML diagrams are available in docs/diagrams/:
| Diagram | Description |
|---|---|
| System Context | Top-level components and external dependencies |
| Package Structure | Python modules and dependency graph |
| Core Engine Classes | SessionUI protocol, ChatSession, LLMProvider, WorkstreamManager |
| Conversation Turn | Full message lifecycle through the engine (provider-agnostic) |
| Tool Pipeline | Three-phase prepare/approve/execute |
| MQ Protocol | 9 inbound + 19 outbound message types |
| Message Routing | Multi-node routing scenarios |
| Redis Key Schema | All Redis keys, types, and TTLs |
| Workstream States | State machine transitions |
| Simulator | SimCluster, dispatchers, scenarios |
| Console Data Flow | Dashboard data collection threads |
| Deployment | Docker Compose service topology |
| SDK Architecture | Python + TypeScript client libraries |
| Storage Architecture | Pluggable database backends (SQLite + PostgreSQL) |
Multi-node routing
Each Turnstone server runs a bridge process. Bridges share a Redis instance for coordination:
| Redis Key | Purpose |
|---|---|
| `turnstone:inbound` | Shared work queue — generic tasks, any node |
| `turnstone:inbound:{node_id}` | Per-node queue — directed tasks |
| `turnstone:ws:{ws_id}` | Workstream ownership — auto-routes follow-ups |
| `turnstone:node:{node_id}` | Node heartbeat + metadata for discovery |
| `turnstone:events:{ws_id}` | Per-workstream event pub/sub |
| `turnstone:events:global` | Global event pub/sub |
| `turnstone:events:cluster` | Cluster-wide state changes (for turnstone-console) |
Routing rules:
- Message has `target_node` → routes to that node's queue
- Message has `ws_id` → looks up the owner, routes to the owning node
- Neither → shared queue, next available bridge picks it up

Bridges BLPOP from their per-node queue (priority) then the shared queue. Directed work always takes precedence.
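That priority falls out of Redis's multi-key BLPOP semantics. Below is a minimal sketch of a bridge's receive loop, assuming the redis-py client and the key names from the table above; `bridge_receive_loop` and `handle_message` are illustrative names, not Turnstone's actual bridge code.

```python
import redis

def handle_message(payload: str) -> None:
    ...  # hypothetical: forward the task into the local Turnstone server

def bridge_receive_loop(node_id: str) -> None:
    r = redis.Redis(host="localhost", port=6379, decode_responses=True)
    directed = f"turnstone:inbound:{node_id}"  # per-node queue (directed work)
    shared = "turnstone:inbound"               # shared queue (generic work)
    while True:
        # BLPOP scans its keys left to right, so the per-node queue is
        # always drained before the shared one: directed work wins.
        item = r.blpop([directed, shared], timeout=5)
        if item is None:
            continue  # timed out; loop again (heartbeat refresh, shutdown checks)
        _queue, payload = item
        handle_message(payload)
```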
Tools
15 tools in total (13 built-in and 2 agent tools), plus external tools via MCP:
| Tool | Description | Auto-approved |
|---|---|---|
| `bash` | Execute shell commands | |
| `read_file` | Read file contents | yes |
| `write_file` | Write/create files | |
| `edit_file` | Fuzzy-match file editing | |
| `search` | Search files by name/content | yes |
| `math` | Sandboxed Python evaluation | |
| `man` | Read man pages | yes |
| `web_fetch` | Fetch URL content | |
| `web_search` | Web search (provider-native or Tavily) | |
| `remember` | Save persistent facts | yes |
| `recall` | Search memories and history | yes |
| `forget` | Remove a memory | yes |
| `notify` | Send notifications to linked channels | yes |
| `task` | Spawn autonomous sub-agent | |
| `plan` | Explore codebase, write .plan.md | |
| `mcp__*` | External tools from MCP servers | |
When the total tool count exceeds a configurable threshold (default 20), MCP tools are automatically deferred using native `defer_loading` on the Anthropic and OpenAI APIs, or a transparent client-side BM25 search for local models. The LLM discovers deferred tools on demand via a `tool_search` capability — no configuration needed beyond `--tool-search auto` (the default).
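To make the fallback concrete, here is a minimal sketch of client-side BM25 tool search using the rank_bm25 package; the function names and data shapes are assumptions for illustration, not Turnstone's internals.

```python
from rank_bm25 import BM25Okapi

def build_index(deferred_tools: list[dict]) -> BM25Okapi:
    # Index each deferred tool by its name and description tokens.
    corpus = [
        f"{t['name']} {t.get('description', '')}".lower().split()
        for t in deferred_tools
    ]
    return BM25Okapi(corpus)

def tool_search(index: BM25Okapi, deferred_tools: list[dict],
                query: str, k: int = 5) -> list[dict]:
    # Rank deferred tools against the model's search query and return the
    # top-k hits, which would be spliced into the tool list for the next turn.
    scores = index.get_scores(query.lower().split())
    ranked = sorted(zip(scores, deferred_tools), key=lambda p: p[0], reverse=True)
    return [tool for score, tool in ranked[:k] if score > 0]
```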
MCP Tool Servers
Turnstone supports the Model Context Protocol (MCP) for connecting external tool servers. MCP tools are discovered at startup, converted to OpenAI function-calling format, and merged with built-in tools. Each MCP tool is prefixed with `mcp__{server}__{tool}` to avoid name collisions. Tool lists stay fresh via push notifications (`tools.listChanged`), periodic polling for servers without push, and manual `/mcp refresh`.
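As a sketch, the discovery-time conversion might look like the following; the helper name is hypothetical, though the `mcp__` prefix rule is as described above and `inputSchema` is the standard MCP tool schema field.

```python
def to_openai_tool(server_name: str, mcp_tool: dict) -> dict:
    # Wrap a discovered MCP tool in OpenAI function-calling format, using the
    # collision-proof mcp__{server}__{tool} naming rule described above.
    return {
        "type": "function",
        "function": {
            "name": f"mcp__{server_name}__{mcp_tool['name']}",
            "description": mcp_tool.get("description", ""),
            "parameters": mcp_tool.get("inputSchema", {"type": "object"}),
        },
    }
```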
Configure via `config.toml` or `--mcp-config`:

```toml
[mcp.servers.github]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]

[mcp.servers.github.env]
GITHUB_TOKEN = "ghp_..."
```
Or use a standard MCP JSON config file:
```bash
turnstone --mcp-config ~/.config/turnstone/mcp.json
turnstone-server --mcp-config ~/.config/turnstone/mcp.json
```
Use `/mcp` in the REPL to list connected tools and `/mcp refresh` to re-fetch tool lists from servers. MCP tools require user approval by default (overridden by `--skip-permissions` or UI auto-approve).
Multi-Model and Multi-Provider Support
Turnstone supports multiple model backends per server instance, including different LLM providers. `ChatSession` delegates all API communication to pluggable `LLMProvider` adapters — the internal message format stays OpenAI-like, and each provider translates at the API boundary. Define named models in `config.toml` and select per-workstream, or switch mid-session with `/model <alias>`.

```toml
[models.local]
base_url = "http://localhost:8000/v1"
model = "qwen3-32b"
# provider defaults to "openai" (works with vLLM, llama.cpp, etc.)

[models.claude]
provider = "anthropic"
api_key = "sk-ant-..."
model = "claude-opus-4-6"
context_window = 200000

[models.openai]
base_url = "https://api.openai.com/v1"
api_key = "sk-..."
model = "gpt-5"
context_window = 400000

[model]
default = "local"                 # which model to use by default
fallback = ["claude", "openai"]   # try these if the primary is unreachable
agent_model = "claude"            # optional: separate model for plan/task sub-agents
```

Supported providers: `"openai"` (the default; covers OpenAI, vLLM, llama.cpp, and any OpenAI-compatible API) and `"anthropic"` (the Anthropic Messages API; requires `pip install turnstone[anthropic]`).

Use `/model` to show available models and `/model claude` to switch. Workstreams created via the API accept an optional `model` parameter.
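The adapter seam can be sketched as a small protocol; the method name and signature below are illustrative assumptions, not Turnstone's actual `LLMProvider` interface.

```python
from typing import Iterator, Protocol

class LLMProvider(Protocol):
    def stream_chat(
        self,
        messages: list[dict],   # OpenAI-style role/content/tool_call dicts
        tools: list[dict],      # OpenAI function-calling schemas
        **params,               # temperature, reasoning_effort, ...
    ) -> Iterator[dict]:
        """Translate to the backend API and yield OpenAI-like stream deltas."""
        ...

# An "anthropic" adapter would map roles and tool calls to the Messages API
# at this boundary and translate the response stream back, so ChatSession
# never sees provider-specific message shapes.
```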
Configuration
All entry points read `~/.config/turnstone/config.toml`. CLI flags override config values.

```toml
[api]
base_url = "http://localhost:8000/v1"
api_key = ""
tavily_key = ""          # only needed for local/vLLM models without native search

[model]
name = ""                # empty = auto-detect
temperature = 0.5
reasoning_effort = "medium"
default = "default"      # model alias for new workstreams
fallback = []            # ordered list of fallback model aliases
agent_model = ""         # model alias for plan/task sub-agents

[tools]
timeout = 30
skip_permissions = false
search = "auto"          # "auto" (enable when >threshold tools), "on", "off"
search_threshold = 20    # min tools before tool search activates
search_max_results = 5   # max tools returned per search query

[server]
host = "0.0.0.0"
port = 8080
max_workstreams = 10     # auto-evicts oldest idle when full

[redis]
host = "localhost"
port = 6379
password = ""

[bridge]
server_url = "http://localhost:8080"
node_id = ""             # empty = hostname_xxxx

[console]
host = "0.0.0.0"
port = 8090
url = "http://localhost:8090"   # used by CLI /cluster commands
poll_interval = 10

[health]
backend_probe_interval = 30
backend_probe_timeout = 5
circuit_breaker_threshold = 5
circuit_breaker_cooldown = 60

[ratelimit]
enabled = true
requests_per_second = 10.0
burst = 20

[database]
backend = "sqlite"       # "sqlite" (default) or "postgresql"
path = ".turnstone.db"   # SQLite file path (relative to working directory)
# url = "postgresql+psycopg://user:pass@host:5432/turnstone"  # PostgreSQL
# pool_size = 5          # PostgreSQL connection pool size

[mcp]
config_path = ""         # path to MCP JSON config file (alternative to TOML sections)
refresh_interval = 14400 # periodic refresh for servers without push notifications (seconds, 0 to disable)

[mcp.servers.example]    # one section per MCP server
command = "npx"
args = ["-y", "@modelcontextprotocol/server-example"]
# type = "stdio"         # "stdio" (default) or "http"
# url = ""               # for HTTP transport
```
Precedence: CLI args > environment variables > config.toml > defaults.
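A toy illustration of that resolution order; the helper and the environment variable name are hypothetical, not part of Turnstone's API.

```python
import os

def resolve(cli_value, env_var: str, config_value, default):
    # CLI args > environment variables > config.toml > defaults
    if cli_value is not None:
        return cli_value
    if env_var in os.environ:
        return os.environ[env_var]
    if config_value not in (None, ""):
        return config_value
    return default

# e.g. resolve(args.port, "TURNSTONE_SERVER_PORT", cfg["server"]["port"], 8080)
```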
Workstreams
Parallel independent conversations, each with its own session and state:
| Symbol | State | Meaning |
|---|---|---|
| `·` | idle | Waiting for input |
| `◌` | thinking | Model is generating |
| `▸` | running | Tool execution in progress |
| `◆` | attention | Waiting for approval |
| `✖` | error | Something went wrong |
Idle workstreams are automatically cleaned up after 2 hours (configurable). In multi-node deployments, workstream ownership is tracked in Redis — follow-up messages auto-route to the owning node.
Monitoring
The `/metrics` endpoint exposes Prometheus-format metrics:

- `turnstone_tokens_total{direction}` — prompt/completion token counters
- `turnstone_tool_calls_total{tool}` — per-tool invocation counts
- `turnstone_workstream_context_ratio{ws_id}` — per-workstream context utilization
- `turnstone_http_request_duration_seconds` — request latency histogram
- `turnstone_workstreams_by_state{state}` — workstream state gauges
- `turnstone_sse_connections_active` — current open SSE connections
- `turnstone_ratelimit_rejected_total` — requests rejected by the rate limiter
- `turnstone_backend_up` — LLM backend reachability (0/1)
- `turnstone_circuit_state` — circuit breaker state (0=closed, 1=open, 2=half_open)
- `turnstone_workstreams_evicted_total` — workstreams auto-evicted at capacity

Per-workstream metrics are labeled by `ws_id` (bounded by the `max_workstreams` cap, default 10).
Health & Rate Limiting
Health degradation. A background `BackendHealthMonitor` probes the LLM backend every `backend_probe_interval` seconds. When the backend is unreachable, `/health` reports `"status": "degraded"` (HTTP 200) and the `turnstone_backend_up` gauge drops to 0.
Circuit breaker. After `circuit_breaker_threshold` consecutive probe failures the circuit opens (CLOSED → OPEN). While open, `ChatSession._create_stream_with_retry` skips the backend entirely and returns an error. After `circuit_breaker_cooldown` seconds the circuit enters HALF_OPEN, allowing a single probe. A successful probe closes the circuit; a failure re-opens it.
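A minimal sketch of that state machine, assuming the `[health]` defaults above; the class and method names are illustrative, not Turnstone's internals.

```python
import enum
import time

class State(enum.Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"

class CircuitBreaker:
    def __init__(self, threshold: int = 5, cooldown: float = 60.0):
        self.threshold, self.cooldown = threshold, cooldown
        self.failures, self.opened_at = 0, 0.0
        self.state = State.CLOSED

    def record_probe(self, ok: bool) -> None:
        if ok:
            self.failures = 0
            self.state = State.CLOSED  # success closes from any state
            return
        self.failures += 1
        # A failed half-open probe, or too many consecutive failures, opens it.
        if self.state is State.HALF_OPEN or self.failures >= self.threshold:
            self.state = State.OPEN
            self.opened_at = time.monotonic()

    def allows_requests(self) -> bool:
        if self.state is State.OPEN and time.monotonic() - self.opened_at >= self.cooldown:
            self.state = State.HALF_OPEN  # cooldown elapsed: permit one probe
        return self.state is not State.OPEN
```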
Per-IP rate limiting. When `[ratelimit].enabled` is true, each client IP is tracked with a token-bucket limiter (`requests_per_second` / `burst`). Rate limiting is applied in `do_GET`/`do_POST` after authentication but before route dispatch; `/health` and `/metrics` are exempt. Requests that exceed the limit receive HTTP 429 with a `Retry-After` header.
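The limiter reduces to a standard token bucket; here is an illustrative sketch under the default `[ratelimit]` values, not the actual implementation.

```python
import time

class TokenBucket:
    def __init__(self, rate: float = 10.0, burst: int = 20):
        self.rate = rate           # tokens replenished per second
        self.capacity = burst      # maximum bucket size
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # caller answers HTTP 429 with a Retry-After header

# One bucket per client IP, e.g. buckets.setdefault(ip, TokenBucket()).allow()
```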
Workstream eviction. When `WorkstreamManager.create()` would exceed `max_workstreams`, the oldest IDLE workstream is automatically evicted and the `turnstone_workstreams_evicted_total` counter is incremented. Configure via `[server].max_workstreams` (default 10).
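The rule amounts to "drop the least-recently-active idle workstream when at capacity"; a sketch with illustrative names follows.

```python
def evict_oldest_idle(workstreams: dict, max_workstreams: int = 10) -> str | None:
    # Called before creating a new workstream; evicts only when at capacity.
    if len(workstreams) < max_workstreams:
        return None
    idle = [ws for ws in workstreams.values() if ws.state == "idle"]
    if not idle:
        return None  # no idle workstream to evict
    victim = min(idle, key=lambda ws: ws.last_activity)
    del workstreams[victim.id]
    return victim.id  # caller bumps turnstone_workstreams_evicted_total
```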
Requirements
- Python 3.11+
- An OpenAI-compatible API endpoint (vLLM, NVIDIA NIM, llama.cpp, etc.) or an Anthropic API key
- Redis (for message queue bridge — `pip install turnstone[mq]`)
- Anthropic provider (optional — `pip install turnstone[anthropic]`)
- PostgreSQL (optional, for production — `pip install turnstone[postgres]`)
- Git LFS (for cloning — diagram PNGs are stored in LFS)
License
Business Source License 1.1 — free for all use except hosting as a managed service. Converts to Apache 2.0 on 2030-03-01.