A powerful AI agentic system

These details have not been verified by PyPI

Project description

Captain Claw

An open-source AI agent with multi-agent orchestration, autonomous cognitive systems, and a full management dashboard. Runs locally, supports OpenAI, Anthropic, Google, Ollama, LiteRT, and OpenRouter models, and ships with 46 built-in tools.

What's New in 0.4.28

WhatsApp PA & Intentions. Captain Claw 0.4.28 turns WhatsApp into a real two-way personal-assistant channel, adds a brand-new Intentions primitive (proactive, permissioned future actions), and ships a proactive scheduler, glasses dashboard, and face recognition — plus a wave of agent-reliability fixes.

WhatsApp bridge — two-way PA (captain_claw/flight_deck/whatsapp_bridge.py): inbound text, voice notes (Soniox STT), location & contacts; outbound text, optional voice replies (Soniox TTS), and now document sending (whatsapp_send_file tool — send a saved file to the current chat or any number, with a robust MIME map for pptx/docx/xlsx/pdf/text). Allowlist-gated, with /c /mute slash commands.
Caption-routed inbound images — a photo is no longer force-fed to face recognition. Its caption routes it: "who is this?" → face recognition, "summarise this" → the agent's vision, "remember this is Alice" → face enrollment. A bare photo asks what to do. Face recognition stays entirely on Flight Deck.
Intentions (captain_claw/intentions.py, intentions tool, Flight Deck panel) — a control-plane layer between noticing (insights) and doing (cron). User intentions are notes-to-self resurfaced in context; agent intentions are proactive actions the agent announces (low-risk) or asks permission for (anything that sends/changes data). A channel-agnostic decision bus resolves them by WhatsApp reply or Flight Deck button; approving a repeatable one materialises a scheduler job; declining writes a negative-feedback insight so it won't re-propose. An opt-in Phase 3 generator proactively proposes intentions from your recent activity (cooldown + quiet-hours + per-day cap + proactivity dial).
Flight Deck scheduler (captain_claw/flight_deck/fd_scheduler.py) — recurring/one-shot jobs that run an agent turn and push the result to WhatsApp / glasses / Telegram, with quiet-hours support.
Glasses dashboard & face recognition — multi-face recognition with enrollment, plus a Flight Deck file-preview dashboard.
Fleet collaboration — the flight_deck tool (list / consult / delegate / spawn peers) is always offered so an agent can reliably reach other agents ("ask deepseek what's new").
Reliable tool availability in Eco mode — Google (Gmail/Drive/Calendar), WhatsApp, intentions, and the fleet tool are now always offered, fixing cases where Eco mode silently hid them.
Reliability — agents on thinking-mode models (e.g. DeepSeek thinking) no longer crash when a forced tool_choice is rejected; the call transparently retries without it.
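The caption-routing rule in the list above can be illustrated with a small sketch. The route names and regular expressions here are hypothetical, chosen only to mirror the three example captions. The actual bridge may classify captions differently (for example, with an LLM), so treat this as an illustration of the idea rather than the shipped logic.

```python
import re
from typing import Optional

# Hypothetical route names and patterns, mirroring the README's example captions.
ROUTES = [
    ("face_enrollment", re.compile(r"\bremember\b.*\bthis is\b", re.IGNORECASE)),
    ("face_recognition", re.compile(r"\bwho\b.*\b(is|are)\b", re.IGNORECASE)),
]


def route_inbound_image(caption: Optional[str]) -> str:
    """Pick a handler for an inbound photo based only on its caption."""
    text = (caption or "").strip()
    if not text:
        # A bare photo: ask the user what to do instead of guessing.
        return "ask_user"
    for route, pattern in ROUTES:
        if pattern.search(text):
            return route
    # Any other caption is treated as an instruction for the agent's vision.
    return "agent_vision"
```

Checking enrollment before recognition keeps a caption such as "remember who this is" from being read as a lookup request.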

Backward compatible — existing 0.4.27 setups keep working unchanged. The WhatsApp bridge, scheduler, and Intentions generator are all opt-in. See RELEASE_NOTES_0.4.28.md for the full breakdown and walkthroughs.

See RELEASE_NOTES.md for the full changelog.

What Makes Captain Claw Different

Flight Deck — Multi-Agent Command Center

A full management dashboard for running teams of AI agents. Spawn, monitor, configure, and coordinate agents from a single UI.

captain-claw-fd    # http://0.0.0.0:25080

Agent Forge — Describe a business goal in plain text. An LLM designs a specialized team with roles, tools, operating procedures, and a lead coordinator. Review, customize, and spawn the entire team in one click.
Agent Council — Structured multi-agent deliberation. Run brainstorms, debates, reviews, or planning sessions with 2-N agents. Each agent self-scores suitability, chooses actions (answer, challenge, refine, broaden), and responds in moderated rounds. A moderator synthesizes conclusions; all agents vote. Export as markdown minutes.
Fleet Communication — Agents discover peers automatically. Consult (synchronous ask) or delegate (asynchronous queue) tasks to specialist agents. Shared workspace and file transfer across the fleet.
Director Panel — Unified overview of all agents. Broadcast messages fleet-wide. Per-agent token/cost analytics, trace timelines, datastore browser, file browser, config editor.
Multi-user Auth — JWT authentication, admin dashboard, rate limiting, and quotas.
MCP Connections — Add Model Context Protocol servers (HTTP or stdio) once and every entitled agent in the fleet picks up their tools — no per-agent config. Phase 2 adds stdio transport for npx/uvx-shipped servers, per-agent allowlists, hot tool-list reload over SSE, and streaming tool calls.

Cognitive Architecture

Captain Claw has a five-layer memory system and autonomous cognitive processes that run without user intervention.

Memory Layers:

Layer	What it stores	How it's used
Working Memory	Current conversation in the LLM context window	Immediate reasoning
Semantic Memory	Hybrid vector + BM25 full-text search over documents and sessions	Auto-injected when relevant to the current query
Deep Memory	Typesense-backed long-term archive, scales to millions of documents	Searched on demand for deep recall
Insights	Auto-extracted facts, contacts, decisions, and deadlines (SQLite + FTS5)	Cross-session knowledge injected into system prompt
Nervous System	Autonomous "intuitions" — patterns, hypotheses, and connections	Surfaces non-obvious findings the agent wouldn't otherwise notice
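The README does not say how the Semantic Memory layer combines its vector and BM25 results. One common way to merge two ranked lists is reciprocal rank fusion (RRF), sketched below as an assumption. The function name, the constant `k = 60`, and the document IDs are illustrative and not taken from `captain_claw/semantic_memory.py`.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    so documents ranked highly by both retrievers rise to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))


vector_hits = ["doc_a", "doc_b", "doc_c"]   # hypothetical vector-search order
bm25_hits = ["doc_b", "doc_d", "doc_a"]     # hypothetical BM25 order
fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
```

RRF only needs ranks, not raw scores, which avoids having to normalise cosine similarities against BM25 scores.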

Autonomous Processes:

Dreaming — Background dream cycles cross-reference all memory layers to synthesize intuitions. Runs after every N messages and during idle hours. Intuitions have confidence scores that decay over time unless validated.
Tension Tracking — Holds unresolved contradictions (like musical dissonance) rather than forcing premature resolution. Tensions persist until evidence resolves them.
Maturation Pipeline — New intuitions sit through multiple dream cycles before being surfaced to the agent, reducing noise.
Cognitive Tempo — Detects whether the user is in deep contemplative mode or rapid task execution, and adapts processing depth accordingly (adagio / moderato / allegro).
Cognitive Modes — Seven tunable behavioral profiles (Ionian through Locrian, inspired by musical scales) that shift the agent between analytical, creative, cautious, and exploratory approaches.
Self-Reflection — Periodic self-assessment that reviews conversations, memory, and completed tasks to generate improvement directives injected into the system prompt.
Insights Extraction — Automatically identifies durable knowledge from conversations — deduplicates, categorizes, and stores for future context injection.
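To make the Dreaming and Maturation Pipeline items above concrete, here is a minimal sketch of a decaying confidence score with a maturation gate. The README does not specify the decay function or any thresholds. The exponential half-life, the default values, and both function names are assumptions for illustration only.

```python
import math


def decayed_confidence(initial: float, age_days: float, half_life_days: float = 14.0) -> float:
    """Exponentially decay a confidence score: it halves every half_life_days."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    return initial * math.pow(0.5, age_days / half_life_days)


def should_surface(confidence: float, dream_cycles_seen: int,
                   min_confidence: float = 0.3, min_cycles: int = 3) -> bool:
    """Surface an intuition only after it has matured and is still confident enough."""
    return dream_cycles_seen >= min_cycles and confidence >= min_confidence
```

Under these assumed defaults, an intuition that starts at 0.8 drops to 0.4 after 14 days without validation. It would still be surfaced once it has been seen in three dream cycles.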

Visualization:

Brain Graph — Interactive 3D force-directed graph of the entire cognitive topology. Insights, intuitions, tasks, contacts, and sessions rendered as typed nodes with provenance edges. Live WebSocket updates.
Process of Thoughts — Full lineage tracking across all cognitive subsystems. Every message, insight, intuition, and task is connected via provenance IDs, forming a traversable thought graph.

Orchestrator / DAG Mode

Decompose complex tasks into a dependency graph and execute subtasks in parallel across separate agent sessions.

/orchestrate Research startups in 3 countries, analyze founders, create comparison spreadsheet

LLM decomposes the prompt into a task DAG with dependencies
Parallel execution with configurable worker count
Shared workspace for inter-task data flow
Structured output validation (JSON Schema with auto-retry)
Real-time trace timeline (Gantt-style visualization)
Headless CLI mode for cron/scripts: captain-claw-orchestrate
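The dependency-driven parallel execution described above can be sketched with the standard library alone. This is not the code in `captain_claw/session_orchestrator.py`. The function `run_dag`, the `run_task` callback, and the worker count are hypothetical, and a thread pool stands in for separate agent sessions.

```python
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait
from graphlib import TopologicalSorter
from typing import Callable, Dict, Set


def run_dag(deps: Dict[str, Set[str]], run_task: Callable[[str], str],
            workers: int = 3) -> Dict[str, str]:
    """Run tasks in dependency order, executing independent tasks in parallel."""
    sorter = TopologicalSorter(deps)  # maps each task to the tasks it depends on
    sorter.prepare()                  # raises graphlib.CycleError on a cycle
    results: Dict[str, str] = {}
    in_flight = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while sorter.is_active():
            # Submit every task whose dependencies have all finished.
            for task in sorter.get_ready():
                in_flight[pool.submit(run_task, task)] = task
            # Block until at least one running task completes.
            done, _ = wait(list(in_flight), return_when=FIRST_COMPLETED)
            for future in done:
                task = in_flight.pop(future)
                results[task] = future.result()
                sorter.done(task)  # unblocks tasks that depended on this one
    return results
```

For the example prompt, the three country research tasks would have no dependencies and run side by side. The founder analysis would depend on all three, and the spreadsheet task on the analysis.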

BotPort — Agent-to-Agent Network

Connect multiple Captain Claw instances through a routing hub. Agents delegate tasks to specialists based on expertise tags, persona matching, or LLM-powered routing.
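As a rough illustration of routing by expertise tags, the sketch below picks the agent whose tags overlap most with the task's tags. The function name, the agent names, and the tag sets are hypothetical. The real hub also supports persona matching and LLM-powered routing, which are not modelled here.

```python
from typing import Dict, Optional, Set


def route_by_tags(task_tags: Set[str], agents: Dict[str, Set[str]]) -> Optional[str]:
    """Return the agent whose expertise tags overlap most with the task's tags."""
    best_agent, best_overlap = None, 0
    for name in sorted(agents):  # sorted for deterministic tie-breaking
        overlap = len(task_tags & agents[name])
        if overlap > best_overlap:
            best_agent, best_overlap = name, overlap
    # None means no agent matched; a hub could then fall back to another strategy.
    return best_agent
```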

BotPort Swarm — DAG-based multi-agent orchestration across networked instances. Approval gates, retry with fallback, checkpointing, inter-agent file transfer (up to 50 MB), cron scheduling, and a visual dashboard.

MCP Server (act as an MCP server)

Captain Claw runs as a Model Context Protocol server over stdio — Claude Desktop and other MCP clients can browse sessions, read conversation history, and send prompts to the full agent.

captain-claw-mcp    # stdio, configure in claude_desktop_config.json

MCP Client (consume MCP servers via Flight Deck)

The other direction: agents in your fleet call into MCP servers. Add a server once in Flight Deck → Connections → MCP servers and every agent the allowlist permits gets the tools auto-registered on boot.

HTTP transport — Streamable-HTTP MCP servers, with optional OAuth2 client_credentials, captured Mcp-Session-Id, and SSE-response parsing.
stdio transport — command + args + env for local MCP servers shipped via npx / uvx (filesystem, sqlite, github, postgres, etc.). Children are spawned lazily, auto-respawned on death, and torn down with SIGTERM/SIGKILL on close.
Per-agent allowlists — Restrict each server to specific agent slugs. Disallowed agents get HTTP 404 (existence is opaque).
Hot reload — Agents subscribe to /fd/mcp/agent/events (SSE) and re-register proxy tools the moment you change a server — no restart needed.
Streaming calls — POST /fd/mcp/<name>/call_stream emits progress / result / error SSE frames for UIs that want live indicators while a long-running tool runs.
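A client consuming the streaming endpoint above needs to split the response into SSE frames. The sketch below is a minimal parser for the standard `text/event-stream` line format. The payloads in the usage note are made up, since the README names the `progress`, `result`, and `error` frame types but not their data schema.

```python
from typing import Dict, Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per SSE frame with 'event' and 'data' keys.

    Follows the basic text/event-stream rules: fields are 'name: value',
    a blank line ends a frame, and lines starting with ':' are comments.
    """
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data:  # frames without data are not dispatched
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        else:
            field, _, value = line.partition(":")
            value = value[1:] if value.startswith(" ") else value
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)
```

Feeding it the lines `event: progress`, `data: {"pct": 50}`, and a blank line yields a single frame with event `progress`. A UI could switch on the `event` value to update a progress indicator or show the final result.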

See USAGE.md → Flight Deck → Connections → MCP servers for the full endpoint reference and config schema.

Safety Guards

Three layers of protection that run before, during, and after agent operations:

Input guards — Validate user intent before the LLM sees it
Script guards — AST-level analysis of generated code before execution
Output guards — Validate tool results for hallucinations and safety

Guards support two modes: stop_suspicious (block automatically) or ask_for_approval (prompt the user).
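To show what AST-level analysis of generated code can look like, here is a small sketch using Python's `ast` module. The deny-list, the helper names, and the `allow` / `block` / `ask` return values are assumptions for illustration. Only the two mode names come from the text above, and the real script guard is likely broader.

```python
import ast
from typing import List

# Hypothetical deny-list; a real guard would likely be broader and configurable.
SUSPICIOUS_CALLS = {"eval", "exec", "os.system", "subprocess.run",
                    "subprocess.Popen", "shutil.rmtree"}


def dotted_name(node: ast.AST) -> str:
    """Rebuild a dotted name such as 'os.system' from a call target."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        prefix = dotted_name(node.value)
        return f"{prefix}.{node.attr}" if prefix else node.attr
    return ""


def find_suspicious_calls(source: str) -> List[str]:
    """Parse the script without running it and list deny-listed calls."""
    tree = ast.parse(source)
    called = {dotted_name(n.func) for n in ast.walk(tree) if isinstance(n, ast.Call)}
    return sorted(called & SUSPICIOUS_CALLS)


def guard_script(source: str, mode: str = "stop_suspicious") -> str:
    """Return 'allow', 'block', or 'ask' depending on findings and guard mode."""
    if not find_suspicious_calls(source):
        return "allow"
    return "block" if mode == "stop_suspicious" else "ask"
```

Because `ast.parse` only builds a syntax tree, the script is inspected without being executed. A name-based check like this is easy to evade through aliasing, so it is best read as a first filter rather than a sandbox.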

Multi-Model Support

Mix providers freely — each session independently selects its model.

Provider	Models
OpenAI (API key)	GPT-5.4, GPT-5.4-mini, GPT-5.4-nano, o3, o4-mini, gpt-image-1.5
OpenAI (Sign in with ChatGPT)	`gpt-5`, `gpt-5-codex`, `gpt-5.1-codex`, `gpt-5.1-codex-mini`, `gpt-5.1-codex-max`, `gpt-5.2-codex`, `gpt-5.3-codex` — billed against your ChatGPT plan, no API key
Anthropic	Claude Opus 4.6, Sonnet 4.6, Haiku 4.5 (with prompt caching)
Google	Gemini 3.1 Pro/Flash, Gemini 2.5 Pro/Flash (API key or OAuth/Vertex)
Ollama	Any local model
LiteRT (on-device)	`.litertlm` Gemma models running locally via an isolated subprocess worker
OpenRouter	200+ models via meta-router

Quick Start

pip install captain-claw
export OPENAI_API_KEY="sk-..."          # or ANTHROPIC_API_KEY, GEMINI_API_KEY, etc.
captain-claw-web                         # http://127.0.0.1:23080

captain-claw-web          # Web UI (default)
captain-claw              # Interactive terminal
captain-claw --tui        # Terminal UI
captain-claw-fd           # Flight Deck multi-agent dashboard
captain-claw-mcp          # MCP server for Claude Desktop
botport                   # Agent-to-agent routing hub

First run starts onboarding automatically. For Ollama, no key needed — set provider: ollama in config.yaml.

46 Built-in Tools

Shell, file I/O, web fetch/search, browser automation, PDF/DOCX/XLSX/PPTX extraction, image generation (DALL-E), OCR, vision, TTS, STT, email (SMTP/Mailgun/SendGrid), Google Workspace (Drive, Docs, Sheets, Slides, Gmail, Calendar), WhatsApp file delivery, intentions (proactive future actions), desktop automation, screen capture with voice commands, persistent cross-session memory (todos, contacts, scripts, APIs, playbooks), datastore (SQLite tables with protection rules), deep memory (Typesense), personality system, cron scheduling, BotPort fleet discovery, and Termux (Android).

See USAGE.md for the full reference.

Web UI

Chat, Computer (retro-themed research workspace with 14 themes), monitor pane, instruction editor, command palette, persona selector, datastore browser, deep memory dashboard, insights browser, nervous system browser, Brain Graph 3D visualization, reflections dashboard, personality editor, playbook editor, and LLM usage analytics.

Computer — A standalone research workspace at /computer with themed visual generation, exploration trees, folder browser (local + Google Drive), file attachments, PDF export, and public mode with BYOK (Bring Your Own Key).

Docker

docker pull kstevica/captain-claw:latest
docker run -d -p 23080:23080 \
  -v $(pwd)/config.yaml:/app/config.yaml:ro \
  -v $(pwd)/.env:/app/.env:ro \
  -v $(pwd)/docker-data/home-config:/root/.captain-claw \
  -v $(pwd)/docker-data/workspace:/data/workspace \
  kstevica/captain-claw:latest

See README_DETAILED.md for Docker Compose and persistent data setup.

Configuration

YAML-driven with environment variable overrides (CLAW_ prefix).

model:
  provider: gemini
  model: gemini-2.5-flash
  allowed:
    - id: claude-sonnet
      provider: anthropic
      model: claude-sonnet-4-20250514
    - id: gpt-4o
      provider: openai
      model: gpt-4o

web:
  enabled: true
  port: 23080

Load precedence (highest first): ./config.yaml > ~/.captain-claw/config.yaml > env vars > .env > defaults.
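The precedence order can be pictured as a layered lookup in which the first source that defines a key wins. The sketch below uses `collections.ChainMap` with flattened, made-up keys and values. It is not the Pydantic loader in `captain_claw/config.py`, and it does not model how `CLAW_`-prefixed variable names map to config keys.

```python
from collections import ChainMap
from typing import Any, Mapping


def resolve_config(*layers: Mapping[str, Any]) -> ChainMap:
    """Combine config layers; earlier layers win on lookup."""
    return ChainMap(*[dict(layer) for layer in layers])


# Hypothetical flattened values for each source, highest precedence first.
local_yaml = {"web.port": 8080}                                # ./config.yaml
home_yaml = {"web.port": 23080, "model.provider": "gemini"}    # ~/.captain-claw/config.yaml
env_vars = {"model.provider": "ollama", "web.enabled": True}   # already-parsed env vars
dotenv = {}                                                    # .env
defaults = {"web.port": 23080, "web.enabled": False, "model.provider": "openai"}

config = resolve_config(local_yaml, home_yaml, env_vars, dotenv, defaults)
```

With these sample layers, `web.port` resolves from the local YAML file, `model.provider` from the home YAML file (ahead of the environment), and `web.enabled` from the environment.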

Full reference: USAGE.md (23 config sections).

Architecture

Component	Path
Agent (14-mixin composition)	`captain_claw/agent.py`
LLM providers	`captain_claw/llm/`
46 tools + registry	`captain_claw/tools/`
Flight Deck (FastAPI + React)	`captain_claw/flight_deck/`
DAG orchestrator	`captain_claw/session_orchestrator.py`
Semantic memory (vector + BM25)	`captain_claw/semantic_memory.py`
Deep memory (Typesense)	`captain_claw/deep_memory.py`
Insights (fact extraction)	`captain_claw/insights.py`
Nervous system (dreaming)	`captain_claw/nervous_system.py`
Cognitive tempo	`captain_claw/cognitive_tempo.py`
MCP server	`captain_claw/mcp_serve.py`
BotPort client	`captain_claw/botport_client.py`
Web UI + REST API	`captain_claw/web/`
Prompt templates (~100 files)	`captain_claw/instructions/`
Config (Pydantic)	`captain_claw/config.py`

Documentation

USAGE.md — Complete reference for all commands, tools, config, and features
README_DETAILED.md — Extended README with feature-by-feature breakdown

License

MIT

Project details

Release history

0.6.5

Jun 29, 2026

0.6.4

Jun 28, 2026

0.6.3

Jun 26, 2026

0.6.2

Jun 25, 2026

0.6.1

Jun 24, 2026

0.6.0

Jun 16, 2026

0.5.6

Jun 14, 2026

0.5.5

Jun 11, 2026

0.5.4

Jun 10, 2026

0.5.3.1

Jun 9, 2026

0.5.3

Jun 9, 2026

0.5.2

Jun 7, 2026

0.5.1

Jun 6, 2026

This version

0.4.28

Jun 2, 2026

0.4.26

May 19, 2026

0.4.25

May 8, 2026

0.4.24

May 6, 2026

0.4.22

Apr 28, 2026

0.4.21

Apr 9, 2026

0.4.20

Apr 8, 2026

0.4.19

Apr 7, 2026

0.4.18

Apr 7, 2026

0.4.17

Apr 6, 2026

0.4.16

Apr 5, 2026

0.4.15

Apr 5, 2026

0.4.14

Apr 4, 2026

0.4.13

Apr 4, 2026

0.4.12.5

Apr 3, 2026

0.4.12

Apr 3, 2026

0.4.11

Apr 3, 2026

0.4.10

Apr 3, 2026

0.4.9.3

Apr 2, 2026

0.4.9.2

Apr 2, 2026

0.4.9.1

Apr 2, 2026

0.4.9

Apr 2, 2026

0.4.8.1

Apr 2, 2026

0.4.8

Mar 31, 2026

0.4.7

Mar 29, 2026

0.4.6

Mar 27, 2026

0.4.5

Mar 22, 2026

0.4.4

Mar 21, 2026

0.4.3

Mar 19, 2026

0.4.2

Mar 16, 2026

0.4.1

Mar 16, 2026

0.4.0.1

Mar 15, 2026

0.4.0

Mar 15, 2026

0.3.5

Mar 13, 2026

0.3.4.1

Mar 12, 2026

0.3.4

Mar 12, 2026

0.3.3.7

Mar 8, 2026

0.3.3.6

Mar 8, 2026

0.3.3.5

Mar 7, 2026

0.3.3.4

Mar 6, 2026

0.3.3.3

Mar 6, 2026

0.3.3.2

Mar 5, 2026

0.3.3.1

Mar 5, 2026

0.3.3

Mar 4, 2026

0.3.2

Mar 3, 2026

0.3.1.9

Mar 3, 2026

0.3.1.8

Mar 3, 2026

0.3.1.7

Mar 3, 2026

0.3.1.6

Mar 3, 2026

0.3.1.5

Mar 3, 2026

0.3.1.4

Mar 3, 2026

0.3.1.2

Mar 2, 2026

0.3.1.1

Mar 2, 2026

0.3.1

Mar 2, 2026

0.3.0.4

Mar 2, 2026

0.3.0.3

Mar 2, 2026

0.3.0.2

Mar 2, 2026

0.3.0.1

Mar 2, 2026

0.3.0

Mar 1, 2026

0.2.7.1

Mar 1, 2026

0.2.7

Mar 1, 2026

0.2.6.3.6

Mar 1, 2026

0.2.6.3.5

Mar 1, 2026

0.2.6.3.4

Mar 1, 2026

0.2.6.3.3

Mar 1, 2026

0.2.6.3.2

Mar 1, 2026

0.2.6.3.1

Mar 1, 2026

0.2.6.3

Feb 28, 2026

0.2.6.2

Feb 28, 2026

0.2.6.1

Feb 28, 2026

0.2.6

Feb 28, 2026

0.2.5

Feb 28, 2026

0.2.1

Feb 28, 2026

0.2.0

Feb 28, 2026

0.1.0

Feb 27, 2026

Download files

Source Distribution

captain_claw-0.4.28.tar.gz (4.5 MB)

Uploaded Jun 2, 2026 Source

Built Distribution

captain_claw-0.4.28-py3-none-any.whl (4.7 MB)

Uploaded Jun 2, 2026 Python 3

File details

Details for the file captain_claw-0.4.28.tar.gz.

File metadata

Download URL: captain_claw-0.4.28.tar.gz
Upload date: Jun 2, 2026
Size: 4.5 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for captain_claw-0.4.28.tar.gz
Algorithm	Hash digest
SHA256	`c2b4b6c638ce3b201133cebe81d29674bdd5eed8db629d5394b07013f4a2b790`
MD5	`36cdcca3247e8abe7a018532d8b4ec2f`
BLAKE2b-256	`7d3298d700cc6e313451e312e66536546e00bad0287de5728a77720bb9fd8ad7`

File details

Details for the file captain_claw-0.4.28-py3-none-any.whl.

File metadata

Download URL: captain_claw-0.4.28-py3-none-any.whl
Upload date: Jun 2, 2026
Size: 4.7 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for captain_claw-0.4.28-py3-none-any.whl
Algorithm	Hash digest
SHA256	`905e05b5b5a4c12100a34a6d77ab0b73f44cfcd07629bf5de23aa22fbb41eba0`
MD5	`7c0b7dbe6a16e95e9f23682f21bb6b2e`
BLAKE2b-256	`412931e6a76c4fb864b6ced494c0fd6274a4c880b30ef22035a8bd9892435e72`
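A downloaded archive can be checked against the SHA256 digests published above with a few lines of standard-library Python. The helper names are illustrative, and the sketch assumes the file has already been downloaded to a local path.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare the computed digest with the published one, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Calling `matches_published_hash(Path("captain_claw-0.4.28.tar.gz"), "<published SHA256>")` returns `True` only when the local file is byte-for-byte identical to the one the digest was computed from.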
