AI agent brain with memory, teams, flows, document ingestion, and MCP — your agent, but better every day
AIBrain — Self-compounding agent substrate with persistent memory and multi-agent fleet
79% recall on LongMemEval M with a 109M model. 99.8% on MSDialog. Zero-parameter FTS5 achieves 97.1% NDCG@5 on dialogue retrieval. All on a consumer laptop, no GPU required. One install. 100 workflows. Agent teams. Flow engine. Document ingestion. Universal MCP. Dual-system memory that compounds across sessions. Runs locally, no cloud lock-in.
AIBrain is a self-hosted memory substrate and fleet orchestration layer for AI agents. It gives any agent persistent memory, typed task composition, and inter-agent communication — designed to install alongside agents like Hermes, not replace them. The architectural bet: compounding comes from a fleet of cooperating agents, not from one agent alone.
Shipped: Dual-system memory (CLS), 100 workflows, multi-model routing, 68-page dashboard, MCP server, inter-agent comms backbone (CloudEvents pub/sub), agent spawn infrastructure. Specced, filed, targeted v1.7-v1.9: Cross-agent handoff verification loop, agent skill marketplace integration, fleet compounding demonstrations. Design target v2.0: Self-compounding fleet.
AIBrain is a self-hosted operating system for AI agents. It gives any agent persistent memory, typed Agent/Task/Team composition, a decorator-driven Flow engine, document ingestion, universal MCP client connectivity, a reactive workflow engine, a Complementary Learning Systems (CLS) cognitive substrate with a weekly consolidation cycle, multi-model LLM routing, an approval queue, inter-agent messaging, and 100 ready-to-run workflows — all behind a 68-page Next.js dashboard. Deploy it on a laptop, a VPS, or in Docker; your agent carries its entire brain with it.
Why AIBrain?
Most AI memory systems are toys. They store everything, retrieve nothing useful, and require expensive GPUs to run. AIBrain is different:
- Verified retrieval performance. On LongMemEval M (500 instances, the standard benchmark for long-term conversational memory), AIBrain's SelRoute system achieves Ra@5 = 0.79 with a 109M bge-base model (0.7915 before rounding), beating the strongest published baseline (Contriever + LLM fact keys, 0.762) by +0.029 on recall and +0.180 on NDCG@5. A 22MB MiniLM model achieves Ra@5 = 0.785, statistically equivalent to models 50% larger. The zero-parameter FTS5 baseline (no trainable parameters, no GPU) achieves NDCG@5 = 0.692 on LongMemEval M, exceeding every published system including 1.5B-parameter models.
- Near-perfect on domain-specific retrieval. On MSDialog (2,199 tech-support dialogues), AIBrain achieves Ra@5 = 0.998 with a 22MB MiniLM model — near-perfect retrieval for technical support contexts.
- Zero-parameter dialogue retrieval. On LMEB dialogue (840 instances), the FTS5 zero-ML retriever achieves NDCG@5 = 0.971 — no neural parameters, no GPU, no training data.
- Total evaluation instances: 62,792+. Every number is from verified JSON files in the benchmarks/ directory. The methodology is described in the peer-reviewed SelRoute paper (McKee, 2026, arXiv:2604.02431).
- All benchmarks run on a consumer laptop. No GPU required. No cloud credits. No special hardware.
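To make the "zero-parameter FTS5" idea above concrete, here is a minimal sketch of keyword retrieval with SQLite's FTS5 extension and its built-in BM25 ranking. It is not AIBrain's retriever: the table name `memories`, the helper names, and the sample documents are all made up for illustration, and it assumes your Python's SQLite build includes FTS5 (most do).

```python
import sqlite3

def build_index(docs):
    """Create an in-memory FTS5 index over (doc_id, body) pairs."""
    conn = sqlite3.connect(":memory:")
    # doc_id is stored but not tokenized; only body is searchable.
    conn.execute("CREATE VIRTUAL TABLE memories USING fts5(doc_id UNINDEXED, body)")
    conn.executemany("INSERT INTO memories (doc_id, body) VALUES (?, ?)", docs)
    return conn

def search(conn, query, k=5):
    """Return up to k doc_ids, best BM25 match first."""
    words = query.split()
    if not words:
        return []
    # Quote each word so user punctuation is not parsed as FTS5 query syntax.
    match_expr = " OR ".join('"' + w.replace('"', '""') + '"' for w in words)
    rows = conn.execute(
        "SELECT doc_id FROM memories WHERE memories MATCH ? "
        "ORDER BY bm25(memories) LIMIT ?",  # lower bm25() means a better match
        (match_expr, k),
    ).fetchall()
    return [row[0] for row in rows]

# Hypothetical sample data.
docs = [
    ("d1", "printer driver fails to install on windows"),
    ("d2", "how to reset a forgotten account password"),
    ("d3", "wifi keeps disconnecting after sleep"),
]
conn = build_index(docs)
top = search(conn, "reset password")
```

Nothing here is trained: ranking comes entirely from term statistics in the index, which is why this kind of retriever needs no model weights and no GPU.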
The secret is the CLS architecture: a Complementary Learning Systems dual-system memory inspired by the mammalian brain. Every session writes to fast hippocampal memory. A weekly aibrain dream consolidation cycle slow-extracts patterns and upgrades routing weights. The brain gets measurably better at subsequent tasks — not just stores more.
What's New in v1.8.8
- Memory recall ranking - precision restored and hardened. The semantic retrieval channel was mis-ranking known-item queries because the salience re-rank was a no-op on the hybrid/RRF routes. Fixed by fusing the FTS/vector scores through RRF and applying exact distinctive-term overlap as the decisive signal for keyword/known-item queries. Measured on the recall eval (n=20, seed=42): MRR@5 0.71 -> 0.875, hit@1 0.55 -> 0.85. A permanent regression test now guards the floor.
- Run Claude Code on an alternate backend. New bundled launchers (aibrain/integrations/claude_code_byob/) run the Claude Code CLI against DeepSeek or MiniMax over their Anthropic-compatible endpoints, with every model slot pinned (including the background small/fast model). Also released standalone at github.com/sindecker/claude-code-byob (MIT).
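The ranking fix above fuses FTS and vector results through RRF and then lets exact term overlap decide known-item queries. A rough sketch of that idea follows; it uses the standard Reciprocal Rank Fusion formula with the conventional constant k=60, and the overlap re-rank is a simplified stand-in. Function names and sample data are hypothetical, not AIBrain's code.

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists: score(d) = sum over lists of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc_id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

def rerank_by_term_overlap(fused, query_terms, doc_terms):
    """Stable re-sort: more exact query-term overlap wins, RRF order breaks ties."""
    wanted = set(query_terms)
    return sorted(fused, key=lambda d: -len(wanted & doc_terms.get(d, set())))

# Hypothetical outputs of a keyword channel and a vector channel.
fts_ranking = ["a", "b", "c"]
vector_ranking = ["b", "c", "a"]
fused = rrf_fuse([fts_ranking, vector_ranking])

doc_terms = {"a": {"alpha", "report"}, "b": {"beta"}, "c": {"gamma", "report", "q3"}}
final = rerank_by_term_overlap(fused, {"q3", "report"}, doc_terms)
```

RRF only needs ranks, not comparable scores, which is why it is a common way to combine a BM25 channel with an embedding channel.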
Previously in v1.8.7:
- Tool timeout lifted 30s -> 1800s (env-configurable via AIBRAIN_HANDS_TOOL_TIMEOUT) - removes the single biggest source of mid-task drop-outs.
- DB write-lock relief - high-write tables split onto a separate fleet_comms.db; memory-consolidation clustering rewritten for large (50K+ memory) brains.
- Liaison review falls back to DeepSeek when the primary reviewer errors; goal fan-out orchestrator added; self-imposed DeepSeek output cap removed.
- New: GET /api/memory/list, an aibrain doctor diagnostic verb, and a Tauri 2.x native desktop scaffold (M0).
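As an illustration of the env-configurable tool timeout mentioned above, the sketch below reads AIBRAIN_HANDS_TOOL_TIMEOUT and falls back to 1800 seconds. Only the variable name and the default come from the release notes; how AIBrain actually parses the value is not documented here, so the parsing rules and the `run_tool` helper are assumptions.

```python
import os
import subprocess

DEFAULT_TOOL_TIMEOUT_S = 1800  # default stated in the v1.8.7 notes

def tool_timeout(environ=None):
    """Return the timeout in seconds, ignoring missing or invalid values."""
    environ = os.environ if environ is None else environ
    raw = environ.get("AIBRAIN_HANDS_TOOL_TIMEOUT", "")
    try:
        value = float(raw)
    except ValueError:
        return DEFAULT_TOOL_TIMEOUT_S
    # Assumption: non-positive (or NaN) values fall back to the default.
    return value if value > 0 else DEFAULT_TOOL_TIMEOUT_S

def run_tool(argv, environ=None):
    """Hypothetical helper: run a tool, returning (returncode, stdout) or (None, '')."""
    try:
        done = subprocess.run(
            argv, capture_output=True, text=True, timeout=tool_timeout(environ)
        )
    except subprocess.TimeoutExpired:
        return None, ""
    return done.returncode, done.stdout
```

To try a shorter limit for one shell session on Linux/macOS, something like `export AIBRAIN_HANDS_TOOL_TIMEOUT=600` before launching should be enough, assuming the value is interpreted as seconds.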
What's New in v1.8.5
- Lucy CLI: paste-fragmentation root cause fix. prompt_toolkit 3.0.52 silently rejected the enable_bracketed_paste kwarg as TypeError, swallowed by a bare exception, hiding paste-corruption-during-streaming for months. Removed the kwarg; the bracketed-paste keybinding handles paste detection on its own. Typing while Lucy is streaming a reply no longer corrupts the input buffer.
- Captcha solver layer + default-allow policy. New aibrain/lucy/captcha/ module with stealth-first detection (recaptcha v2/v3, Turnstile, hCaptcha) and pluggable solvers (CDP attach → CapSolver REST fallback → manual-Telegram refuse). Policy default: solve on any host that hits a captcha. Hard-deny mechanism is preserved as opt-in but empty by default, so LinkedIn / Indeed / Reddit job-search flows just work.
- /help, /version, /exit commands. /help lists all 21 wired commands grouped by category. /version shows aibrain.__version__, current git HEAD, and the loaded module path (catches stale-daemon issues fast). /exit prints a one-line session summary on quit.
- Session save / load / memory-recall failures now surface visibly. Three formerly-silent except Exception: pass swallows were replaced with logged warnings at _log_tool level 1.
- /switch <provider> works in non-TTY contexts. Tests, scripts, and autonomous /goal runs no longer hang on empty stdin; single-provider arg now picks the registry-default model.
- 43 new pytests. Coverage for captcha policy (12), captcha solver (15), and agent_listener.py (16). All PASS. The invariant proven for agent_listener.on_task_assigned: one bad event must NOT kill the autonomous listener loop.
- higgsfield_generate tool. Lucy can drive a CDP-attached Chrome session to generate Higgsfield images/videos.
- Display-clobber on streaming output: fixed. _streaming_output_context() is finally wired into the stream path so prompt_toolkit.patch_stdout keeps tokens above the input buffer instead of into it.
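Two items in the list above share one pattern: failures are logged instead of silently swallowed, and one bad event must not kill the listener loop. A minimal sketch of that pattern is below. The loop, the logger name, and this stand-in `on_task_assigned` handler are hypothetical and do not reproduce agent_listener.py.

```python
import logging

logger = logging.getLogger("listener_sketch")

def run_listener(events, handler):
    """Dispatch each event; log failures and keep going. Returns (handled, failed)."""
    handled = failed = 0
    for event in events:
        try:
            handler(event)
            handled += 1
        except Exception:
            # Replaces a silent `except Exception: pass`: the error is recorded
            # with its traceback, and the loop moves on to the next event.
            logger.warning("handler failed for event %r", event, exc_info=True)
            failed += 1
    return handled, failed

def on_task_assigned(event):
    """Stand-in handler: rejects events without a task_id."""
    if "task_id" not in event:
        raise ValueError("malformed event: missing task_id")
    return event["task_id"]

events = [{"task_id": 1}, {"oops": True}, {"task_id": 2}]
result = run_listener(events, on_task_assigned)
```

The malformed middle event is counted as a failure and logged, while the events before and after it are still handled.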
Previously in v1.8.1: Multi-agent fleet infrastructure (4 keystone specs, 7 agents, cross-agent handoff 5/5), SA-2026-002 security audit (32 fixes / 29 commits), Lucy autonomy with multi-root path confinement.
Install
Pick the path that matches your environment. All paths install the same package from PyPI.
One-line installer (macOS / Linux / WSL)
curl -sSL https://myaibrain.org/install | sh
Creates an isolated venv at ~/.aibrain/venv, pip-installs aibrain, and symlinks the CLI into /usr/local/bin (or ~/.local/bin fallback). Re-run any time to upgrade. Python 3.10+ required.
One-line installer (Windows PowerShell)
irm https://myaibrain.org/install.ps1 | iex
Creates an isolated venv at %USERPROFILE%\.aibrain\venv, pip-installs aibrain, and adds the venv Scripts dir to your user PATH. Python 3.10+ required.
Homebrew (macOS / Linux)
brew tap sindecker/tap
brew install aibrain
Installs into a Homebrew-managed venv and symlinks the CLI.
pip (any platform)
pip install aibrain
Docker
docker pull sindecker/aibrain:latest
Telegram Quickstart (BotFather)
Lucy can run as a Telegram bot you can DM.
- Open Telegram, message @BotFather, send /newbot, and choose a name and username. BotFather returns an HTTP API token; copy it.
- Set the bot token as an environment variable:
  - Linux/macOS: export LUCY_TELEGRAM_TOKEN=<your-token>
  - Windows Command Prompt: set LUCY_TELEGRAM_TOKEN=<your-token>
  - PowerShell: $env:LUCY_TELEGRAM_TOKEN="<your-token>"
- Set your DeepSeek API key (DeepSeek is currently the proven provider for Lucy; broader provider support is in progress):
  - Linux/macOS: export DEEPSEEK_API_KEY=<your-key>
  - Windows Command Prompt: set DEEPSEEK_API_KEY=<your-key>
  - PowerShell: $env:DEEPSEEK_API_KEY="<your-key>"
  Alternative: OpenRouter (one key, 300+ models). OpenRouter proxies Anthropic, Google Gemini, xAI Grok, Meta Llama, DeepSeek, OpenAI and others behind a single key. Get one at https://openrouter.ai/keys, then:
  - Linux/macOS: export OPENROUTER_API_KEY=<your-key>
  - Windows Command Prompt: set OPENROUTER_API_KEY=<your-key>
  - PowerShell: $env:OPENROUTER_API_KEY="<your-key>"
  Run Lucy against any OpenRouter-hosted model by passing the slug:
  lucy --provider openrouter --model google/gemini-2.5-flash
  lucy --provider openrouter --model anthropic/claude-sonnet-4-5
  lucy --provider openrouter --model x-ai/grok-2
  Alternative: direct Google Gemini (free tier). Google AI Studio offers a free tier of 1500 requests/day on gemini-2.5-flash and 50 requests/day on gemini-2.5-pro. Get a free key at https://aistudio.google.com/app/apikey, then:
  - Linux/macOS: export GEMINI_API_KEY=<your-key>
  - Windows Command Prompt: set GEMINI_API_KEY=<your-key>
  - PowerShell: $env:GEMINI_API_KEY="<your-key>"
  Run Lucy on Gemini directly:
  lucy --provider gemini --model gemini-2.5-flash
  lucy --provider gemini --model gemini-2.5-pro
- Start the Telegram poller:
  python -m aibrain.lucy_telegram
- DM your new bot on Telegram. Messages will route through the real Lucy agent.
Run Lucy in the terminal instead
Pick a provider for THIS launch with a single flag — no config edit required:
lucy --provider openrouter --model google/gemini-2.5-flash # OpenRouter (300+ models behind one key)
lucy --provider gemini --model gemini-2.5-flash # Direct Google Gemini (free tier)
lucy --provider claude_cli # Subscription Claude (Claude Code CLI)
lucy --provider deepseek # DeepSeek API
lucy --provider openai --model gpt-4o # OpenAI API
lucy --provider anthropic --model claude-sonnet-4-6 # Anthropic API
lucy --provider ollama --model llama3.2:3b # Local Ollama (no key)
--provider accepts: claude_cli, openai, deepseek, ollama, anthropic, openrouter, gemini. --model is optional — a sensible default is chosen per provider when omitted. The flag overrides ~/.aibrain/config.json for that launch only.
Plain lucy (no flags) falls through to the existing config-based detection: the provider in ~/.aibrain/config.json, otherwise the first matching *_API_KEY env var. To make a choice persistent, set default_provider and default_model in ~/.aibrain/config.json or pick once via the in-session /switch command.
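The precedence described above (CLI flag, then ~/.aibrain/config.json, then the first matching *_API_KEY environment variable) can be sketched as a small resolver. This is an illustration rather than Lucy's actual code: the function names are invented, and the order in which environment variables are checked is an assumption, since the text does not specify it.

```python
import json
import os
from pathlib import Path

# Assumed check order; the README does not say which key wins when several are set.
ENV_KEYS = [
    ("openai", "OPENAI_API_KEY"),
    ("deepseek", "DEEPSEEK_API_KEY"),
    ("anthropic", "ANTHROPIC_API_KEY"),
    ("openrouter", "OPENROUTER_API_KEY"),
    ("gemini", "GEMINI_API_KEY"),
]

def load_config(path=None):
    """Read the config file, returning {} if it is missing or unreadable."""
    path = Path(path) if path else Path.home() / ".aibrain" / "config.json"
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return {}
    return data if isinstance(data, dict) else {}

def resolve_provider(cli_provider=None, cli_model=None, config=None, environ=None):
    """Return (provider, model); model None means 'use the provider default'."""
    config = config or {}
    environ = os.environ if environ is None else environ
    if cli_provider:  # 1. flags win for this launch
        return cli_provider, cli_model
    if config.get("default_provider"):  # 2. persistent config
        return config["default_provider"], config.get("default_model")
    for provider, key in ENV_KEYS:  # 3. first API key found in the environment
        if environ.get(key):
            return provider, None
    return None, None
```

For example, `resolve_provider("ollama", "llama3.2:3b", config=load_config())` returns the flag values regardless of what the config file says, matching the "for that launch only" behaviour described above.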
For autonomous goal-driven runs, start an interactive session and use the /loop command:
/loop --project <name> --goal "<goal>"
Note: Lucy does not accept a goal as a command-line argument. Always use the /loop command inside the interactive session for autonomous tasks.
Provider Configuration
Lucy speaks to 7 LLM providers behind a single --provider flag. Pick any one at install-time, swap mid-session with /switch, or set a persistent default in ~/.aibrain/config.json. No code edits required.
Provider matrix
| Provider | Auth Method | Env Var | Default Model | Notes |
|---|---|---|---|---|
| claude_cli | Claude Code subscription | (none; uses local CLI binary) | claude-opus-4-7 | Recommended for most users: no key needed, charged against your Claude subscription |
| openai | OpenAI API key | OPENAI_API_KEY | gpt-4o | Standard OpenAI billing |
| deepseek | DeepSeek API key | DEEPSEEK_API_KEY | deepseek-chat | Cheapest paid option; default for Lucy (proven) |
| ollama | Local, no auth | (none; local daemon) | llama3.2:3b | Zero cost, runs on your machine; ollama serve must be running |
| anthropic | Anthropic API key | ANTHROPIC_API_KEY | claude-sonnet-4-5 | Paid Anthropic API (separate from Claude Code subscription) |
| openrouter | OpenRouter API key | OPENROUTER_API_KEY | google/gemini-2.5-flash | One key, 300+ models (Anthropic, Google, xAI, Meta, DeepSeek, OpenAI…) |
| gemini | Google AI Studio key | GEMINI_API_KEY | gemini-2.5-flash | Free tier: 1500 req/day on gemini-2.5-flash, 50 req/day on gemini-2.5-pro |
One-line CLI examples
lucy --provider claude_cli # your Claude subscription
lucy --provider openai --model gpt-4o # OpenAI API
lucy --provider deepseek # DeepSeek API (default)
lucy --provider ollama --model llama3.2:3b # local Ollama
lucy --provider anthropic --model claude-sonnet-4-5 # Anthropic paid API
lucy --provider openrouter --model google/gemini-2.5-flash # 300+ models via one key
lucy --provider gemini --model gemini-2.5-flash # Google free tier
Env var setup — newer providers
OpenRouter (one key, 300+ models — get yours at https://openrouter.ai/keys):
# Linux/macOS
export OPENROUTER_API_KEY=<your-key>
# Windows PowerShell
$env:OPENROUTER_API_KEY="<your-key>"
:: Windows CMD
set OPENROUTER_API_KEY=<your-key>
Google Gemini (free tier — get yours at https://aistudio.google.com/app/apikey):
# Linux/macOS
export GEMINI_API_KEY=<your-key>
# Windows PowerShell
$env:GEMINI_API_KEY="<your-key>"
:: Windows CMD
set GEMINI_API_KEY=<your-key>
The remaining providers follow the same pattern with OPENAI_API_KEY, DEEPSEEK_API_KEY, and ANTHROPIC_API_KEY respectively. claude_cli and ollama need no env var — they read from local subscription state or a local daemon.
In-session commands
Inside an interactive lucy session:
- /switch <provider> [<model>] - swap provider/model mid-conversation, no restart needed
- /providers - list available providers detected from your env / config
- /pricing - show current per-token cost for the active provider/model
- /loop --project <name> --goal "<goal>" - kick off an autonomous goal-driven run
Cost visibility
Run aibrain usage to see per-provider spend across the last 30 days — broken down by provider, model, and call count. Use this to spot which provider is eating your budget and /switch to a cheaper one mid-session if needed.
Run Claude Code itself on an alternate backend
The provider settings above choose which LLM Lucy/AIBrain talks to. If you instead want to run the Claude Code CLI itself against a non-Anthropic model, the bundled launchers in aibrain/integrations/claude_code_byob/ point Claude Code at DeepSeek's or MiniMax's Anthropic-compatible endpoint:
# DeepSeek (set DEEPSEEK_API_KEY, or put it in a .env next to the launcher)
./aibrain/integrations/claude_code_byob/claude-deepseek.sh
# MiniMax (set MINIMAX_API_KEY)
./aibrain/integrations/claude_code_byob/claude-minimax.sh
Each runs in an isolated CLAUDE_CONFIG_DIR and pins every model slot — including the background ANTHROPIC_SMALL_FAST_MODEL — so no request silently falls back to a cheaper tier. Windows .bat equivalents are included. Standalone/public mirror: github.com/sindecker/claude-code-byob (MIT).
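For readers curious what "pinning every model slot" looks like, here is a rough sketch of the environment such a launcher might build before starting the CLI. It does not reproduce the bundled scripts. CLAUDE_CONFIG_DIR and ANTHROPIC_SMALL_FAST_MODEL are named in the text above; the use of ANTHROPIC_BASE_URL, ANTHROPIC_AUTH_TOKEN, and ANTHROPIC_MODEL, along with the endpoint URL and model name in the example, are assumptions to verify against the Claude Code and provider documentation.

```python
import os

def byob_env(api_key, base_url, model, config_dir, base_env=None):
    """Return a copy of the environment with every model slot pinned to `model`."""
    env = dict(os.environ if base_env is None else base_env)
    env.update({
        "ANTHROPIC_BASE_URL": base_url,       # assumed: provider's Anthropic-compatible endpoint
        "ANTHROPIC_AUTH_TOKEN": api_key,      # assumed: provider key passed as the auth token
        "ANTHROPIC_MODEL": model,             # assumed: main model slot
        "ANTHROPIC_SMALL_FAST_MODEL": model,  # background slot, so nothing falls back silently
        "CLAUDE_CONFIG_DIR": config_dir,      # isolated config, separate from your normal setup
    })
    return env

# Example values are placeholders, not verified settings.
env = byob_env(
    api_key="sk-example",
    base_url="https://api.deepseek.com/anthropic",
    model="deepseek-chat",
    config_dir="/tmp/claude-deepseek",
    base_env={"PATH": "/usr/bin"},
)
# A launcher would then start the CLI with this environment, for example:
#   subprocess.run(["claude"], env=env)
```

Building a fresh dict instead of mutating os.environ keeps the alternate backend scoped to the child process.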
Quick Start
# Install
pip install aibrain
# Initialize your brain
aibrain init
# Talk to Lucy on the provider of your choice (one line, no config edit)
lucy --provider openrouter --model google/gemini-2.5-flash # 300+ models, one key
lucy --provider gemini --model gemini-2.5-flash # Google free tier
lucy --provider claude_cli # Subscription Claude
lucy --provider ollama --model llama3.2:3b # Local, no key
# Start the dashboard server
aibrain serve
# Open the dashboard
open http://localhost:3000
Your agent now has persistent memory. Every conversation, every workflow, every decision is stored and retrievable. Run aibrain dream weekly to consolidate patterns and improve retrieval.
Persistent default: Plain lucy (no flags) reads ~/.aibrain/config.json. Set default_provider + default_model there, or use the in-session /switch command, to make your choice stick across launches. CLI flags always win for the current launch.
Benchmark Results
AIBrain's SelRoute retrieval system has been evaluated on 62,792+ instances across multiple benchmarks. All results are from verified JSON files in the benchmarks/ directory.
LongMemEval M (500 instances)
| System | Parameters | Ra@5 | NDCG@5 |
|---|---|---|---|
| SelRoute bge-base (metadata routing) | 109M | 0.79 | 0.812 |
| SelRoute bge-small (metadata routing) | 33M | 0.786 | 0.718 |
| SelRoute FTS5 (zero-ML, zero-GPU) | 0 | 0.745 | 0.692 |
| all-MiniLM-L6-v2 | 22M | 0.785 | 0.717 |
LongMemEval S (500 instances)
| System | Parameters | Ra@5 |
|---|---|---|
| SelRoute bge-base | 109M | 0.920 |
| SelRoute Oracle | — | 0.992 |
MSDialog (2,199 tech-support dialogues)
| System | Parameters | Ra@5 |
|---|---|---|
| SelRoute MiniLM | 22M | 0.998 |
LoCoMo (1,986 QA pairs)
| System | Parameters | Recall@5 | Ra@5 |
|---|---|---|---|
| SelRoute FTS5 (zero-ML) | 0 | 0.859 | 0.767 |
QReCC (52,678 conversational queries)
| System | Parameters | MRR (%) |
|---|---|---|
| SelRoute FTS5+reasoning | 0 | 51.66 |
LMEB dialogue (840 instances)
| System | Parameters | NDCG@5 |
|---|---|---|
| SelRoute FTS5 (zero-ML) | 0 | 0.971 |
Key findings:
- A 22MB MiniLM model achieves Ra@5 = 0.785 on LongMemEval M — competitive retrieval with a model that fits in RAM on any device.
- A zero-parameter FTS5 retriever achieves NDCG@5 = 0.971 on LMEB dialogue — no neural parameters, no GPU, no training data.
- All benchmarks run on a consumer laptop. No GPU required.
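For readers unfamiliar with the metrics in the tables above, the sketch below shows common binary-relevance definitions. It assumes Ra@k means "recall-any at k" (a hit if at least one relevant item is in the top k); this page does not define the abbreviation, so treat that reading as an assumption. The benchmark's own scoring scripts remain the reference.

```python
import math

def recall_any_at_k(ranked, relevant, k=5):
    """1.0 if any relevant id appears in the top k, else 0.0 (assumed meaning of Ra@k)."""
    return 1.0 if any(d in relevant for d in ranked[:k]) else 0.0

def recall_at_k(ranked, relevant, k=5):
    """Fraction of relevant ids found in the top k."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & set(relevant)) / len(relevant)

def ndcg_at_k(ranked, relevant, k=5):
    """Binary-relevance NDCG: discounted gain divided by the ideal ordering's gain."""
    dcg = sum(1.0 / math.log2(i + 2) for i, d in enumerate(ranked[:k]) if d in relevant)
    idcg = sum(1.0 / math.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / idcg if idcg else 0.0

def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant id; MRR is the mean of this over all queries."""
    for rank, d in enumerate(ranked, start=1):
        if d in relevant:
            return 1.0 / rank
    return 0.0

# Hypothetical single query: relevant items land at ranks 2 and 4.
ranked = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d2"}
scores = (
    recall_any_at_k(ranked, relevant),
    recall_at_k(ranked, relevant),
    ndcg_at_k(ranked, relevant),
    reciprocal_rank(ranked, relevant),
)
```

Each reported number is the mean of the per-query value over the benchmark's instances. NDCG rewards placing relevant items higher, so two systems with the same recall can differ noticeably on NDCG@5.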
Architecture
Complementary Learning Systems (CLS)
AIBrain implements a dual-system memory architecture inspired by the mammalian brain:
- Hippocampal fast encoding. Every session writes immediately to short-term memory. No indexing delay, no batch processing. Your agent remembers what just happened.
- Neocortical consolidation. A weekly aibrain dream cycle slow-extracts patterns from accumulated sessions, upgrades routing weights, and consolidates long-term knowledge. The brain gets measurably better at subsequent tasks.
- SelRoute routing. The SelRoute system (arXiv:2604.02431) routes each query to the optimal retrieval strategy (dense embedding, sparse FTS5, or hybrid) based on query characteristics. This is what enables a 22MB model to match 1.5B-parameter systems.
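To give a feel for what "routing by query characteristics" can mean, here is a deliberately simple toy router. The rules are invented for this example and are not taken from SelRoute or its paper; they only show the shape of the decision (sparse for exact identifiers, dense for longer natural-language questions, hybrid otherwise).

```python
import re

QUESTION_WORDS = {"how", "why", "what", "when", "who", "where", "which"}

def route_query(query):
    """Return 'sparse', 'dense', or 'hybrid' from surface features of the query."""
    tokens = query.lower().split()
    if not tokens:
        return "hybrid"
    # Quoted phrases or identifier-like tokens (digits, underscores, slashes)
    # suggest the user wants an exact match, which keyword search handles well.
    if '"' in query or any(re.search(r"[0-9_/]", t) for t in tokens):
        return "sparse"
    # Longer natural-language questions tend to benefit from semantic matching.
    if tokens[0] in QUESTION_WORDS and len(tokens) >= 6:
        return "dense"
    return "hybrid"

routes = [
    route_query("error code E1234 in fleet_comms.db"),
    route_query("how did we decide to handle customer refunds last quarter"),
    route_query("refund policy notes"),
]
```

A real router would presumably use richer signals, such as the routing weights that the consolidation cycle is said to update, but the interface is the same: query in, strategy out.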
Boss Agent
Multi-agent orchestration with one orchestrator and multiple isolated workers sharing a single brain. Each worker has its own context, memory, and tool access, but all share the same persistent knowledge base.
Companies / RBAC
Full organizational hierarchy — agents, tasks, roles, and approval flows. Manage team access, delegate tasks, and enforce governance policies.
Brain Marketplace
Share or sell trained brains via git. Export your brain, push it to a repository, and let others import it. Brains carry learned patterns, routing weights, and consolidated knowledge.
Satellite DBs
Federated search across multiple brain instances. Query one brain and get results from all connected brains.
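One way to picture federated search is a fan-out followed by a merge: ask each brain for its top results, then keep the best overall. The sketch below assumes each brain exposes a search callable returning (doc_id, score) pairs with scores on a comparable scale. Both the interface and the brain names are hypothetical, not AIBrain's satellite API.

```python
import heapq

def federated_search(brains, query, k=5):
    """Query every brain and return the k best (brain, doc_id, score) hits overall."""
    hits = []
    for name, search in brains.items():
        for doc_id, score in search(query, k):
            hits.append((score, name, doc_id))
    # nlargest sorts by score first; name and doc_id only break exact ties.
    best = heapq.nlargest(k, hits)
    return [(name, doc_id, score) for score, name, doc_id in best]

# Stand-in brains that return fixed results regardless of the query.
brains = {
    "work": lambda query, k: [("w1", 0.9), ("w2", 0.4)],
    "home": lambda query, k: [("h1", 0.7)],
}
merged = federated_search(brains, "quarterly report", k=2)
```

If the brains score on different scales, merging by rank (for example with reciprocal rank fusion) is usually safer than comparing raw scores.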
Pricing
| Tier | Price | Features |
|---|---|---|
| Free | $0 | Unlimited local usage. All features. No cloud dependency. |
| Pro | $9.95/mo | Priority support, early access to new features, cloud sync. |
| Team | $29.95/mo | Everything in Pro, plus RBAC, audit logs, dedicated support. |
All tiers include the same core AIBrain software. The difference is support level and cloud features.
CLI Entrypoints
- aibrain - Main CLI
- aibrain-server - Start the backend server
- aibrain-mcp - MCP server
- aibrain-compress - SelRoute compression library (50-99% token savings on git/build/test output)
- aibrain-settings - Configure AIBrain
- aibrain-demo - Run a demo
License
Proprietary. See LICENSE file for details.
Contributing
See CONTRIBUTING.md for development setup and contribution guidelines.
File details
Details for the file aibrain-1.8.10-py3-none-any.whl.
File metadata
- Download URL: aibrain-1.8.10-py3-none-any.whl
- Upload date:
- Size: 4.2 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | dc164bd69f12b8ab6bf18246923c681ef1026a8b739307c716e444f4d64cac16 |
| MD5 | bf6e9ca445898d1430cfddc423345945 |
| BLAKE2b-256 | 36eb743d585fad1a0fbfb3a5ac13a8c5da29cca654d5c99a739ba50554b9f381 |