AI agent brain with memory, teams, flows, document ingestion, and MCP — your agent, but better every day

AIBrain — Self-compounding agent substrate with persistent memory and multi-agent fleet

79% recall on LongMemEval M with a 109M-parameter model. 99.8% on MSDialog. Zero-parameter FTS5 achieves 97.1% NDCG@5 on dialogue retrieval. All on a consumer laptop, no GPU required. One install. 100 workflows. Agent teams. Flow engine. Document ingestion. Universal MCP. Dual-system memory that compounds across sessions. Runs locally, no cloud lock-in.

AIBrain is a self-hosted memory substrate and fleet orchestration layer for AI agents. It gives any agent persistent memory, typed task composition, and inter-agent communication — designed to install alongside agents like Hermes, not replace them. The architectural bet: compounding comes from a fleet of cooperating agents, not from one agent alone.

  • Shipped: Dual-system memory (CLS), 100 workflows, multi-model routing, 68-page dashboard, MCP server, inter-agent comms backbone (CloudEvents pub/sub), agent spawn infrastructure.
  • Specced, filed, targeted v1.7-v1.9: Cross-agent handoff verification loop, agent skill marketplace integration, fleet compounding demonstrations.
  • Design target v2.0: Self-compounding fleet.

AIBrain is a self-hosted operating system for AI agents. It gives any agent persistent memory, typed Agent/Task/Team composition, a decorator-driven Flow engine, document ingestion, universal MCP client connectivity, a reactive workflow engine, a Complementary Learning Systems (CLS) cognitive substrate with a weekly consolidation cycle, multi-model LLM routing, an approval queue, inter-agent messaging, and 100 ready-to-run workflows — all behind a 68-page Next.js dashboard. Deploy it on a laptop, a VPS, or in Docker; your agent carries its entire brain with it.


Why AIBrain?

Most AI memory systems are toys. They store everything, retrieve nothing useful, and require expensive GPUs to run. AIBrain is different:

  • Verified retrieval performance. On LongMemEval M (500 instances, the standard benchmark for long-term conversational memory), AIBrain's SelRoute system achieves Ra@5 = 0.79 with a 109M bge-base model (0.7915 unrounded), beating the strongest published baseline (Contriever + LLM fact keys, 0.762) by +0.029 on recall and +0.180 on NDCG@5. A 22MB MiniLM model achieves Ra@5 = 0.785, statistically equivalent to models 50% larger. The zero-parameter FTS5 baseline (no trainable parameters, no GPU) achieves NDCG@5 = 0.692 on LongMemEval M, exceeding every published system including 1.5B-parameter models.
  • Near-perfect on domain-specific retrieval. On MSDialog (2,199 tech-support dialogues), AIBrain achieves Ra@5 = 0.998 with a 22MB MiniLM model — near-perfect retrieval for technical support contexts.
  • Zero-parameter dialogue retrieval. On LMEB dialogue (840 instances), the FTS5 zero-ML retriever achieves NDCG@5 = 0.971 — no neural parameters, no GPU, no training data.
  • Total evaluation instances: 62,792+. Every number is from verified JSON files in the benchmarks/ directory. The methodology is described in the peer-reviewed SelRoute paper (McKee, 2026, arXiv:2604.02431).
  • All benchmarks run on a consumer laptop. No GPU required. No cloud credits. No special hardware.

The secret is the CLS architecture: a Complementary Learning Systems dual-system memory inspired by the mammalian brain. Every session writes to fast hippocampal memory. A weekly aibrain dream consolidation cycle slowly extracts patterns and upgrades routing weights. The brain gets measurably better at subsequent tasks rather than merely storing more.


What's New in v1.8.8

  • Memory recall ranking - precision restored and hardened. The semantic retrieval channel was mis-ranking known-item queries because the salience re-rank was a no-op on the hybrid/RRF routes. Fixed by fusing the FTS/vector scores through RRF and applying exact distinctive-term overlap as the decisive signal for keyword/known-item queries. Measured on the recall eval (n=20, seed=42): MRR@5 0.71 -> 0.875, hit@1 0.55 -> 0.85. A permanent regression test now guards the floor.
  • Run Claude Code on an alternate backend. New bundled launchers (aibrain/integrations/claude_code_byob/) run the Claude Code CLI against DeepSeek or MiniMax over their Anthropic-compatible endpoints, with every model slot pinned (including the background small/fast model). Also released standalone at github.com/sindecker/claude-code-byob (MIT).
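The ranking fix above combines two ideas: Reciprocal Rank Fusion (RRF) over the FTS and vector rankings, and term overlap as the deciding signal for known-item queries. The following is a minimal sketch of that combination, not AIBrain's actual code. The function names, the constant k=60 (a commonly used RRF default), and the use of plain term overlap without any "distinctive term" weighting are all assumptions made for illustration.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of doc ids with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return dict(scores)


def term_overlap(query, text):
    """Fraction of query terms that appear verbatim in the text."""
    q_terms = set(query.lower().split())
    if not q_terms:
        return 0.0
    return len(q_terms & set(text.lower().split())) / len(q_terms)


def rerank(query, docs, fts_ranking, vector_ranking, k=60):
    """Order doc ids by term overlap first, fused RRF score second."""
    fused = rrf_fuse([fts_ranking, vector_ranking], k=k)
    return sorted(
        fused,
        key=lambda d: (term_overlap(query, docs[d]), fused[d]),
        reverse=True,
    )


# Hypothetical memories and channel rankings, for illustration only.
docs = {
    "a": "tool timeout raised to 1800 seconds",
    "b": "weekly dream consolidation cycle",
    "c": "timeout settings for the dashboard",
}
order = rerank("tool timeout", docs,
               fts_ranking=["a", "c"], vector_ranking=["c", "a", "b"])
print(order)  # ['a', 'c', 'b']
```

Here "a" and "c" receive the same fused score, so the overlap term is what places the exact match first.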

Previously in v1.8.7:

  • Tool timeout lifted 30s -> 1800s (env-configurable via AIBRAIN_HANDS_TOOL_TIMEOUT) - removes the single biggest source of mid-task drop-outs.
  • DB write-lock relief - high-write tables split onto a separate fleet_comms.db; memory-consolidation clustering rewritten for large (50K+ memory) brains.
  • Liaison review falls back to DeepSeek when the primary reviewer errors; goal fan-out orchestrator added; self-imposed DeepSeek output cap removed.
  • New: GET /api/memory/list, an aibrain doctor diagnostic verb, and a Tauri 2.x native desktop scaffold (M0).
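As an illustration of the env-configurable timeout in the v1.8.7 notes, a tool runner could read AIBRAIN_HANDS_TOOL_TIMEOUT roughly as follows. This is a sketch under assumptions: the helper name is hypothetical, and falling back to the 1800-second default on unparsable or non-positive values is a choice made here, not documented AIBrain behavior.

```python
import os

DEFAULT_TOOL_TIMEOUT_S = 1800.0  # default stated in the v1.8.7 notes


def tool_timeout_seconds(env=None):
    """Return the tool timeout in seconds, falling back to the default.

    The fallback on bad values is an assumption of this sketch.
    """
    env = os.environ if env is None else env
    raw = env.get("AIBRAIN_HANDS_TOOL_TIMEOUT")
    if raw is None:
        return DEFAULT_TOOL_TIMEOUT_S
    try:
        value = float(raw)
    except ValueError:
        return DEFAULT_TOOL_TIMEOUT_S
    return value if value > 0 else DEFAULT_TOOL_TIMEOUT_S
```

The returned value can be passed straight to `subprocess.run(cmd, timeout=tool_timeout_seconds())`.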

What's New in v1.8.5

  • Lucy CLI: paste-fragmentation root cause fix. prompt_toolkit 3.0.52 rejected the enable_bracketed_paste kwarg with a TypeError, which a bare except swallowed, hiding paste corruption during streaming for months. Removed the kwarg; the bracketed-paste keybinding handles paste detection on its own. Typing while Lucy is streaming a reply no longer corrupts the input buffer.
  • Captcha solver layer + default-allow policy. New aibrain/lucy/captcha/ module with stealth-first detection (recaptcha v2/v3, Turnstile, hCaptcha) and pluggable solvers (CDP attach → CapSolver REST fallback → manual-Telegram refuse). Policy default: solve on any host that hits a captcha. Hard-deny mechanism is preserved as opt-in but empty by default — LinkedIn / Indeed / Reddit job-search flows just work.
  • /help, /version, /exit commands. /help lists all 21 wired commands grouped by category. /version shows aibrain.__version__, current git HEAD, and the loaded module path (catches stale-daemon issues fast). /exit prints a one-line session summary on quit.
  • Session save / load / memory-recall failures now surface visibly. Three formerly-silent except Exception: pass swallows were replaced with logged warnings at _log_tool level 1.
  • /switch <provider> works in non-TTY contexts. Tests, scripts, and autonomous /goal runs no longer hang on empty stdin — single-provider arg now picks the registry-default model.
  • 43 new pytests. Coverage for captcha policy (12), captcha solver (15), and agent_listener.py (16). All PASS. The invariant proven for agent_listener.on_task_assigned: one bad event must NOT kill the autonomous listener loop.
  • higgsfield_generate tool. Lucy can drive a CDP-attached Chrome session to generate Higgsfield images/videos.
  • Display-clobber on streaming output: fixed. _streaming_output_context() is finally wired into the stream path so prompt_toolkit.patch_stdout keeps tokens above the input buffer instead of into it.
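Two of the items above describe the same pattern: replace a silent `except Exception: pass` with a logged warning, and make sure one bad event cannot end the listener loop. A minimal sketch of that pattern follows. The loop, the logger name, and the handler body are hypothetical and are not taken from agent_listener.py.

```python
import logging

logger = logging.getLogger("listener_sketch")


def run_listener(events, handler):
    """Dispatch each event; log and skip failures instead of stopping.

    Returns (handled, failed) counts so callers and tests can confirm
    that one bad event did not end the loop.
    """
    handled = failed = 0
    for event in events:
        try:
            handler(event)
            handled += 1
        except Exception:
            # Instead of a silent `pass`, record the failure with its
            # traceback and continue with the next event.
            logger.warning("handler failed for event %r", event, exc_info=True)
            failed += 1
    return handled, failed


def on_task_assigned(event):
    # Hypothetical handler: rejects events that carry no task id.
    if "task_id" not in event:
        raise ValueError("missing task_id")


result = run_listener([{"task_id": 1}, {}, {"task_id": 2}], on_task_assigned)
print(result)  # (2, 1)
```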

Previously in v1.8.1: Multi-agent fleet infrastructure (4 keystone specs, 7 agents, cross-agent handoff 5/5), SA-2026-002 security audit (32 fixes / 29 commits), Lucy autonomy with multi-root path confinement.

Install

Pick the path that matches your environment. All paths install the same package from PyPI.

One-line installer (macOS / Linux / WSL)

curl -sSL https://myaibrain.org/install | sh

Creates an isolated venv at ~/.aibrain/venv, pip-installs aibrain, and symlinks the CLI into /usr/local/bin (or ~/.local/bin fallback). Re-run any time to upgrade. Python 3.10+ required.

One-line installer (Windows PowerShell)

irm https://myaibrain.org/install.ps1 | iex

Creates an isolated venv at %USERPROFILE%\.aibrain\venv, pip-installs aibrain, and adds the venv Scripts dir to your user PATH. Python 3.10+ required.

Homebrew (macOS / Linux)

brew tap sindecker/tap
brew install aibrain

Installs into a Homebrew-managed venv and symlinks the CLI.

pip (any platform)

pip install aibrain

Docker

docker pull sindecker/aibrain:latest

Telegram Quickstart (BotFather)

Lucy can run as a Telegram bot you can DM.

  1. Open Telegram, message @BotFather, send /newbot, choose a name and username. BotFather returns an HTTP API token — copy it.

  2. Set the bot token as an environment variable:

    • Linux/macOS: export LUCY_TELEGRAM_TOKEN=<your-token>
    • Windows Command Prompt: set LUCY_TELEGRAM_TOKEN=<your-token>
    • PowerShell: $env:LUCY_TELEGRAM_TOKEN="<your-token>"
  3. Set your DeepSeek API key (DeepSeek is currently the proven provider for Lucy; broader provider support is in progress):

    • Linux/macOS: export DEEPSEEK_API_KEY=<your-key>
    • Windows Command Prompt: set DEEPSEEK_API_KEY=<your-key>
    • PowerShell: $env:DEEPSEEK_API_KEY="<your-key>"

    Alternative: OpenRouter (one key, 300+ models). OpenRouter proxies Anthropic, Google Gemini, xAI Grok, Meta Llama, DeepSeek, OpenAI and others behind a single key. Get one at https://openrouter.ai/keys, then:

    • Linux/macOS: export OPENROUTER_API_KEY=<your-key>
    • Windows Command Prompt: set OPENROUTER_API_KEY=<your-key>
    • PowerShell: $env:OPENROUTER_API_KEY="<your-key>"

    Run Lucy against any OpenRouter-hosted model by passing the slug:

    lucy --provider openrouter --model google/gemini-2.5-flash
    lucy --provider openrouter --model anthropic/claude-sonnet-4-5
    lucy --provider openrouter --model x-ai/grok-2
    

    Alternative: direct Google Gemini (free tier). Google AI Studio gives away a generous free tier — 1500 requests/day on gemini-2.5-flash and 50 requests/day on gemini-2.5-pro at zero cost. Get a free key at https://aistudio.google.com/app/apikey, then:

    • Linux/macOS: export GEMINI_API_KEY=<your-key>
    • Windows Command Prompt: set GEMINI_API_KEY=<your-key>
    • PowerShell: $env:GEMINI_API_KEY="<your-key>"

    Run Lucy on Gemini directly:

    lucy --provider gemini --model gemini-2.5-flash
    lucy --provider gemini --model gemini-2.5-pro
    
  4. Start the Telegram poller:

    python -m aibrain.lucy_telegram
    
  5. DM your new bot on Telegram. Messages will route through the real Lucy agent.

Run Lucy in the terminal instead

Pick a provider for THIS launch with a single flag — no config edit required:

lucy --provider openrouter --model google/gemini-2.5-flash    # OpenRouter (300+ models behind one key)
lucy --provider gemini --model gemini-2.5-flash               # Direct Google Gemini (free tier)
lucy --provider claude_cli                                     # Subscription Claude (Claude Code CLI)
lucy --provider deepseek                                       # DeepSeek API
lucy --provider openai --model gpt-4o                          # OpenAI API
lucy --provider anthropic --model claude-sonnet-4-6            # Anthropic API
lucy --provider ollama --model llama3.2:3b                     # Local Ollama (no key)

--provider accepts: claude_cli, openai, deepseek, ollama, anthropic, openrouter, gemini. --model is optional — a sensible default is chosen per provider when omitted. The flag overrides ~/.aibrain/config.json for that launch only.

Plain lucy (no flags) falls through to the existing config-based detection: the provider in ~/.aibrain/config.json, otherwise the first matching *_API_KEY env var. To make a choice persistent, set default_provider and default_model in ~/.aibrain/config.json or pick once via the in-session /switch command.
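The precedence described above (CLI flag, then ~/.aibrain/config.json, then the first matching *_API_KEY env var) can be sketched as below. This is an illustration, not Lucy's implementation. The function names are hypothetical, and the order in which env vars are checked is an assumption, since the README only says "first matching".

```python
import json
from pathlib import Path

# Env var names as given in this README. The checking order is an
# assumption of this sketch.
ENV_KEYS = {
    "DEEPSEEK_API_KEY": "deepseek",
    "OPENAI_API_KEY": "openai",
    "ANTHROPIC_API_KEY": "anthropic",
    "OPENROUTER_API_KEY": "openrouter",
    "GEMINI_API_KEY": "gemini",
}


def load_config(path=None):
    """Read config.json, returning {} if it is missing or invalid."""
    path = Path(path) if path else Path.home() / ".aibrain" / "config.json"
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return {}


def resolve_provider(cli_provider=None, cli_model=None, config=None, env=None):
    """Return (provider, model); model None means "use the provider default"."""
    config = config or {}
    env = env or {}
    if cli_provider:  # 1. the flag wins for this launch
        return cli_provider, cli_model
    if config.get("default_provider"):  # 2. persistent default
        return config["default_provider"], config.get("default_model")
    for name, provider in ENV_KEYS.items():  # 3. first matching key
        if env.get(name):
            return provider, None
    return None, None
```

A launcher would call `resolve_provider(args.provider, args.model, load_config(), os.environ)` and report an error when the provider comes back as None.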

For autonomous goal-driven runs, start an interactive session and use the /loop command:

/loop --project <name> --goal "<goal>"

Note: Lucy does not accept a goal as a command-line argument. Always use the /loop command inside the interactive session for autonomous tasks.


Provider Configuration

Lucy speaks to 7 LLM providers behind a single --provider flag. Pick any one at install-time, swap mid-session with /switch, or set a persistent default in ~/.aibrain/config.json. No code edits required.

Provider matrix

| Provider | Auth Method | Env Var | Default Model | Notes |
| --- | --- | --- | --- | --- |
| claude_cli | Claude Code subscription | (none; uses local CLI binary) | claude-opus-4-7 | Recommended for most users: no key needed, charged against your Claude subscription |
| openai | OpenAI API key | OPENAI_API_KEY | gpt-4o | Standard OpenAI billing |
| deepseek | DeepSeek API key | DEEPSEEK_API_KEY | deepseek-chat | Cheapest paid option; default for Lucy (proven) |
| ollama | Local, no auth | (none; local daemon) | llama3.2:3b | Zero cost, runs on your machine; ollama serve must be running |
| anthropic | Anthropic API key | ANTHROPIC_API_KEY | claude-sonnet-4-5 | Paid Anthropic API (separate from Claude Code subscription) |
| openrouter | OpenRouter API key | OPENROUTER_API_KEY | google/gemini-2.5-flash | One key, 300+ models (Anthropic, Google, xAI, Meta, DeepSeek, OpenAI…) |
| gemini | Google AI Studio key | GEMINI_API_KEY | gemini-2.5-flash | Free tier: 1500 req/day on gemini-2.5-flash, 50 req/day on gemini-2.5-pro |

One-line CLI examples

lucy --provider claude_cli                                      # your Claude subscription
lucy --provider openai --model gpt-4o                           # OpenAI API
lucy --provider deepseek                                        # DeepSeek API (default)
lucy --provider ollama --model llama3.2:3b                      # local Ollama
lucy --provider anthropic --model claude-sonnet-4-5             # Anthropic paid API
lucy --provider openrouter --model google/gemini-2.5-flash      # 300+ models via one key
lucy --provider gemini --model gemini-2.5-flash                 # Google free tier

Env var setup — newer providers

OpenRouter (one key, 300+ models — get yours at https://openrouter.ai/keys):

# Linux/macOS
export OPENROUTER_API_KEY=<your-key>
# Windows PowerShell
$env:OPENROUTER_API_KEY="<your-key>"
:: Windows CMD
set OPENROUTER_API_KEY=<your-key>

Google Gemini (free tier — get yours at https://aistudio.google.com/app/apikey):

# Linux/macOS
export GEMINI_API_KEY=<your-key>
# Windows PowerShell
$env:GEMINI_API_KEY="<your-key>"
:: Windows CMD
set GEMINI_API_KEY=<your-key>

The remaining providers follow the same pattern with OPENAI_API_KEY, DEEPSEEK_API_KEY, and ANTHROPIC_API_KEY respectively. claude_cli and ollama need no env var — they read from local subscription state or a local daemon.

In-session commands

Inside an interactive lucy session:

  • /switch <provider> [<model>] — swap provider/model mid-conversation, no restart needed
  • /providers — list available providers detected from your env / config
  • /pricing — show current per-token cost for the active provider/model
  • /loop --project <name> --goal "<goal>" — kick off an autonomous goal-driven run

Cost visibility

Run aibrain usage to see per-provider spend across the last 30 days — broken down by provider, model, and call count. Use this to spot which provider is eating your budget and /switch to a cheaper one mid-session if needed.
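To make the breakdown concrete, here is a sketch of the kind of aggregation such a report implies: totals per provider and model over a 30-day window, most expensive first. The record shape (provider, model, cost_usd, timestamp) is hypothetical; the README does not describe how aibrain usage stores or computes its numbers.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def summarize_usage(records, now=None, days=30):
    """Total cost and call count per (provider, model) within the window.

    Each record is a hypothetical dict with "provider", "model",
    "cost_usd", and a timezone-aware "timestamp".
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    totals = defaultdict(lambda: {"calls": 0, "cost_usd": 0.0})
    for rec in records:
        if rec["timestamp"] < cutoff:
            continue  # older than the reporting window
        bucket = totals[(rec["provider"], rec["model"])]
        bucket["calls"] += 1
        bucket["cost_usd"] += rec["cost_usd"]
    # Most expensive first, so the biggest spender is at the top.
    return sorted(totals.items(), key=lambda kv: kv[1]["cost_usd"], reverse=True)


utc = timezone.utc
now = datetime(2026, 1, 31, tzinfo=utc)
records = [
    {"provider": "deepseek", "model": "deepseek-chat", "cost_usd": 0.02,
     "timestamp": datetime(2026, 1, 30, tzinfo=utc)},
    {"provider": "deepseek", "model": "deepseek-chat", "cost_usd": 0.03,
     "timestamp": datetime(2026, 1, 29, tzinfo=utc)},
    {"provider": "openai", "model": "gpt-4o", "cost_usd": 0.50,
     "timestamp": datetime(2026, 1, 20, tzinfo=utc)},
    {"provider": "openai", "model": "gpt-4o", "cost_usd": 9.00,
     "timestamp": datetime(2025, 12, 1, tzinfo=utc)},  # outside the window
]
summary = summarize_usage(records, now=now)
for (provider, model), stats in summary:
    print(provider, model, stats["calls"], round(stats["cost_usd"], 2))
```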

Run Claude Code itself on an alternate backend

The provider settings above choose which LLM Lucy/AIBrain talks to. If you instead want to run the Claude Code CLI itself against a non-Anthropic model, the bundled launchers in aibrain/integrations/claude_code_byob/ point Claude Code at DeepSeek's or MiniMax's Anthropic-compatible endpoint:

# DeepSeek (set DEEPSEEK_API_KEY, or put it in a .env next to the launcher)
./aibrain/integrations/claude_code_byob/claude-deepseek.sh
# MiniMax (set MINIMAX_API_KEY)
./aibrain/integrations/claude_code_byob/claude-minimax.sh

Each runs in an isolated CLAUDE_CONFIG_DIR and pins every model slot — including the background ANTHROPIC_SMALL_FAST_MODEL — so no request silently falls back to a cheaper tier. Windows .bat equivalents are included. Standalone/public mirror: github.com/sindecker/claude-code-byob (MIT).
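The launcher scripts themselves are not reproduced here. As a rough sketch of what "pinning every model slot" could look like, the function below builds an environment for the Claude Code CLI. Apart from CLAUDE_CONFIG_DIR, ANTHROPIC_SMALL_FAST_MODEL, and DEEPSEEK_API_KEY, which appear above, everything is an assumption: the function name, the ANTHROPIC_BASE_URL, ANTHROPIC_AUTH_TOKEN, and ANTHROPIC_MODEL variables, the endpoint URL, and the model name should all be checked against the launcher and the provider's current documentation.

```python
import os


def deepseek_claude_env(api_key, config_dir, model="deepseek-chat", base_env=None):
    """Build an environment that points Claude Code at DeepSeek.

    Variable names other than CLAUDE_CONFIG_DIR and
    ANTHROPIC_SMALL_FAST_MODEL, and the endpoint URL, are assumptions.
    """
    env = dict(os.environ if base_env is None else base_env)
    env.update({
        "CLAUDE_CONFIG_DIR": config_dir,  # isolated settings directory
        "ANTHROPIC_BASE_URL": "https://api.deepseek.com/anthropic",
        "ANTHROPIC_AUTH_TOKEN": api_key,
        "ANTHROPIC_MODEL": model,
        # Pin the background slot to the same model so no request
        # falls back to a different tier.
        "ANTHROPIC_SMALL_FAST_MODEL": model,
    })
    return env
```

A wrapper could then start the CLI with `subprocess.run(["claude"], env=deepseek_claude_env(os.environ["DEEPSEEK_API_KEY"], config_dir))`, where `config_dir` is a directory of your choosing.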


Quick Start

# Install
pip install aibrain

# Initialize your brain
aibrain init

# Talk to Lucy on the provider of your choice (one line, no config edit)
lucy --provider openrouter --model google/gemini-2.5-flash      # 300+ models, one key
lucy --provider gemini --model gemini-2.5-flash                 # Google free tier
lucy --provider claude_cli                                       # Subscription Claude
lucy --provider ollama --model llama3.2:3b                       # Local, no key

# Start the dashboard server
aibrain serve

# Open the dashboard (macOS; on Linux use xdg-open, on Windows use start)
open http://localhost:3000

Your agent now has persistent memory. Every conversation, every workflow, every decision is stored and retrievable. Run aibrain dream weekly to consolidate patterns and improve retrieval.

Persistent default: Plain lucy (no flags) reads ~/.aibrain/config.json. Set default_provider + default_model there, or use the in-session /switch command, to make your choice stick across launches. CLI flags always win for the current launch.


Benchmark Results

AIBrain's SelRoute retrieval system has been evaluated on 62,792+ instances across multiple benchmarks. All results are from verified JSON files in the benchmarks/ directory.

LongMemEval M (500 instances)

| System | Parameters | Ra@5 | NDCG@5 |
| --- | --- | --- | --- |
| SelRoute bge-base (metadata routing) | 109M | 0.79 | 0.812 |
| SelRoute bge-small (metadata routing) | 33M | 0.786 | 0.718 |
| SelRoute FTS5 (zero-ML, zero-GPU) | 0 | 0.745 | 0.692 |
| all-MiniLM-L6-v2 | 22M | 0.785 | 0.717 |

LongMemEval S (500 instances)

| System | Parameters | Ra@5 |
| --- | --- | --- |
| SelRoute bge-base | 109M | 0.920 |
| SelRoute Oracle | | 0.992 |

MSDialog (2,199 tech-support dialogues)

| System | Parameters | Ra@5 |
| --- | --- | --- |
| SelRoute MiniLM | 22M | 0.998 |

LoCoMo (1,986 QA pairs)

| System | Parameters | Recall@5 | Ra@5 |
| --- | --- | --- | --- |
| SelRoute FTS5 (zero-ML) | 0 | 0.859 | 0.767 |

QReCC (52,678 conversational queries)

| System | Parameters | MRR |
| --- | --- | --- |
| SelRoute FTS5+reasoning | 0 | 51.66 |

LMEB dialogue (840 instances)

| System | Parameters | NDCG@5 |
| --- | --- | --- |
| SelRoute FTS5 (zero-ML) | 0 | 0.971 |

Key findings:

  1. A 22MB MiniLM model achieves Ra@5 = 0.785 on LongMemEval M — competitive retrieval with a model that fits in RAM on any device.
  2. A zero-parameter FTS5 retriever achieves NDCG@5 = 0.971 on LMEB dialogue — no neural parameters, no GPU, no training data.
  3. All benchmarks run on a consumer laptop. No GPU required.
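For readers unfamiliar with the metrics in these tables, the sketch below computes NDCG@k and reciprocal rank for a single query, using the standard binary-relevance definitions. It is not the benchmark harness from the benchmarks/ directory, the function names are illustrative, and Ra@5 is left out because this README does not define it. Benchmark-level scores are typically the mean of these per-query values.

```python
import math


def ndcg_at_k(ranked_ids, relevant_ids, k=5):
    """NDCG@k with binary relevance for one query.

    Assumes ranked_ids contains no duplicates.
    """
    relevant = set(relevant_ids)
    dcg = sum(
        1.0 / math.log2(i + 1)
        for i, doc_id in enumerate(ranked_ids[:k], start=1)
        if doc_id in relevant
    )
    # Ideal ordering: all relevant items placed at the top of the list.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(i + 1) for i in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


def reciprocal_rank(ranked_ids, relevant_ids, k=5):
    """1 / rank of the first relevant item in the top k, else 0.0."""
    relevant = set(relevant_ids)
    for i, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant:
            return 1.0 / i
    return 0.0


print(round(ndcg_at_k(["a", "x", "b"], ["a", "b"]), 4))  # 0.9197
print(reciprocal_rank(["x", "a"], ["a"]))                # 0.5
```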

Architecture

Complementary Learning Systems (CLS)

AIBrain implements a dual-system memory architecture inspired by the mammalian brain:

  • Hippocampal fast encoding. Every session writes immediately to short-term memory. No indexing delay, no batch processing. Your agent remembers what just happened.
  • Neocortical consolidation. A weekly aibrain dream cycle slow-extracts patterns from accumulated sessions, upgrades routing weights, and consolidates long-term knowledge. The brain gets measurably better at subsequent tasks.
  • SelRoute routing. The SelRoute system (arXiv:2604.02431) routes each query to the optimal retrieval strategy — dense embedding, sparse FTS5, or hybrid — based on query characteristics. This is what enables a 22MB model to match 1.5B-parameter systems.
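The sparse channel mentioned above relies on SQLite's FTS5 full-text index, which needs no trained model. The following sketch shows that mechanism with Python's standard sqlite3 module. The table name, column name, and query handling are hypothetical and do not reflect AIBrain's schema or the SelRoute router. It also assumes the local SQLite build has FTS5 enabled, which is common but not guaranteed.

```python
import sqlite3


def build_index(texts):
    """Create an in-memory FTS5 index over a list of strings."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE memories USING fts5(body)")
    conn.executemany("INSERT INTO memories(body) VALUES (?)",
                     [(t,) for t in texts])
    return conn


def search(conn, query, k=5):
    """Return up to k matching rows, best BM25 match first."""
    words = query.split()
    if not words:
        return []
    # Quote each term so punctuation is not parsed as FTS5 query syntax.
    match = " OR ".join('"' + w.replace('"', '""') + '"' for w in words)
    rows = conn.execute(
        "SELECT body FROM memories WHERE memories MATCH ? "
        "ORDER BY bm25(memories) LIMIT ?",
        (match, k),
    ).fetchall()
    return [row[0] for row in rows]


conn = build_index([
    "the tool timeout was raised to 1800 seconds",
    "weekly dream cycle consolidates long-term knowledge",
    "telegram bot token setup",
])
hits = search(conn, "dream cycle")
print(hits)  # ['weekly dream cycle consolidates long-term knowledge']
```

FTS5's bm25() returns lower values for better matches, which is why the query sorts in ascending order.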

Boss Agent

Multi-agent orchestration with one orchestrator and multiple isolated workers sharing a single brain. Each worker has its own context, memory, and tool access, but all share the same persistent knowledge base.

Companies / RBAC

Full organizational hierarchy — agents, tasks, roles, and approval flows. Manage team access, delegate tasks, and enforce governance policies.

Brain Marketplace

Share or sell trained brains via git. Export your brain, push it to a repository, and let others import it. Brains carry learned patterns, routing weights, and consolidated knowledge.

Satellite DBs

Federated search across multiple brain instances. Query one brain and get results from all connected brains.


Pricing

| Tier | Price | Features |
| --- | --- | --- |
| Free | $0 | Unlimited local usage. All features. No cloud dependency. |
| Pro | $9.95/mo | Priority support, early access to new features, cloud sync. |
| Team | $29.95/mo | Everything in Pro, plus RBAC, audit logs, dedicated support. |

All tiers include the same core AIBrain software. The difference is support level and cloud features.


CLI Entrypoints

  • aibrain — Main CLI
  • aibrain-server — Start the backend server
  • aibrain-mcp — MCP server
  • aibrain-compress — SelRoute compression library (50-99% token savings on git/build/test output)
  • aibrain-settings — Configure AIBrain
  • aibrain-demo — Run a demo

License

Proprietary. See LICENSE file for details.


Contributing

See CONTRIBUTING.md for development setup and contribution guidelines.
