An AI agent OS with a 6-engine memory flywheel: knowledge that compounds, not decays.

Project description

🦴 Caveman: The Self-Evolving AI Agent Framework

An AI agent that learns, remembers, and improves itself.

Caveman is an AI agent operating system built around a 6-engine memory flywheel. Unlike agents that forget between sessions, Caveman accumulates knowledge that gets richer, more confident, and more useful over time. It audits its own code, learns skills from experience, and compiles knowledge into a structured wiki, all automatically.

What Makes Caveman Different

Most agent frameworks are static tools: you build them, they run, they forget. Caveman is a living system:

  • Self-Evolving Skills: The Reflect engine learns patterns from every completed task, creates skills, and evolves them over time
  • 4-Tier Knowledge Pyramid: A Wiki Compiler (inspired by Karpathy's LLM Wiki) distills conversations into structured, tiered knowledge
  • Knowledge Drift Detection: Detects when memories become stale or contradictory and automatically weakens outdated knowledge
  • Self-Auditing Flywheel: Audits and fixes its own code (find bugs → fix → test → commit → learn)
  • 30 Built-in Tools: From memory search to an MCP client, process management, and browser automation
  • MCP Ecosystem: Both server and client, connecting to thousands of external tools

Architecture

┌─────────────────────────────────────────────────────────────┐
│                      CAVEMAN AGENT OS                       │
│                                                             │
│  ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐  │
│  │  Agent   │   │ Engines  │   │  Memory  │   │  Tools   │  │
│  │  Loop    │──▶│ (6-core) │──▶│ (SQLite  │◀──│ (30      │  │
│  │          │   │          │   │  +FTS5)  │   │ built-in)│  │
│  └────┬─────┘   └──────────┘   └──────────┘   └──────────┘  │
│       │                                                     │
│  ┌────▼─────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐  │
│  │ Compress-│   │  Wiki    │   │ Training │   │ Gateway  │  │
│  │   ion    │   │ Compiler │   │ Pipeline │   │ (TG/DC)  │  │
│  └──────────┘   └──────────┘   └──────────┘   └──────────┘  │
│                                                             │
│  ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐  │
│  │   MCP    │   │ Security │   │  Bridge  │   │Coordinat-│  │
│  │  Server  │   │ (sandbox │   │ (Hermes/ │   │   or     │  │
│  │ +Client  │   │  +crypto)│   │ OpenClaw)│   │          │  │
│  └──────────┘   └──────────┘   └──────────┘   └──────────┘  │
└─────────────────────────────────────────────────────────────┘
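
The Memory box in the diagram is labelled SQLite + FTS5. As a rough illustration of what full-text memory search over that kind of store can look like, here is a minimal sketch using Python's standard sqlite3 module. The table name `memories`, its columns, and the helper functions are hypothetical and not taken from Caveman's code, and the sketch assumes the local SQLite build includes the FTS5 extension.

```python
import sqlite3


def open_memory(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database with one FTS5 table (hypothetical schema, not Caveman's)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS memories USING fts5(content, source)"
    )
    return conn


def store(conn: sqlite3.Connection, content: str, source: str = "chat") -> None:
    """Insert one memory row and commit it."""
    conn.execute(
        "INSERT INTO memories (content, source) VALUES (?, ?)", (content, source)
    )
    conn.commit()


def search(conn: sqlite3.Connection, query: str, limit: int = 5) -> list[str]:
    """Return matching memories, best match first (ordered by FTS5 rank)."""
    rows = conn.execute(
        "SELECT content FROM memories WHERE memories MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return [row[0] for row in rows]


conn = open_memory()
store(conn, "User prefers pytest over unittest")
store(conn, "Deploy script lives in scripts/deploy.sh")
print(search(conn, "pytest"))
```

An FTS5 virtual table keeps tokenizing and ranking inside SQLite, so a store like this needs no separate search service.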

Quick Start

pip install caveman-agent
caveman run "What files are in this directory?"

Or interactive mode:

caveman run -i

Other commands:

caveman status          # Dashboard: engines, memory, skills
caveman skills          # List learned skills
caveman flywheel        # Run self-improvement loop
caveman wiki status     # Knowledge stats per tier
caveman wiki compile    # Compile knowledge (promote + expire)
caveman audit           # Static code quality checks
caveman bench           # Memory performance benchmarks
caveman self-test       # Full lifecycle verification

Cognitive Engines

The 6 engines form a continuous learning flywheel:

Shield → Nudge → Reflect → Ripple → Lint → Recall
  ↑                                           │
  └───────────── continuous loop ─────────────┘

Engine   Purpose
Shield   Preserves conversation essence across context compressions
Nudge    Extracts new knowledge from every interaction
Reflect  Learns skills from completed tasks, evolves them over time
Ripple   Propagates knowledge updates across related memories
Lint     Detects stale or contradicted knowledge, weakens confidence
Recall   Restores relevant context at session start

A scheduler orchestrates engine execution based on priority and resource availability.
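
To make "priority and resource availability" concrete, here is a minimal sketch of one way such a scheduler could work. It is not Caveman's scheduler: the `EngineJob` class, the integer `cost` units, the `budget` parameter, and the rule that lower priority numbers run first are all assumptions made for illustration.

```python
import heapq
from dataclasses import dataclass, field
from typing import Callable


@dataclass(order=True)
class EngineJob:
    priority: int                                   # lower runs first (assumed convention)
    name: str = field(compare=False)
    cost: int = field(compare=False)                # abstract resource units (hypothetical)
    run: Callable[[], None] = field(compare=False)  # the engine's work


def schedule(jobs: list[EngineJob], budget: int) -> list[str]:
    """Run jobs in priority order, skipping any that exceed the remaining budget."""
    heap = list(jobs)
    heapq.heapify(heap)
    executed = []
    while heap:
        job = heapq.heappop(heap)
        if job.cost > budget:
            continue  # not enough resources left; a real scheduler might defer instead
        job.run()
        budget -= job.cost
        executed.append(job.name)
    return executed


jobs = [
    EngineJob(2, "Reflect", 3, lambda: None),
    EngineJob(1, "Shield", 1, lambda: None),
    EngineJob(3, "Lint", 5, lambda: None),
]
print(schedule(jobs, budget=4))  # Shield and Reflect fit; Lint is skipped
```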

Wiki Compiler

Inspired by Karpathy's LLM Wiki, the Wiki Compiler organizes all knowledge into a 4-tier pyramid:

Procedural  ← workflows and patterns (months, never expires)
Semantic    ← cross-session facts (weeks)
Episodic    ← session summaries (days)
Working     ← recent observations (hours)

Knowledge is automatically promoted upward as it proves useful and expires once it goes stale.
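
The promote-and-expire behavior can be sketched as a single pass over stored entries. This is an illustrative sketch, not the Wiki Compiler's implementation. The concrete lifetimes, the `PROMOTE_AFTER_HITS` threshold, and the `Entry` fields are assumptions; the README only states hours, days, weeks, and "never expires".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

TIERS = ["working", "episodic", "semantic", "procedural"]

# Assumed lifetimes per tier; None means the tier never expires.
TTL = {
    "working": timedelta(hours=6),
    "episodic": timedelta(days=3),
    "semantic": timedelta(weeks=4),
    "procedural": None,
}
PROMOTE_AFTER_HITS = 3  # hypothetical threshold for "proved useful"


@dataclass
class Entry:
    text: str
    tier: str
    last_used: datetime
    hits: int = 0


def compile_entries(entries: list[Entry], now: datetime) -> list[Entry]:
    """One compile pass: drop stale entries, then promote well-used ones a tier."""
    kept = []
    for entry in entries:
        ttl = TTL[entry.tier]
        if ttl is not None and now - entry.last_used > ttl:
            continue  # stale for its tier: expire it
        if entry.hits >= PROMOTE_AFTER_HITS and entry.tier != TIERS[-1]:
            entry.tier = TIERS[TIERS.index(entry.tier) + 1]
            entry.hits = 0  # usefulness must be proven again at the new tier
        kept.append(entry)
    return kept
```

In this sketch staleness is checked before promotion, so an entry that was popular long ago but has not been touched within its tier's lifetime is dropped rather than promoted.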

Tools (30)

Category    Tools
Shell       bash
Files       file_read, file_write, file_edit, file_search, file_list
Web         web_search, browser
Memory      memory_search, memory_store, memory_recent
Process     process_start, process_list, process_output, process_kill
Agent       delegate, coding_agent
Todo        todo_add, todo_list, todo_done, todo_remove
Skills      skill_list, skill_show, skill_delete
Vision      vision_describe
MCP         mcp_connect, mcp_list_tools, mcp_call, mcp_disconnect
Gateway     gateway_send, gateway_list
Checkpoint  checkpoint_save, checkpoint_restore, checkpoint_list

MCP Server

Expose Caveman's memory to any MCP-compatible agent (Claude Code, Codex, Gemini CLI):

{
  "caveman": {
    "command": "caveman",
    "args": ["mcp", "serve"]
  }
}

Tools exposed: memory_store, memory_search, memory_recall, shield_save, shield_load, reflect, skill_list, skill_get, wiki_search, wiki_context.
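
Where the entry above goes depends on the client. Many MCP clients keep server definitions under a top-level `mcpServers` key in a JSON config file, but that key name and the `add_server` helper below are assumptions; check your client's documentation. A minimal sketch for merging the entry into an existing config without disturbing other servers:

```python
import json

# The entry shown above, as a Python dict.
CAVEMAN_ENTRY = {"caveman": {"command": "caveman", "args": ["mcp", "serve"]}}


def add_server(config_text: str, key: str = "mcpServers") -> str:
    """Merge the Caveman entry into a client's JSON config and return the new text.

    `key` is the section that holds server definitions; "mcpServers" is an
    assumption and may differ between clients.
    """
    config = json.loads(config_text) if config_text.strip() else {}
    config.setdefault(key, {}).update(CAVEMAN_ENTRY)
    return json.dumps(config, indent=2)


existing = '{"mcpServers": {"other": {"command": "other-server"}}}'
print(add_server(existing))
```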

Self-Improvement Flywheel

Caveman can audit and fix its own code:

┌─────────────────────────────────────────┐
│           FLYWHEEL LOOP                 │
│                                         │
│  Discover subsystems                    │
│       ↓                                 │
│  Audit each (P0/P1/P2 findings)         │
│       ↓                                 │
│  Fix P0 + P1 issues                     │
│       ↓                                 │
│  Run tests (must pass)                  │
│       ↓                                 │
│  Commit                                 │
│       ↓                                 │
│  Record stats → learn from patterns     │
│       ↓                                 │
│  Next round                             │
└─────────────────────────────────────────┘

Run it: caveman flywheel --rounds 3 --target memory

Parallel mode: caveman flywheel --parallel --targets memory,engines,tools
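
The control flow of one round can be sketched in a few lines. This is an illustration of the loop as described, not the code behind `caveman flywheel`. The `Finding` class and the `audit`, `fix`, `tests_pass`, and `commit` callables are hypothetical stand-ins for whatever the real subsystems do.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Finding:
    severity: str  # "P0", "P1", or "P2"
    summary: str


def run_round(
    targets: list[str],
    audit: Callable[[str], list[Finding]],
    fix: Callable[[Finding], None],
    tests_pass: Callable[[], bool],
    commit: Callable[[str], None],
) -> dict[str, int]:
    """One flywheel round: audit, fix P0/P1, test, commit. Returns simple stats."""
    stats = {"found": 0, "fixed": 0, "committed": 0}
    for target in targets:
        findings = audit(target)
        stats["found"] += len(findings)
        urgent = [f for f in findings if f.severity in ("P0", "P1")]
        for finding in urgent:
            fix(finding)
        # Commit only when something changed and the tests are green.
        # A real loop would also need to revert failed fixes; omitted here.
        if urgent and tests_pass():
            commit(f"flywheel: fix {len(urgent)} issue(s) in {target}")
            stats["fixed"] += len(urgent)
            stats["committed"] += 1
    return stats
```

The returned stats correspond to the "Record stats" step: they are what a later round could learn from.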

Standing on Giants

Caveman incorporates battle-tested patterns from open-source projects:

  • Hermes (MIT): Compression, retrieval, error classification, credential management
  • OpenClaw (MIT): Compaction safeguards, identifier preservation
  • Karpathy's LLM Wiki: Wiki compilation pattern
  • Memento-Skills: Reflect-Write skill evolution

Stats

  • 159 Python files (core)
  • 24,600+ lines of code
  • 1,253 tests (unit + integration)
  • 20 subsystems
  • 30 tools
  • 6 cognitive engines
  • 170 commits
  • Self-audited through 94 rounds

License

MIT

Download files

Source Distribution

caveman_agent-0.3.0.tar.gz (455.3 kB)

Uploaded Source

Built Distribution

caveman_agent-0.3.0-py3-none-any.whl (398.1 kB)

Uploaded Python 3

File details

Details for the file caveman_agent-0.3.0.tar.gz.

File metadata

  • Download URL: caveman_agent-0.3.0.tar.gz
  • Upload date:
  • Size: 455.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.9

File hashes

Hashes for caveman_agent-0.3.0.tar.gz

Algorithm    Hash digest
SHA256       c867289621e6423ab70fee8501e3ffa00be7c81ae662b4e6bb0f73f705e94bec
MD5          d7a8fcac5cc8c5c2dee53f1d79302f14
BLAKE2b-256  aca1df9fbc10b02d5d37ce60455c9b86fcec9d0d9dd7d4dc0fea9d01bb827356
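
A downloaded archive can be checked against the SHA256 digest listed above with Python's standard hashlib module. This is a minimal sketch; the helper names `sha256_of` and `verify` are illustrative, and the file path is whatever location you saved the download to.

```python
import hashlib

# SHA256 digest listed above for caveman_agent-0.3.0.tar.gz
EXPECTED_SHA256 = "c867289621e6423ab70fee8501e3ffa00be7c81ae662b4e6bb0f73f705e94bec"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Return True when the file's SHA256 matches the expected hex digest."""
    return sha256_of(path) == expected.lower()
```

For example, `verify("caveman_agent-0.3.0.tar.gz")` returns True only if the local file matches the listed digest.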

File details

Details for the file caveman_agent-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: caveman_agent-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 398.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.9

File hashes

Hashes for caveman_agent-0.3.0-py3-none-any.whl

Algorithm    Hash digest
SHA256       d5e1ab3c872bc716bfa4cc3b611e93f2fba735ca43b4f937d5f4b0995375d551
MD5          63ef643cb8f5af738d66fa50931b2515
BLAKE2b-256  03dd0e8dd440cde2d0670ad672c67a0c5e80c8374840aa4ef6d6ec76bcbdd7a0
