Persistent identity and memory for any LLM agent – markdown-native, provider-agnostic
soul.py 🧠
Your AI forgets everything when the conversation ends. soul.py fixes that.
📖 NEW: The book is out! Soul: Building AI Agents That Remember Who They Are – everything here plus deep dives on identity, memory patterns, multi-agent coordination, and the philosophy of persistent AI. Get it on Amazon →
from hybrid_agent import HybridAgent
agent = HybridAgent()
agent.ask("My name is Prahlad and I'm building an AI research lab.")
# New process. New session. Memory persists.
agent = HybridAgent()
result = agent.ask("What do you know about me?")
print(result["answer"])
# → "You're Prahlad, building an AI research lab."
No database. No server. Just markdown files and smart retrieval.
▶ Live Demos
| Version | Demo | What it shows |
|---|---|---|
| v0.1 | soul.themenonlab.com | Memory persists across sessions |
| v1.0 | soulv1.themenonlab.com | Semantic RAG retrieval |
| v2.0 | soulv2.themenonlab.com | Auto query routing: RAG + RLM |
| v0.2.0 | – | Modulizer: 50% token savings, zero deps |
| Ask Darwin | soul-book.themenonlab.com | 📖 Book companion – watch routing decisions live |
📖 The Book
Soul: Building AI Agents That Remember Who They Are
The complete guide to persistent AI memory. Covers:
- Why agents forget (and the architectural fix)
- Identity vs Memory (SOUL.md vs MEMORY.md)
- RAG vs RLM (when to use each)
- Multi-agent memory sharing
- Darwinian evolution of agent identity
- Working code in every chapter
Install
pip install soul-agent
pip install soul-agent[anthropic]
pip install soul-agent[openai]
pip install soul-agent[gemini]  # ← now available!
🚀 v0.2.0 – Modulizer (50% Token Savings)
Large MEMORY.md files burn tokens. Modulizer splits them into indexed modules and retrieves only what's relevant.
# Split your memory into modules
soul modulize MEMORY.md
# Creates:
# modules/INDEX.md (1.7KB)
# modules/projects.md
# modules/tools.md
# ...
Two-phase retrieval:
- Read INDEX.md (always small)
- LLM picks relevant modules
- Load only those modules
Results: 47% fewer tokens on a 25KB MEMORY.md. Zero infrastructure: no vector DB, no embeddings.
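The two-phase flow can be sketched in a few lines. This is an illustrative stand-in, not the library's implementation: the INDEX text and module contents below are made up, and a simple keyword match replaces the LLM's module-picking call.

```python
# Sketch of Modulizer-style two-phase retrieval (illustrative only).
# In soul.py an LLM reads INDEX.md and picks modules; here a keyword
# overlap check stands in for that call.

MODULES = {
    "tools.md": "Tools used: ripgrep, pytest, qdrant-client.",
    "projects.md": "Projects: AI research lab, soul.py.",
}

INDEX = """\
tools.md: tools and libraries the user works with
projects.md: the user's ongoing projects
"""

def pick_modules(query: str, index: str) -> list[str]:
    """Phase 1: read the small index and choose relevant modules."""
    chosen = []
    for line in index.splitlines():
        name, _, description = line.partition(":")
        if set(description.lower().split()) & set(query.lower().split()):
            chosen.append(name.strip())
    return chosen

def load_context(query: str) -> str:
    """Phase 2: load only the chosen modules, not the whole memory."""
    return "\n".join(MODULES[n] for n in pick_modules(query, INDEX))

print(load_context("What tools have I used?"))
# → Tools used: ripgrep, pytest, qdrant-client.
```

The point of the split: phase 1 always costs a small, fixed number of tokens (the index), and phase 2 scales with how much memory is actually relevant rather than with total memory size.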
from soul import Agent
agent = Agent(use_modules=True) # default when modules exist
response = agent.ask("What tools have I used?")
# Check what was loaded
stats = agent.get_memory_stats()
# {'mode': 'modules', 'modules_read': ['tools.md'], 'total_kb': 5.5}
CLI commands:
- soul modulize <file> – split into modules
- soul modules list – view modules
- soul chat --no-modules – disable (opt-out)
Quickstart
soul init # creates SOUL.md and MEMORY.md
# v0.1: simple markdown memory (great starting point)
from soul import Agent
agent = Agent(provider="anthropic")
agent.ask("Remember this.")
# v2.0: automatic RAG + RLM routing (this repo's default)
from hybrid_agent import HybridAgent
agent = HybridAgent() # auto-detects best retrieval per query
result = agent.ask("What do you know about me?")
print(result["answer"])
print(result["route"]) # "RAG" or "RLM"
Multi-Provider Support
soul.py works with any LLM provider – no SDK lock-in:
# Anthropic (default)
agent = HybridAgent(provider="anthropic") # Uses ANTHROPIC_API_KEY
# Google Gemini
agent = HybridAgent(
    provider="gemini",
    chat_model="gemini-2.5-pro",      # or gemini-2.0-flash, gemini-2.5-flash
    router_model="gemini-2.0-flash",  # keep router cheap
)  # Uses GEMINI_API_KEY
# OpenAI
agent = HybridAgent(provider="openai") # Uses OPENAI_API_KEY
# Local via Ollama
agent = HybridAgent(
    provider="openai-compatible",
    base_url="http://localhost:11434/v1",
    chat_model="llama3.2",
)
| Provider | Default Model | Env Var |
|---|---|---|
| anthropic | claude-haiku-4-5 | ANTHROPIC_API_KEY |
| gemini | gemini-2.0-flash | GEMINI_API_KEY |
| openai | gpt-4o-mini | OPENAI_API_KEY |
| openai-compatible | llama3.2 | OPENAI_API_KEY (optional) |
☁️ SoulMate API – Managed Cloud Option
Don't want to manage local files? SoulMate API gives you persistent memory as a service:
from soulmate import SoulMateClient
# Sign up at soulmate-api.themenonlab.com/docs
client = SoulMateClient(
    api_key="sm_live_...",
    anthropic_key="sk-ant-...",  # BYOK: your own Anthropic key
)
# That's it. Memory persists in the cloud.
response = client.ask("My name is Prahlad.")
response = client.ask("What's my name?")  # → "Prahlad"
| Local (soul.py) | Cloud (SoulMate API) |
|---|---|
| Files on your machine | Managed cloud storage |
| You control everything | Zero infrastructure |
| Git-versioned memory | API-based, instant setup |
| Free forever | Free tier available |
Get started: soulmate-api.themenonlab.com/docs
How it works
soul.py uses two markdown files as persistent state:
| File | Purpose |
|---|---|
| SOUL.md | Identity – who the agent is, how it behaves |
| MEMORY.md | Memory – timestamped log of every exchange |
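As a rough illustration (the contents and layout here are hypothetical, not the exact format soul.py writes), the two files might look like:

```markdown
<!-- SOUL.md (hypothetical example): who the agent is -->
# Soul
You are a concise research assistant. Prefer citing remembered
facts over guessing.

<!-- MEMORY.md (hypothetical example): append-only exchange log -->
## 2026-01-15 10:32
User: My name is Prahlad and I'm building an AI research lab.
Agent: Noted. I'll remember that.
```

Because both files are plain markdown, you can open them in any editor, diff them in git, or hand-edit the agent's identity between sessions.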
v2.0 adds a query router that automatically dispatches to the right retrieval strategy:
Your query
    ↓
Router (fast LLM call)
    ├── FOCUSED (~90%) → RAG → vector search, sub-second
    └── EXHAUSTIVE (~10%) → RLM → recursive synthesis, thorough
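The dispatch shape can be sketched as follows. Note this is a toy stand-in: the real router is a fast LLM call, whereas the hint list and keyword heuristic below are invented for illustration.

```python
# Illustrative router sketch. soul.py's router is an LLM call;
# this keyword heuristic just shows the two-way dispatch.

EXHAUSTIVE_HINTS = ("summarize", "everything", "all of", "overall", "entire")

def route(query: str) -> str:
    """Return "RLM" for broad synthesis queries, else "RAG"."""
    q = query.lower()
    return "RLM" if any(hint in q for hint in EXHAUSTIVE_HINTS) else "RAG"

def ask(query: str) -> dict:
    chosen = route(query)
    if chosen == "RAG":
        answer = f"(vector search over memory for: {query!r})"
    else:
        answer = f"(recursive synthesis over full memory for: {query!r})"
    return {"answer": answer, "route": chosen}

print(ask("What is my name?")["route"])               # RAG
print(ask("Summarize everything about me")["route"])  # RLM
```

The asymmetry in the split matters: most queries are focused lookups where sub-second RAG wins, so the expensive exhaustive path only runs when the router decides the whole memory is needed.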
Architecture based on: RAG + RLM: The Complete Knowledge Base Architecture
Branches
| Branch | Description | Best for |
|---|---|---|
| main | v2.0 – RAG + RLM hybrid (default) | Production use |
| v2.0-rag-rlm | Same as main, versioned | Pinning to v2 |
| v1.0-rag | RAG only, no RLM | Simpler setup |
| v0.1-stable | Pure markdown, zero deps | Learning / prototyping |
v2.0 API
result = agent.ask("What is my name?")
result["answer"] # the response
result["route"] # "RAG" or "RLM"
result["router_ms"] # router latency
result["retrieval_ms"] # retrieval latency
result["total_ms"] # total latency
result["rag_context"] # retrieved chunks (RAG path)
result["rlm_meta"] # chunk stats (RLM path)
v2.0 Setup
agent = HybridAgent(
    soul_path="SOUL.md",
    memory_path="MEMORY.md",
    mode="auto",                      # "auto" | "rag" | "rlm"
    qdrant_url="...",                 # or set QDRANT_URL env var
    qdrant_api_key="...",             # or QDRANT_API_KEY
    azure_embedding_endpoint="...",   # or AZURE_EMBEDDING_ENDPOINT
    azure_embedding_key="...",        # or AZURE_EMBEDDING_KEY
    k=5,                              # RAG retrieval count
)
Falls back to BM25 keyword search if Qdrant/Azure are not configured.
📚 Knowledge Bases + Memory
soul.py isn't just for personal memory – the same architecture works for custom knowledge bases. Combine both in a single agent:
agent = HybridAgent(
    soul_path="SOUL.md",
    memory_path="MEMORY.md",      # per-user memory
    knowledge_dir="./knowledge",  # your corpus (docs, products, policies)
)
# Index your knowledge base once
agent.index_knowledge()
# Now the agent searches both pools
agent.ask("What's the return policy?")          # → knowledge base
agent.ask("What was I asking about earlier?")   # → user memory
agent.ask("Which product fits my needs?")       # → both
Example use cases:
| Agent Type | Knowledge Base | Memory |
|---|---|---|
| Support Bot | Product docs, policies, FAQs | Customer history, preferences |
| Research Assistant | Paper corpus, methodologies | User's focus, papers read |
| Onboarding Buddy | Company handbook, org chart | New hire's role, questions |
| Book Companion | Full book content | Reader's interests, progress |
Darwin (the AI companion for the Soul book) uses exactly this pattern – the entire book indexed as knowledge, plus per-reader conversation memory.
See the Memory Architecture Patterns guide for detailed implementation patterns.
🔌 Framework Integrations
Already using a framework? Drop in soul.py memory with one line:
| Framework | Package | Install |
|---|---|---|
| LangChain | langchain-soul | pip install langchain-soul |
| LlamaIndex | llamaindex-soul | pip install llamaindex-soul |
| CrewAI | crewai-soul | pip install crewai-soul |
# LangChain
from langchain_soul import SoulChatMessageHistory
history = SoulChatMessageHistory(session_id="user-123")
# LlamaIndex
from llamaindex_soul import SoulChatStore
chat_store = SoulChatStore()
# CrewAI
from crewai_soul import SoulMemory
memory = SoulMemory()
Each integration includes:
- soul-agent – RAG + RLM hybrid retrieval
- soul-schema – Database semantic layer (auto-document your tables)
- SoulMate client – Managed cloud option
Why not LangChain / LlamaIndex / MemGPT?
Those are orchestration frameworks. soul.py is a primitive: persistent identity and memory you can drop into anything you're building.
- No framework lock-in – works with any LLM provider, or with your favorite framework via the integrations above
- Human-readable – SOUL.md and MEMORY.md are plain text
- Version-controllable – git diff your agent's memories
- Composable – use just the parts you need
Roadmap
See ROADMAP.md for planned features and how to contribute.
License
MIT
Citation
@software{menon2026soul,
  author = {Menon, Prahlad G.},
  title = {soul.py: Persistent Identity and Memory for LLM Agents},
  year = {2026},
  url = {https://github.com/menonpg/soul.py}
}