# Unforget

Zero-LLM memory for AI agents. One database. Nothing to forget.

Documentation · Quick Start · Integrations · How It Works
Most agent memory solutions require an LLM call on every write, adding 500ms+ latency and API costs. Unforget stores memories in ~7ms with zero LLM calls, retrieves them with 4-channel hybrid search, and runs on a single PostgreSQL database you already know how to operate.
```bash
pip install unforget
```
## Why Unforget?

| | Unforget | Others (Mem0, Zep, etc.) |
|---|---|---|
| Write latency | ~7ms | 500ms+ (LLM required) |
| Write cost | $0 | LLM API cost per write |
| Retrieval | 4-channel hybrid (semantic + BM25 + entity + temporal) | Vector-only or vector + graph |
| Infrastructure | PostgreSQL only | PostgreSQL + Neo4j / Qdrant / etc. |
| LLM dependency | None on write path | Required on every operation |
## Quick Start

```python
from unforget import MemoryStore

store = MemoryStore("postgresql://user:pass@localhost/db")
await store.initialize()

# Bind once — no more repeating org_id and agent_id
memory = store.bind(org_id="acme", agent_id="support-bot")

# Write — instant, no LLM calls
await memory.write("User prefers dark mode")
await memory.write("Last order was #4821, shipped March 20",
                   memory_type="event", tags=["orders"], importance=0.8)

# Recall — 4-channel retrieval + cross-encoder reranking
results = await memory.recall("what did the user order?")
# → [MemoryResult(content="Last order was #4821, shipped March 20", score=0.94)]

# Auto-recall — formatted context ready to inject into a system prompt
context = await memory.auto_recall("help with their order")
# → "[Memory Context]\n- Last order was #4821, shipped March 20\n- User prefers dark mode"
```
## LLM Integrations

Wrap your existing OpenAI or Anthropic client. Memory becomes transparent — your agent gets recall, tools, and ingestion without changing your code.
### OpenAI

```bash
pip install unforget[openai]
```

```python
from openai import AsyncOpenAI
from unforget import MemoryStore
from unforget.integrations.openai import wrap_openai

store = MemoryStore("postgresql://user:pass@localhost/db")
await store.initialize()

client = wrap_openai(AsyncOpenAI(), store, org_id="acme", agent_id="support-bot")

# That's it. Memory is handled automatically:
# - Relevant memories injected into system prompt
# - Agent can call memory_store, memory_search, etc.
# - Conversation saved after each response
response = await client.chat.completions.create(
    messages=[{"role": "user", "content": "What was my last order?"}],
    model="gpt-4o",
)
```
### Anthropic

```bash
pip install unforget[anthropic]
```

```python
from anthropic import AsyncAnthropic
from unforget.integrations.anthropic import wrap_anthropic

client = wrap_anthropic(AsyncAnthropic(), store, org_id="acme", agent_id="support-bot")

response = await client.messages.create(
    messages=[{"role": "user", "content": "What was my last order?"}],
    model="claude-sonnet-4-6",
    max_tokens=1024,
)
```
### What the wrappers do

- Auto-recall — searches memory before each LLM call, injects relevant context
- Memory tools — 5 tools the agent can call: `memory_store`, `memory_search`, `memory_list`, `memory_update`, `memory_forget`
- Tool execution — handles the tool call loop automatically (up to 5 rounds)
- Auto-ingest — stores the conversation in background after the response
Everything is configurable:

```python
client = wrap_openai(
    AsyncOpenAI(), store, org_id="acme", agent_id="support-bot",
    auto_recall=True,          # inject memory context into system prompt
    auto_ingest=True,          # store conversation after response
    inject_tools=True,         # add memory tools to every call
    inject_instructions=True,  # add memory usage instructions
    tools=["memory_store", "memory_search"],  # pick which tools to enable
)
```
## Framework-Agnostic

Use `MemoryToolExecutor` directly with LangChain, CrewAI, or any framework:

```python
from unforget import MemoryToolExecutor

executor = MemoryToolExecutor(store, org_id="acme", agent_id="support-bot")

# Get tool schemas in your provider's format
tools = executor.to_openai()     # OpenAI format
tools = executor.to_anthropic()  # Anthropic format
tools = executor.to_generic()    # Raw dict format

# Execute a tool call from any LLM response
result = await executor.execute("memory_store", {
    "content": "User's favorite color is blue",
    "memory_type": "insight",
    "importance": 0.7,
})
```
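For frameworks that manage their own tool loop, the pattern is: pass the schemas from `executor.to_openai()` (or the matching format) as the model's tools, then route every tool call back through `executor.execute`. A minimal sketch, assuming OpenAI-style tool-call dicts (the loop helper below is illustrative glue, not part of Unforget's API):

```python
import json

# Hypothetical glue: execute OpenAI-style tool calls via a MemoryToolExecutor
# and build the tool-result messages for the next model turn.
async def run_tool_calls(executor, tool_calls) -> list[dict]:
    messages = []
    for call in tool_calls:
        # OpenAI delivers arguments as a JSON string; decode before executing
        result = await executor.execute(
            call["function"]["name"],
            json.loads(call["function"]["arguments"]),
        )
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(result),
        })
    return messages
```

Feed the returned messages back into the conversation and call the model again until it stops requesting tools.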
## How Retrieval Works

Every `recall()` fires four search channels in parallel inside a single SQL query, then fuses results with Reciprocal Rank Fusion:
| Channel | What it does | Index |
|---|---|---|
| Semantic | pgvector cosine similarity | HNSW |
| BM25 | PostgreSQL full-text search | GIN (tsvector) |
| Entity | Named entity overlap | GIN (array) |
| Temporal | Recently accessed memories first | B-tree |
Results are boosted by type (insight > event > raw) and reranked with a cross-encoder for accuracy.
One SQL round trip. Four search strategies. No external services.
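Reciprocal Rank Fusion itself is simple: each channel contributes `1 / (k + rank)` for every result it returns, and a result's fused score is the sum across channels. A standalone sketch of the fusion step (the constant `k = 60` is the value from the original RRF paper, not necessarily what Unforget uses internally):

```python
def rrf_fuse(channels: dict[str, list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse per-channel ranked lists of memory ids with Reciprocal Rank Fusion."""
    scores: dict[str, float] = {}
    for ranking in channels.values():
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# A memory ranked by several channels beats one ranked highly by a single channel:
fused = rrf_fuse({
    "semantic": ["m1", "m2", "m3"],
    "bm25":     ["m2", "m1"],
    "entity":   ["m2"],
    "temporal": ["m3", "m2"],
})
# fused order: m2 first (four channels agree), then m1, then m3
```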
## More Examples

### Personal assistant that remembers preferences

```python
memory = store.bind(org_id="user-123", agent_id="scheduler")

# First conversation
await memory.write("Likes morning meetings, never after 3pm")
await memory.write("Allergic to shellfish")
await memory.write("Prefers window seats on flights")

# Weeks later...
context = await memory.auto_recall("book a dinner reservation")
# → recalls the shellfish allergy automatically
```
### Customer support bot with history

```python
support = store.bind(org_id="acme", agent_id="support")

# Ingest a full conversation transcript
await support.ingest([
    {"role": "user", "content": "My printer isn't connecting to wifi"},
    {"role": "assistant", "content": "Let's try resetting the network settings..."},
    {"role": "user", "content": "That worked! Thanks."},
], mode="background")

# Next time the user calls about printers:
results = await support.recall("printer wifi issue")
# → recalls the previous fix
```
### Fact versioning — when things change

```python
from datetime import datetime

memory = store.bind(org_id="u-1", agent_id="bot")

# User moves to a new city
m = await memory.write("User lives in Austin, TX")

# Six months later, they move
old, new = await memory.supersede(m.id, "User lives in Denver, CO")
# Old memory soft-deleted, new one linked. Full audit trail.

# "Where did they live in January?"
january_15 = datetime(2025, 1, 15)
memories = await memory.timeline(at=january_15)
# → [MemoryItem(content="User lives in Austin, TX")]
```
### Background consolidation

```python
from unforget import ConsolidationScheduler

# Run consolidation every hour
scheduler = ConsolidationScheduler(store, interval_seconds=3600)
store.attach_scheduler(scheduler)
await scheduler.start()

# Or trigger manually
memory = store.bind(org_id="acme", agent_id="bot")
report = await memory.consolidate()
# → ConsolidationReport(duplicates_merged=3, memories_decayed=12, memories_expired=5)
```
Consolidation handles:
- Dedup — merges near-identical memories (cosine > 0.92)
- Decay — reduces importance of memories not accessed in 7/30 days
- Expire — soft-deletes raw chunks past their 30-day TTL
- Promote — distills raw conversation chunks into insights (with optional LLM)
## REST API — 17 endpoints out of the box

```python
from fastapi import FastAPI
from unforget.api import create_memory_router

app = FastAPI()
app.include_router(create_memory_router(store), prefix="/v1/memory")
# write, recall, auto-recall, ingest, list, get, update, delete,
# bulk-delete, supersede, timeline, chain, history, stats, consolidate
```
## Pluggable embedders

```python
from unforget import MemoryStore, OpenAIEmbedder

# Default: local ONNX/PyTorch (free, ~3ms per embed)
store = MemoryStore("postgresql://...")

# OpenAI: higher quality, costs money
store = MemoryStore("postgresql://...", embedder=OpenAIEmbedder())

# Custom: implement BaseEmbedder
from unforget import BaseEmbedder

class CohereEmbedder(BaseEmbedder):
    @property
    def dims(self) -> int:
        return 1024

    def embed(self, text: str) -> list[float]:
        return cohere_client.embed([text]).embeddings[0]

    def embed_batch(self, texts: list[str]) -> list[list[float]]:
        return cohere_client.embed(texts).embeddings

store = MemoryStore("postgresql://...", embedder=CohereEmbedder())
```
## Performance

| Operation | Latency | Notes |
|---|---|---|
| `write()` | ~7ms | ONNX embed + single SQL insert |
| `recall()` | ~25ms | Embed + 4-channel CTE + rerank |
| `auto_recall()` | ~25ms | recall + format for system prompt |
| `write_batch(20)` | ~65ms | Batch embedding + batch insert |
| Cache hit (recall) | <0.1ms | TTL cache for repeated queries |
| Cache hit (embed) | <0.1ms | LRU cache by content hash |
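The embed cache pattern is easy to sketch: key an LRU by a hash of the content so identical text never hits the model twice. This is an illustrative reimplementation, not Unforget's internal code:

```python
import hashlib
from collections import OrderedDict

class EmbedCache:
    """LRU cache keyed by content hash — illustrative sketch only."""

    def __init__(self, embed_fn, max_size: int = 1024):
        self._embed = embed_fn
        self._max = max_size
        self._cache: OrderedDict[str, list[float]] = OrderedDict()

    def embed(self, text: str) -> list[float]:
        key = hashlib.sha256(text.encode()).hexdigest()
        if key in self._cache:
            self._cache.move_to_end(key)      # mark as most recently used
            return self._cache[key]
        vec = self._embed(text)               # cache miss: run the real model
        self._cache[key] = vec
        if len(self._cache) > self._max:
            self._cache.popitem(last=False)   # evict least recently used
        return vec
```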
## Infrastructure

```bash
# Start PostgreSQL + pgvector
docker compose up -d

# Install
pip install unforget              # core only
pip install unforget[openai]     # + OpenAI SDK wrapper
pip install unforget[anthropic]  # + Anthropic SDK wrapper
pip install unforget[api]        # + FastAPI router
pip install unforget[spacy]      # + better entity extraction
```

- Database: PostgreSQL 16 + pgvector
- Embedding: `all-MiniLM-L6-v2` via ONNX Runtime (default) or PyTorch
- Reranking: `ms-marco-MiniLM-L-6-v2` cross-encoder
- Python: 3.11+
- No external APIs required for core functionality
## License

Apache 2.0