Unforget

Zero-LLM memory for AI agents. One database. Nothing to forget.


Documentation · Quick Start · Integrations · How It Works


Most agent memory solutions require an LLM call on every write, adding 500ms+ latency and API costs. Unforget stores memories in ~7ms with zero LLM calls, retrieves them with 4-channel hybrid search, and runs on a single PostgreSQL database you already know how to operate.

pip install unforget

Why Unforget?

                 Unforget                                                  Others (Mem0, Zep, etc.)
Write latency    ~7ms                                                      500ms+ (LLM required)
Write cost       $0                                                        LLM API cost per write
Retrieval        4-channel hybrid (semantic + BM25 + entity + temporal)    Vector-only or vector + graph
Infrastructure   PostgreSQL only                                           PostgreSQL + Neo4j / Qdrant / etc.
LLM dependency   None on write path                                        Required on every operation

Quick Start

from unforget import MemoryStore

store = MemoryStore("postgresql://user:pass@localhost/db")
await store.initialize()

# Bind once — no more repeating org_id and agent_id
memory = store.bind(org_id="acme", agent_id="support-bot")

# Write — instant, no LLM calls
await memory.write("User prefers dark mode")
await memory.write("Last order was #4821, shipped March 20",
    memory_type="event", tags=["orders"], importance=0.8)

# Recall — 4-channel retrieval + cross-encoder reranking
results = await memory.recall("what did the user order?")
# → [MemoryResult(content="Last order was #4821, shipped March 20", score=0.94)]

# Auto-recall — formatted context ready to inject into a system prompt
context = await memory.auto_recall("help with their order")
# → "[Memory Context]\n- Last order was #4821, shipped March 20\n- User prefers dark mode"

LLM Integrations

Wrap your existing OpenAI or Anthropic client. Memory becomes transparent — your agent gets recall, tools, and ingestion without changing your code.

OpenAI

pip install unforget[openai]

from openai import AsyncOpenAI
from unforget import MemoryStore
from unforget.integrations.openai import wrap_openai

store = MemoryStore("postgresql://user:pass@localhost/db")
await store.initialize()

client = wrap_openai(AsyncOpenAI(), store, org_id="acme", agent_id="support-bot")

# That's it. Memory is handled automatically:
# - Relevant memories injected into system prompt
# - Agent can call memory_store, memory_search, etc.
# - Conversation saved after each response
response = await client.chat.completions.create(
    messages=[{"role": "user", "content": "What was my last order?"}],
    model="gpt-4o",
)

Anthropic

pip install unforget[anthropic]

from anthropic import AsyncAnthropic
from unforget.integrations.anthropic import wrap_anthropic

client = wrap_anthropic(AsyncAnthropic(), store, org_id="acme", agent_id="support-bot")

response = await client.messages.create(
    messages=[{"role": "user", "content": "What was my last order?"}],
    model="claude-sonnet-4-6",
    max_tokens=1024,
)

What the wrappers do

  1. Auto-recall — searches memory before each LLM call, injects relevant context
  2. Memory tools — 5 tools the agent can call: memory_store, memory_search, memory_list, memory_update, memory_forget
  3. Tool execution — handles the tool call loop automatically (up to 5 rounds)
  4. Auto-ingest — stores the conversation in background after the response

Everything is configurable:

client = wrap_openai(
    AsyncOpenAI(), store, org_id="acme", agent_id="support-bot",
    auto_recall=True,         # inject memory context into system prompt
    auto_ingest=True,         # store conversation after response
    inject_tools=True,        # add memory tools to every call
    inject_instructions=True, # add memory usage instructions
    tools=["memory_store", "memory_search"],  # pick which tools to enable
)

Framework-Agnostic

Use MemoryToolExecutor directly with LangChain, CrewAI, or any framework:

from unforget import MemoryToolExecutor

executor = MemoryToolExecutor(store, org_id="acme", agent_id="support-bot")

# Get tool schemas in your provider's format
tools = executor.to_openai()      # OpenAI format
tools = executor.to_anthropic()   # Anthropic format
tools = executor.to_generic()     # Raw dict format

# Execute a tool call from any LLM response
result = await executor.execute("memory_store", {
    "content": "User's favorite color is blue",
    "memory_type": "insight",
    "importance": 0.7,
})
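
If your framework doesn't manage the tool-call loop for you, the pattern is: pass the schemas to the model, execute any requested tools with executor.execute(), and feed the results back for one more model turn. A hedged sketch against the OpenAI Chat Completions API; only to_openai() and execute() come from Unforget, and the message shapes follow OpenAI's standard tool-calling convention:

import json
from openai import AsyncOpenAI

openai_client = AsyncOpenAI()
messages = [{"role": "user", "content": "Remember that my favorite color is blue."}]

response = await openai_client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=executor.to_openai(),
)
msg = response.choices[0].message

if msg.tool_calls:
    messages.append(msg)  # keep the assistant turn that requested the tools
    for call in msg.tool_calls:
        result = await executor.execute(call.function.name, json.loads(call.function.arguments))
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps(result, default=str),
        })
    # one follow-up call so the model can answer with the tool results in hand
    response = await openai_client.chat.completions.create(model="gpt-4o", messages=messages)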

How Retrieval Works

Every recall() fires four search channels in parallel inside a single SQL query, then fuses results with Reciprocal Rank Fusion:

Channel     What it does                        Index
Semantic    pgvector cosine similarity          HNSW
BM25        PostgreSQL full-text search         GIN (tsvector)
Entity      Named entity overlap                GIN (array)
Temporal    Recently accessed memories first    B-tree

Results are boosted by type (insight > event > raw) and reranked with a cross-encoder for accuracy.

One SQL round trip. Four search strategies. No external services.
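
Reciprocal Rank Fusion itself is simple: each channel contributes 1 / (k + rank) for every memory it returns, and the summed scores decide the final order. A standalone sketch of that scoring step (k = 60 is the conventional default from the RRF literature, not necessarily the value Unforget uses):

from collections import defaultdict

def rrf_fuse(channel_rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse per-channel ranked lists of memory ids into a single ranking."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in channel_rankings:
        for rank, memory_id in enumerate(ranking, start=1):
            scores[memory_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Four channels, each returning memory ids best-first
fused = rrf_fuse([
    ["m1", "m2", "m3"],  # semantic
    ["m2", "m1"],        # BM25
    ["m4", "m2"],        # entity
    ["m1", "m4"],        # temporal
])
# → m2 and m1 rise to the top because multiple channels agree on them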


More Examples

Personal assistant that remembers preferences

memory = store.bind(org_id="user-123", agent_id="scheduler")

# First conversation
await memory.write("Likes morning meetings, never after 3pm")
await memory.write("Allergic to shellfish")
await memory.write("Prefers window seats on flights")

# Weeks later...
context = await memory.auto_recall("book a dinner reservation")
# → recalls the shellfish allergy automatically

Customer support bot with history

support = store.bind(org_id="acme", agent_id="support")

# Ingest a full conversation transcript
await support.ingest([
    {"role": "user", "content": "My printer isn't connecting to wifi"},
    {"role": "assistant", "content": "Let's try resetting the network settings..."},
    {"role": "user", "content": "That worked! Thanks."},
], mode="background")

# Next time the user calls about printers:
results = await support.recall("printer wifi issue")
# → recalls the previous fix

Fact versioning — when things change

memory = store.bind(org_id="u-1", agent_id="bot")

# User moves to a new city
m = await memory.write("User lives in Austin, TX")

# Six months later, they move
old, new = await memory.supersede(m.id, "User lives in Denver, CO")
# Old memory soft-deleted, new one linked. Full audit trail.

# "Where did they live in January?"
memories = await memory.timeline(at=january_15)
# → [MemoryItem(content="User lives in Austin, TX")]

Background consolidation

from unforget import ConsolidationScheduler

# Run consolidation every hour
scheduler = ConsolidationScheduler(store, interval_seconds=3600)
store.attach_scheduler(scheduler)
await scheduler.start()

# Or trigger manually
memory = store.bind(org_id="acme", agent_id="bot")
report = await memory.consolidate()
# → ConsolidationReport(duplicates_merged=3, memories_decayed=12, memories_expired=5)

Consolidation handles:

  • Dedup — merges near-identical memories (cosine similarity > 0.92; see the sketch after this list)
  • Decay — reduces importance of memories not accessed in 7/30 days
  • Expire — soft-deletes raw chunks past their 30-day TTL
  • Promote — distills raw conversation chunks into insights (with optional LLM)
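
The dedup threshold is easy to picture: two memories are merge candidates when the cosine similarity of their embeddings exceeds 0.92. A minimal illustration of that check (pure Python for clarity; the embeddings come from whichever embedder the store is configured with):

import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def is_near_duplicate(emb_a: list[float], emb_b: list[float], threshold: float = 0.92) -> bool:
    return cosine_similarity(emb_a, emb_b) > threshold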

REST API — 17 endpoints out of the box

from fastapi import FastAPI
from unforget.api import create_memory_router

app = FastAPI()
app.include_router(create_memory_router(store), prefix="/v1/memory")
# write, recall, auto-recall, ingest, list, get, update, delete,
# bulk-delete, supersede, timeline, chain, history, stats, consolidate
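
Once mounted, the endpoints behave like any other FastAPI routes. A hypothetical call to the recall endpoint with httpx; the exact path and request fields are assumptions here, so check the app's generated OpenAPI docs (/docs) for the real schema:

import httpx

async with httpx.AsyncClient(base_url="http://localhost:8000") as http:
    resp = await http.post(
        "/v1/memory/recall",              # assumed path under the prefix above
        json={
            "org_id": "acme",             # field names are illustrative
            "agent_id": "support-bot",
            "query": "printer wifi issue",
        },
    )
    resp.raise_for_status()
    print(resp.json())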

Pluggable embedders

from unforget import MemoryStore, OpenAIEmbedder

# Default: local ONNX/PyTorch (free, ~3ms per embed)
store = MemoryStore("postgresql://...")

# OpenAI: higher quality, costs money
store = MemoryStore("postgresql://...", embedder=OpenAIEmbedder())

# Custom: implement BaseEmbedder
from unforget import BaseEmbedder

# cohere_client is assumed to be an already-configured Cohere SDK client
class CohereEmbedder(BaseEmbedder):
    @property
    def dims(self) -> int:
        return 1024

    def embed(self, text: str) -> list[float]:
        return cohere_client.embed([text]).embeddings[0]

    def embed_batch(self, texts: list[str]) -> list[list[float]]:
        return cohere_client.embed(texts).embeddings

store = MemoryStore("postgresql://...", embedder=CohereEmbedder())

Performance

Operation            Latency    Notes
write()              ~7ms       ONNX embed + single SQL insert
recall()             ~25ms      Embed + 4-channel CTE + rerank
auto_recall()        ~25ms      recall + format for system prompt
write_batch(20)      ~65ms      Batch embedding + batch insert
Cache hit (recall)   <0.1ms     TTL cache for repeated queries
Cache hit (embed)    <0.1ms     LRU cache by content hash

Infrastructure

# Start PostgreSQL + pgvector (a compose file sketch follows the list below)
docker compose up -d

# Install
pip install unforget              # core only
pip install unforget[openai]      # + OpenAI SDK wrapper
pip install unforget[anthropic]   # + Anthropic SDK wrapper
pip install unforget[api]         # + FastAPI router
pip install unforget[spacy]       # + better entity extraction

  • Database: PostgreSQL 16 + pgvector
  • Embedding: all-MiniLM-L6-v2 via ONNX Runtime (default) or PyTorch
  • Reranking: ms-marco-MiniLM-L-6-v2 cross-encoder
  • Python: 3.11+
  • No external APIs required for core functionality
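
The compose file itself isn't reproduced on this page. A minimal sketch that satisfies the requirements above, using the community pgvector image (service name, credentials, and port mapping are placeholders, not the project's own file):

# docker-compose.yml (illustrative)
services:
  postgres:
    image: pgvector/pgvector:pg16
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: pass
      POSTGRES_DB: db
    ports:
      - "5432:5432"
    volumes:
      - pgdata:/var/lib/postgresql/data

volumes:
  pgdata: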

License

Apache 2.0

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

unforget-0.3.0.tar.gz (71.5 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

unforget-0.3.0-py3-none-any.whl (60.9 kB)

Uploaded Python 3

File details

Details for the file unforget-0.3.0.tar.gz.

File metadata

  • Download URL: unforget-0.3.0.tar.gz
  • Upload date:
  • Size: 71.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for unforget-0.3.0.tar.gz
Algorithm Hash digest
SHA256 14ad4ec6b2a0c75381eba6e44b9e020a567933c2a769f854f42c62802ad7bdb4
MD5 f16551491ae3594003183b279255ce14
BLAKE2b-256 fbe51ff813947bd23e8a771d727163594782496bf45946b0c65ca019b34ba030

See more details on using hashes here.

Provenance

The following attestation bundles were made for unforget-0.3.0.tar.gz:

Publisher: publish.yml on unforget-ai/unforget

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file unforget-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: unforget-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 60.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for unforget-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 fdf6573e3fca251f644aaebd07279f9ced4d9eb877b5746bd497e3c9be7a3bd2
MD5 9994259f3a37493392b64bb30c86c3df
BLAKE2b-256 50382c06263734849e29a76da9882ed2175e8d51fce7b18cde0e07243c7b7b78

See more details on using hashes here.

Provenance

The following attestation bundles were made for unforget-0.3.0-py3-none-any.whl:

Publisher: publish.yml on unforget-ai/unforget

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
