Open-source, self-hostable memory engine for AI agents using a three-layer sentence graph architecture.
Vektori
Memory that remembers the story, not just the facts.
Most memory systems compress conversations into entity-relationship triples. You get the answer, but you lose the texture, the reasoning, the trajectory. Vektori uses a three-layer sentence graph so agents don't just recall preferences, they understand how things got there.
FACT LAYER (L0) <- vector search surface. Short, crisp statements.
|
EPISODE LAYER (L1) <- patterns auto-discovered via graph traversal.
|
SENTENCE LAYER (L2) <- raw conversation. Sequential NEXT edges. The full story.
Search hits Facts, the graph discovers Episodes, and each result traces back to its source Sentences. SQLite is the default; swap to Postgres, Neo4j, or Qdrant when you're ready to scale.
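To make the three layers concrete, here is a minimal in-memory sketch of how a fact could be traced back to its source sentences. The class and field names (`Sentence`, `Fact`, `Episode`, `trace`) are illustrative assumptions, not Vektori's actual schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class Sentence:
    """L2: one raw conversation sentence; idx gives its position (NEXT order)."""
    idx: int
    text: str


@dataclass
class Fact:
    """L0: a short statement linked to the sentences it was extracted from."""
    text: str
    source_idx: list[int]


@dataclass
class Episode:
    """L1: a pattern that groups related facts."""
    text: str
    facts: list[Fact] = field(default_factory=list)


def trace(fact: Fact, sentences: list[Sentence], context_window: int = 0) -> list[str]:
    """Return the fact's source sentences plus neighbours on each side."""
    keep: set[int] = set()
    for i in fact.source_idx:
        lo = max(0, i - context_window)
        hi = min(len(sentences) - 1, i + context_window)
        keep.update(range(lo, hi + 1))
    return [sentences[i].text for i in sorted(keep)]


sentences = [
    Sentence(0, "I only use WhatsApp, please don't email me."),
    Sentence(1, "Got it, WhatsApp only."),
    Sentence(2, "My outstanding amount is ₹45,000 and I can pay by Friday."),
]
fact = Fact("User prefers WhatsApp communication", source_idx=[0])
episode = Episode("User consistently avoids email", facts=[fact])

print(trace(fact, sentences))                    # source sentence only
print(trace(fact, sentences, context_window=1))  # plus one neighbour per side
```

The point of the sketch is the direction of travel: search lands on a short fact, and the surrounding conversation is recovered by walking sentence order rather than by storing the whole transcript in the fact itself.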
Benchmarks
| Benchmark | Score | Depth | Models |
|---|---|---|---|
| LongMemEval-S | 73% | L1 | BGE-M3 + Gemini Flash |
Still improving. Run your own in /benchmarks.
Install
pip install vektori # SQLite + Postgres
pip install 'vektori[neo4j]' # + Neo4j support
pip install 'vektori[qdrant]' # + Qdrant support
pip install 'vektori[neo4j,qdrant]' # all backends
No Docker, no external services. SQLite by default.
30-Second Quickstart
import asyncio
from vektori import Vektori

async def main():
    v = Vektori(
        embedding_model="openai:text-embedding-3-small",
        extraction_model="openai:gpt-4o-mini",
    )
    await v.add(
        messages=[
            {"role": "user", "content": "I only use WhatsApp, please don't email me."},
            {"role": "assistant", "content": "Got it, WhatsApp only."},
            {"role": "user", "content": "My outstanding amount is ₹45,000 and I can pay by Friday."},
        ],
        session_id="call-001",
        user_id="user-123",
    )
    results = await v.search(
        query="How does this user prefer to communicate?",
        user_id="user-123",
        depth="l1",  # facts + episodes
    )
    for fact in results["facts"]:
        print(f"[{fact['score']:.2f}] {fact['text']}")
    for episode in results["episodes"]:
        print(f"episode: {episode['text']}")
    await v.close()

asyncio.run(main())
Output:
[0.94] User prefers WhatsApp communication
[0.81] Outstanding balance of ₹45,000, payment expected Friday
episode: User consistently avoids email — route all comms to WhatsApp
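The result shape used in the quickstart (a dict with `facts` and `episodes` lists, each fact carrying `score` and `text`) is easy to post-process. The sketch below works on hand-written data abridged from the sample output rather than a live call; `top_facts` and the 0.85 threshold are hypothetical, not part of Vektori.

```python
def top_facts(results: dict, min_score: float = 0.85) -> list[str]:
    """Keep only the fact texts whose score clears the threshold."""
    return [f["text"] for f in results.get("facts", []) if f["score"] >= min_score]


# Hand-written stand-in for a search result, abridged from the output above.
sample = {
    "facts": [
        {"score": 0.94, "text": "User prefers WhatsApp communication"},
        {"score": 0.81, "text": "Outstanding balance of ₹45,000, payment expected Friday"},
    ],
    "episodes": [
        {"text": "User consistently avoids email"},
    ],
}

print(top_facts(sample))
```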
Retrieval Depths
Pick how deep you want to go.
| Depth | Returns | ~Tokens | When to use |
|---|---|---|---|
| l0 | Facts only | 50-200 | Fast lookup, agent planning, tool calls |
| l1 | Facts + Episodes | 200-500 | Default. Full answer with context |
| l2 | Facts + Episodes + raw Sentences | 1000-3000 | Trajectory analysis, full story replay |
# Just the facts
results = await v.search(query, user_id, depth="l0")
# Facts + episodes (recommended)
results = await v.search(query, user_id, depth="l1")
# Everything, with surrounding conversation context
results = await v.search(query, user_id, depth="l2", context_window=3)
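If your agent works under a fixed context budget, the depth can be chosen programmatically. This is a rough sketch that uses the upper token estimates from the table above; `pick_depth` and `DEPTH_MAX_TOKENS` are hypothetical helpers, and the real token counts will vary with your data.

```python
# Upper token estimates taken from the depth table above.
DEPTH_MAX_TOKENS = {"l0": 200, "l1": 500, "l2": 3000}


def pick_depth(token_budget: int) -> str:
    """Return the deepest depth whose upper estimate fits the budget."""
    chosen = "l0"  # fall back to facts only when the budget is tight
    for depth, max_tokens in DEPTH_MAX_TOKENS.items():
        if max_tokens <= token_budget:
            chosen = depth
    return chosen


for budget in (100, 600, 4000):
    print(budget, pick_depth(budget))
```

The returned string can then be passed as the `depth` argument to `v.search`.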
Build an Agent with Memory
Four steps wire memory into any agent loop: pull relevant memory, inject it into the system prompt, get a response, and store the exchange.
import asyncio
from openai import AsyncOpenAI
from vektori import Vektori

client = AsyncOpenAI()

async def chat(user_id: str):
    v = Vektori(
        embedding_model="openai:text-embedding-3-small",
        extraction_model="openai:gpt-4o-mini",
    )
    session_id = f"session-{user_id}-001"
    history = []
    print("Chat with memory (type 'quit' to exit)\n")
    while True:
        user_input = input("You: ").strip()
        if user_input.lower() == "quit":
            break
        # 1. Pull relevant memory
        mem = await v.search(query=user_input, user_id=user_id, depth="l1")
        facts = "\n".join(f"- {f['text']}" for f in mem.get("facts", []))
        episodes = "\n".join(f"- {ep['text']}" for ep in mem.get("episodes", []))
        # 2. Inject into system prompt
        system = "You are a helpful assistant with memory.\n"
        if facts:
            system += f"\nKnown facts:\n{facts}"
        if episodes:
            system += f"\nBehavioral episodes:\n{episodes}"
        # 3. Get response
        history.append({"role": "user", "content": user_input})
        resp = await client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "system", "content": system}, *history],
        )
        reply = resp.choices[0].message.content
        history.append({"role": "assistant", "content": reply})
        print(f"Assistant: {reply}\n")
        # 4. Store exchange
        await v.add(
            messages=[{"role": "user", "content": user_input},
                      {"role": "assistant", "content": reply}],
            session_id=session_id,
            user_id=user_id,
        )
    await v.close()

asyncio.run(chat("demo-user"))
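Steps 1 and 2 of the loop can be pulled into a small pure function, which makes the prompt assembly testable without a model or a database. This is only a refactoring sketch of the code above; `build_system_prompt` is a hypothetical name, and the `mem` dict is assumed to have the same shape that `v.search` returns in the loop.

```python
def build_system_prompt(mem: dict) -> str:
    """Format retrieved facts and episodes into a system prompt."""
    facts = "\n".join(f"- {f['text']}" for f in mem.get("facts", []))
    episodes = "\n".join(f"- {ep['text']}" for ep in mem.get("episodes", []))
    system = "You are a helpful assistant with memory.\n"
    if facts:
        system += f"\nKnown facts:\n{facts}"
    if episodes:
        system += f"\nBehavioral episodes:\n{episodes}"
    return system


# Hand-written stand-in for a search result.
mem = {
    "facts": [{"text": "User prefers WhatsApp communication"}],
    "episodes": [{"text": "User consistently avoids email"}],
}
print(build_system_prompt(mem))
```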
More examples in /examples:
- quickstart.py: fully local, zero API keys (Ollama)
- openai_agent.py: OpenAI agent loop
Storage Backends
# SQLite (default) — zero config, starts instantly
v = Vektori()
# PostgreSQL + pgvector — production scale
v = Vektori(database_url="postgresql://localhost:5432/vektori")
# Neo4j — native graph traversal for Episode layer
v = Vektori(
    storage_backend="neo4j",
    database_url="bolt://localhost:7687",
    embedding_dimension=1024,  # must match your embedding model
)
# Qdrant: dedicated vector DB, cloud-ready
v = Vektori(
    storage_backend="qdrant",
    database_url="http://localhost:6333",
    embedding_dimension=1024,
)
# Qdrant Cloud
v = Vektori(
    storage_backend="qdrant",
    database_url="https://your-cluster.qdrant.io",
    qdrant_api_key="your-api-key",
    embedding_dimension=1024,
)
# In-memory — tests / CI
v = Vektori(storage_backend="memory")
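Because `embedding_dimension` must match the embedding model, a mismatch is worth catching before any data is written. The sketch below checks a value against the default output sizes of a few models named in this README; `KNOWN_DIMENSIONS` and `check_dimension` are hypothetical helpers, and whether Vektori performs a similar check itself is not stated here.

```python
# Default output sizes for some of the embedding models named in this README.
KNOWN_DIMENSIONS = {
    "openai:text-embedding-3-small": 1536,
    "ollama:nomic-embed-text": 768,
    "sentence-transformers:all-MiniLM-L6-v2": 384,
    "bge:BAAI/bge-m3": 1024,
}


def check_dimension(embedding_model: str, embedding_dimension: int) -> None:
    """Raise ValueError if a known model is paired with the wrong dimension."""
    expected = KNOWN_DIMENSIONS.get(embedding_model)
    if expected is not None and expected != embedding_dimension:
        raise ValueError(
            f"{embedding_model} produces {expected}-dim vectors, "
            f"but embedding_dimension={embedding_dimension}"
        )


check_dimension("bge:BAAI/bge-m3", 1024)  # matches, so nothing is raised
```

Models missing from the table pass through unchecked, so the helper never blocks a configuration it knows nothing about.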
All backends via Docker:
git clone https://github.com/vektori-ai/vektori
cd vektori
docker compose up -d # starts Postgres, Neo4j, and Qdrant
# Postgres
DATABASE_URL=postgresql://vektori:vektori@localhost:5432/vektori python examples/quickstart_postgres.py
# Neo4j
VEKTORI_STORAGE_BACKEND=neo4j VEKTORI_DATABASE_URL=bolt://localhost:7687 vektori add "I prefer dark mode" --user-id u1
# Qdrant
VEKTORI_STORAGE_BACKEND=qdrant VEKTORI_DATABASE_URL=http://localhost:6333 vektori add "I prefer dark mode" --user-id u1
CLI storage flags:
vektori config --storage-backend qdrant --database-url http://localhost:6333
vektori add "my note" --user-id u1
vektori search "preferences" --user-id u1
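The same environment variables used in the CLI commands above can drive backend selection in Python code. This sketch maps them onto the constructor keywords shown earlier; `backend_kwargs` is a hypothetical helper, and it is an assumption that you want to pass these values explicitly (the README does not say whether the `Vektori` constructor reads them on its own).

```python
import os


def backend_kwargs(env=None) -> dict:
    """Build Vektori constructor kwargs from VEKTORI_* environment variables."""
    env = os.environ if env is None else env
    kwargs = {}
    backend = env.get("VEKTORI_STORAGE_BACKEND")
    url = env.get("VEKTORI_DATABASE_URL")
    if backend:
        kwargs["storage_backend"] = backend
    if url:
        kwargs["database_url"] = url
    return kwargs


# With neither variable set this returns {}, which leaves the SQLite default.
print(backend_kwargs({"VEKTORI_STORAGE_BACKEND": "qdrant",
                      "VEKTORI_DATABASE_URL": "http://localhost:6333"}))
# Intended use (not executed here): v = Vektori(**backend_kwargs())
```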
Model Support
Bring whatever model stack you have. Works with 10 providers out of the box.
# OpenAI
v = Vektori(
    embedding_model="openai:text-embedding-3-small",
    extraction_model="openai:gpt-4o-mini",
)
# Azure OpenAI
# Ensure AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_API_KEY are set
# Note: The string after "azure:" must match your specific Azure deployment names
v = Vektori(
    embedding_model="azure:my-embedding-deployment",
    extraction_model="azure:my-gpt-4o-deployment",
)
# GitHub Models (Copilot)
# Requires GITHUB_TOKEN. You can get one by running `./scripts/get_github_token.sh`
v = Vektori(
    embedding_model="github:text-embedding-3-small",
    extraction_model="github:gpt-4o",
)
# Anthropic
v = Vektori(
    embedding_model="anthropic:voyage-3",
    extraction_model="anthropic:claude-haiku-4-5-20251001",
)
# Fully local, no API keys, no internet
v = Vektori(
    embedding_model="ollama:nomic-embed-text",
    extraction_model="ollama:llama3",
)
# Sentence Transformers (local, no Ollama required)
v = Vektori(embedding_model="sentence-transformers:all-MiniLM-L6-v2")
# BGE-M3 — multilingual, 1024-dim, best local embeddings we've found
v = Vektori(embedding_model="bge:BAAI/bge-m3")
# LiteLLM — 100+ providers through one interface
v = Vektori(extraction_model="litellm:groq/llama3-8b-8192")
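Every model string above follows a `provider:model` convention, where the model part may itself contain slashes or further colons. The sketch below shows one way to split such a string for validation or logging in your own code; `split_model` is a hypothetical helper, not Vektori's internal parser.

```python
def split_model(spec: str) -> tuple[str, str]:
    """Split 'provider:model' on the first colon only."""
    provider, sep, model = spec.partition(":")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider:model', got {spec!r}")
    return provider, model


for spec in ("openai:gpt-4o-mini", "bge:BAAI/bge-m3", "litellm:groq/llama3-8b-8192"):
    print(split_model(spec))
```

Splitting on the first colon only keeps nested identifiers such as `groq/llama3-8b-8192` intact.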
Why Not Mem0 / Zep?
| | Mem0 / Zep | Vektori |
|---|---|---|
| Memory model | Entity-relation triples | Three-layer sentence graph |
| What you get | The answer | The answer + reasoning + story |
| Patterns beyond facts | Manual graph queries | Auto-discovered (Episode layer) |
| Default backend | Requires external DB | SQLite, zero config |
| Fully local / offline | No | Yes (Ollama, BGE-M3, SentenceTransformers) |
| License | Partial OSS | Apache 2.0 |
Mem0 and Zep are solid tools. But they compress conversations into triples, so you get the what but not the why or how it changed over time. That matters when you're building agents that need to reason about a person's trajectory, not just their current state.
Contributing
Issues, PRs, and ideas welcome. See CONTRIBUTING.md.
git clone https://github.com/vektori-ai/vektori
cd vektori
pip install -e ".[dev]"
pytest tests/unit/
License
Apache 2.0. See LICENSE.