Open-source, self-hostable memory engine for AI agents using a three-layer sentence graph architecture.

Maintainers

vektori

Vektori

Memory that remembers the story, not just the facts.

Most memory systems compress conversations into entity-relationship triples. You get the answer, but you lose the texture, the reasoning, the trajectory. Vektori uses a three-layer sentence graph so agents don't just recall preferences, they understand how things got there.

FACT LAYER (L0)      <- vector search surface. Short, crisp statements.
        |
EPISODE LAYER (L1)   <- patterns auto-discovered via graph traversal.
        |
SENTENCE LAYER (L2)  <- raw conversation. Sequential NEXT edges. The full story.

Three-layer memory graph: Facts → Episodes → Sentences

Search hits Facts, graph traversal discovers Episodes, and results trace back to their source Sentences. SQLite is the default; swap to Postgres, Neo4j, Qdrant, or Milvus when you're ready to scale.
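To make the three layers concrete, here is a toy sketch of the Fact → Episode → Sentence trace in plain Python. It is not Vektori's internal data model: the class names, fields, and the `trace` helper are all hypothetical and exist only to illustrate the shape of the graph described above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Sentence:
    """L2 node: one raw utterance, linked to the following one by a NEXT edge."""
    id: int
    text: str
    next_id: Optional[int] = None  # hypothetical field standing in for the NEXT edge


@dataclass
class Episode:
    """L1 node: a pattern that points back at the sentences supporting it."""
    text: str
    sentence_ids: list = field(default_factory=list)


@dataclass
class Fact:
    """L0 node: a short statement, the surface a vector search would hit."""
    text: str
    episodes: list = field(default_factory=list)


def trace(fact: Fact, sentences: dict) -> list:
    """Walk Fact -> Episodes -> Sentences and return source text in order."""
    ids = sorted({sid for ep in fact.episodes for sid in ep.sentence_ids})
    return [sentences[sid].text for sid in ids]


sentences = {
    0: Sentence(0, "I only use WhatsApp, please don't email me.", next_id=1),
    1: Sentence(1, "Got it, WhatsApp only.", next_id=2),
    2: Sentence(2, "My outstanding amount is ₹45,000 and I can pay by Friday."),
}
episode = Episode("User consistently avoids email", sentence_ids=[0, 1])
fact = Fact("User prefers WhatsApp communication", episodes=[episode])

for line in trace(fact, sentences):
    print(line)
```

The point of the sketch is the direction of travel: a short fact is what gets matched, and the raw sentences behind it stay reachable.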

Benchmarks

Benchmark	Score	Depth	Models
LongMemEval-S	73%	L1	BGE-M3 + Gemini Flash

Still improving. Run your own in /benchmarks.

Install

pip install vektori                      # SQLite + Postgres
pip install 'vektori[neo4j]'             # + Neo4j support
pip install 'vektori[qdrant]'            # + Qdrant support
pip install 'vektori[milvus]'            # + Milvus support
pip install 'vektori[neo4j,qdrant,milvus]'  # all backends

No Docker, no external services. SQLite by default.

30-Second Quickstart

import asyncio
from vektori import Vektori

async def main():
    v = Vektori(
        embedding_model="openai:text-embedding-3-small",
        extraction_model="openai:gpt-4o-mini",
    )

    await v.add(
        messages=[
            {"role": "user", "content": "I only use WhatsApp, please don't email me."},
            {"role": "assistant", "content": "Got it, WhatsApp only."},
            {"role": "user", "content": "My outstanding amount is ₹45,000 and I can pay by Friday."},
        ],
        session_id="call-001",
        user_id="user-123",
    )

    results = await v.search(
        query="How does this user prefer to communicate?",
        user_id="user-123",
        depth="l1",  # facts + episodes
    )

    for fact in results["facts"]:
        print(f"[{fact['score']:.2f}] {fact['text']}")
    for episode in results["episodes"]:
        print(f"episode: {episode['text']}")

    await v.close()

asyncio.run(main())

Output:

[0.94] User prefers WhatsApp communication
[0.81] Outstanding balance of ₹45,000, payment expected Friday
episode: User consistently avoids email — route all comms to WhatsApp
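If you only want to act on high-confidence facts, one option is to filter the result dict before using it. The sketch below assumes the result shape shown in the quickstart (`facts` entries with `score` and `text`) and assumes a higher score means a closer match. The `confident_facts` helper and the 0.85 threshold are illustrative, not part of Vektori.

```python
def confident_facts(results: dict, min_score: float = 0.85) -> list:
    """Return fact texts at or above min_score, highest score first."""
    facts = sorted(results.get("facts", []), key=lambda f: f["score"], reverse=True)
    return [f["text"] for f in facts if f["score"] >= min_score]


# Sample data mirroring the quickstart output above.
sample = {
    "facts": [
        {"score": 0.81, "text": "Outstanding balance of ₹45,000, payment expected Friday"},
        {"score": 0.94, "text": "User prefers WhatsApp communication"},
    ],
    "episodes": [
        {"text": "User consistently avoids email — route all comms to WhatsApp"},
    ],
}

print(confident_facts(sample))
```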

Retrieval Depths

Pick how deep you want to go.

Depth	Returns	~Tokens	When to use
`l0`	Facts only	50-200	Fast lookup, agent planning, tool calls
`l1`	Facts + Episodes	200-500	Default. Full answer with context
`l2`	Facts + Episodes + raw Sentences	1000-3000	Trajectory analysis, full story replay

# Just the facts
results = await v.search(query=query, user_id=user_id, depth="l0")

# Facts + episodes (recommended)
results = await v.search(query=query, user_id=user_id, depth="l1")

# Everything, with surrounding conversation context
results = await v.search(query=query, user_id=user_id, depth="l2", context_window=3)
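If your agent works under a fixed context budget, the depth can be chosen from that budget. This is a minimal sketch that uses the upper token estimates from the depth table as rough guides. The `pick_depth` helper and the `DEPTH_MAX_TOKENS` mapping are hypothetical, and real token counts will vary with your data.

```python
# Upper token estimates taken from the depth table; treat them as rough guides.
DEPTH_MAX_TOKENS = {"l0": 200, "l1": 500, "l2": 3000}


def pick_depth(token_budget: int) -> str:
    """Return the deepest depth whose upper estimate fits in the budget."""
    choice = "l0"  # fall back to the cheapest depth for very small budgets
    for depth in ("l0", "l1", "l2"):
        if DEPTH_MAX_TOKENS[depth] <= token_budget:
            choice = depth
    return choice


for budget in (100, 600, 4000):
    print(budget, pick_depth(budget))
```

The returned string can then be passed as the `depth` argument to `v.search`.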

Build an Agent with Memory

Four steps wire memory into any agent loop: pull relevant memory, inject it into the system prompt, get the response, and store the exchange:

import asyncio
from openai import AsyncOpenAI
from vektori import Vektori

client = AsyncOpenAI()

async def chat(user_id: str):
    v = Vektori(
        embedding_model="openai:text-embedding-3-small",
        extraction_model="openai:gpt-4o-mini",
    )
    session_id = f"session-{user_id}-001"
    history = []

    print("Chat with memory (type 'quit' to exit)\n")
    while True:
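        # input() blocks, so run it in a worker thread to keep the event loop free
        user_input = (await asyncio.to_thread(input, "You: ")).strip()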
        if user_input.lower() == "quit":
            break

        # 1. Pull relevant memory
        mem = await v.search(query=user_input, user_id=user_id, depth="l1")
        facts = "\n".join(f"- {f['text']}" for f in mem.get("facts", []))
        episodes = "\n".join(f"- {ep['text']}" for ep in mem.get("episodes", []))

        # 2. Inject into system prompt
        system = "You are a helpful assistant with memory.\n"
        if facts:    system += f"\nKnown facts:\n{facts}"
        if episodes: system += f"\nBehavioral episodes:\n{episodes}"

        # 3. Get response
        history.append({"role": "user", "content": user_input})
        resp = await client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "system", "content": system}, *history],
        )
        reply = resp.choices[0].message.content
        history.append({"role": "assistant", "content": reply})
        print(f"Assistant: {reply}\n")

        # 4. Store exchange
        await v.add(
            messages=[{"role": "user", "content": user_input},
                      {"role": "assistant", "content": reply}],
            session_id=session_id,
            user_id=user_id,
        )

    await v.close()

asyncio.run(chat("demo-user"))
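Steps 1 and 2 of the loop can be pulled into a pure function, which makes the prompt assembly easy to unit-test without calling any model. The sketch below reuses the same strings as the loop above. The `build_system_prompt` name is an illustrative choice, and it assumes the `facts` / `episodes` result shape shown in the quickstart.

```python
BASE_PROMPT = "You are a helpful assistant with memory.\n"


def build_system_prompt(mem: dict, base: str = BASE_PROMPT) -> str:
    """Append retrieved facts and episodes to the base system prompt."""
    facts = "\n".join(f"- {f['text']}" for f in mem.get("facts", []))
    episodes = "\n".join(f"- {ep['text']}" for ep in mem.get("episodes", []))

    system = base
    if facts:
        system += f"\nKnown facts:\n{facts}"
    if episodes:
        system += f"\nBehavioral episodes:\n{episodes}"
    return system


mem = {
    "facts": [{"text": "User prefers WhatsApp communication"}],
    "episodes": [{"text": "User consistently avoids email"}],
}
print(build_system_prompt(mem))
```

When memory is empty, the function returns the base prompt unchanged, so a first-time user gets a plain assistant.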

More examples in /examples:

quickstart.py — fully local, zero API keys (Ollama)
openai_agent.py — OpenAI agent loop

Storage Backends

# SQLite (default) — zero config, starts instantly
v = Vektori()

# PostgreSQL + pgvector — production scale
v = Vektori(database_url="postgresql://localhost:5432/vektori")

# Neo4j — native graph traversal for Episode layer
v = Vektori(
    storage_backend="neo4j",
    database_url="bolt://localhost:7687",
    embedding_dimension=1024,   # must match your embedding model
)

# Qdrant — dedicated vector DB, cloud-ready
v = Vektori(
    storage_backend="qdrant",
    database_url="http://localhost:6333",
    embedding_dimension=1024,
)

# Qdrant Cloud
v = Vektori(
    storage_backend="qdrant",
    database_url="https://your-cluster.qdrant.io",
    qdrant_api_key="your-api-key",
    embedding_dimension=1024,
)

# Milvus — high-scale vector store with partition-key isolation
v = Vektori(
    storage_backend="milvus",
    database_url="http://localhost:19530",
    embedding_dimension=1024,
)

# Milvus / Zilliz Cloud
v = Vektori(
    storage_backend="milvus",
    database_url="https://your-cluster-endpoint",
    milvus_token="your-api-key-or-token",
    embedding_dimension=1024,
)

# In-memory — tests / CI
v = Vektori(storage_backend="memory")
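In an application you may want to pick the backend from the environment rather than hard-coding it. The sketch below builds the keyword arguments for `Vektori(...)` from `VEKTORI_STORAGE_BACKEND` and `VEKTORI_DATABASE_URL`, the same variable names the CLI examples in this README use. The `backend_kwargs` helper is hypothetical, and whether the library reads these variables on its own is not assumed here. It only emits keyword names that appear in the snippets above.

```python
import os
from typing import Mapping

# Backends whose examples above pass embedding_dimension explicitly.
NEEDS_DIMENSION = {"neo4j", "qdrant", "milvus"}


def backend_kwargs(env: Mapping = os.environ, embedding_dimension: int = 1024) -> dict:
    """Translate environment settings into keyword arguments for Vektori(...)."""
    backend = env.get("VEKTORI_STORAGE_BACKEND")
    url = env.get("VEKTORI_DATABASE_URL")

    kwargs = {}
    if backend:
        kwargs["storage_backend"] = backend
    if url:
        kwargs["database_url"] = url
    if backend in NEEDS_DIMENSION:
        # Must match the output size of your embedding model.
        kwargs["embedding_dimension"] = embedding_dimension
    return kwargs


print(backend_kwargs({
    "VEKTORI_STORAGE_BACKEND": "qdrant",
    "VEKTORI_DATABASE_URL": "http://localhost:6333",
}))
```

An empty result means "use the default": `Vektori(**backend_kwargs())` with nothing set is equivalent to `Vektori()`, which starts on SQLite.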

All backends via Docker:

git clone https://github.com/vektori-ai/vektori
cd vektori
docker compose up -d                 # starts Postgres, Neo4j, Qdrant, and Milvus

# Postgres
DATABASE_URL=postgresql://vektori:vektori@localhost:5432/vektori python examples/quickstart_postgres.py

# Neo4j
VEKTORI_STORAGE_BACKEND=neo4j VEKTORI_DATABASE_URL=bolt://localhost:7687 vektori add "I prefer dark mode" --user-id u1

# Qdrant
VEKTORI_STORAGE_BACKEND=qdrant VEKTORI_DATABASE_URL=http://localhost:6333 vektori add "I prefer dark mode" --user-id u1

# Milvus
VEKTORI_STORAGE_BACKEND=milvus VEKTORI_DATABASE_URL=http://localhost:19530 vektori add "I prefer dark mode" --user-id u1

# Milvus Cloud
MILVUS_TOKEN=your-api-key VEKTORI_STORAGE_BACKEND=milvus VEKTORI_DATABASE_URL=https://your-cluster-endpoint vektori add "I prefer dark mode" --user-id u1

CLI storage flags:

vektori config --storage-backend qdrant --database-url http://localhost:6333
vektori config --storage-backend milvus --database-url http://localhost:19530
vektori add "my note" --user-id u1
vektori search "preferences" --user-id u1

Model Support

Bring whatever model stack you have. Works with 10 providers out of the box.

# OpenAI
v = Vektori(
    embedding_model="openai:text-embedding-3-small",
    extraction_model="openai:gpt-4o-mini",
)

# Azure OpenAI
# Ensure AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_API_KEY are set
# Note: The string after "azure:" must match your specific Azure deployment names
v = Vektori(
    embedding_model="azure:my-embedding-deployment",
    extraction_model="azure:my-gpt-4o-deployment",
)

# GitHub Models (Copilot)
# Requires GITHUB_TOKEN. You can get one by running `./scripts/get_github_token.sh`
v = Vektori(
    embedding_model="github:text-embedding-3-small",
    extraction_model="github:gpt-4o",
)

# Anthropic
v = Vektori(
    embedding_model="anthropic:voyage-3",
    extraction_model="anthropic:claude-haiku-4-5-20251001",
)

# Fully local, no API keys, no internet
v = Vektori(
    embedding_model="ollama:nomic-embed-text",
    extraction_model="ollama:llama3",
)

# Sentence Transformers (local, no Ollama required)
v = Vektori(embedding_model="sentence-transformers:all-MiniLM-L6-v2")

# BGE-M3 — multilingual, 1024-dim, best local embeddings we've found
v = Vektori(embedding_model="bge:BAAI/bge-m3")

# LiteLLM — 100+ providers through one interface
v = Vektori(extraction_model="litellm:groq/llama3-8b-8192")

NVIDIA NIM: GPU-optimized embedding and LLM models.

# NVIDIA embedding models (Matryoshka: 384-2048 dimensions)
v = Vektori(
    embedding_model="nvidia:llama-nemotron-embed-1b-v2",
    embedding_dimension=1024,  # Optional: 384, 512, 768, 1024, or 2048
)

# NVIDIA LLM models (nvidia/ prefix auto-added)
v = Vektori(extraction_model="nvidia:llama-3.3-nemotron-super-49b-v1")

# Third-party models hosted on NVIDIA NIM (use full path)
v = Vektori(extraction_model="nvidia:z-ai/glm5")

Why Not Mem0 / Zep?

Feature	Mem0 / Zep	Vektori
Memory model	Entity-relation triples	Three-layer sentence graph
What you get	The answer	The answer + reasoning + story
Patterns beyond facts	Manual graph queries	Auto-discovered (Episode layer)
Default backend	Requires external DB	SQLite, zero config
Fully local / offline	No	Yes (Ollama, BGE-M3, SentenceTransformers)
License	Partial OSS	Apache 2.0

Mem0 and Zep are solid tools. But they compress conversations into triples, so you get the what but not the why or how it changed over time. That matters when you're building agents that need to reason about a person's trajectory, not just their current state.

Contributing

Issues, PRs, and ideas welcome. See CONTRIBUTING.md.

git clone https://github.com/vektori-ai/vektori
cd vektori
pip install -e ".[dev]"
pytest tests/unit/

License

Apache 2.0. See LICENSE.

Project details

Release history

0.1.2.dev121 pre-release

May 5, 2026

0.1.2.dev120 pre-release

May 5, 2026

0.1.2.dev116 pre-release

May 5, 2026

0.1.2.dev115 pre-release

May 4, 2026

0.1.2.dev114 pre-release

May 4, 2026

0.1.2.dev103 pre-release

May 1, 2026

0.1.2.dev102 pre-release

Apr 30, 2026

0.1.2.dev94 pre-release

Apr 30, 2026

0.1.2.dev93 pre-release

Apr 29, 2026

0.1.2.dev92 pre-release

Apr 29, 2026

0.1.2.dev91 pre-release

Apr 29, 2026

0.1.2.dev90 pre-release

Apr 29, 2026

0.1.2.dev89 pre-release

Apr 29, 2026

0.1.2.dev88 pre-release

Apr 25, 2026

0.1.2.dev87 pre-release

Apr 21, 2026

0.1.2.dev85 pre-release

Apr 21, 2026

0.1.2.dev79 pre-release

Apr 19, 2026

0.1.2.dev66 pre-release

Apr 17, 2026

0.1.2.dev55 pre-release

Apr 16, 2026

0.1.2.dev48 pre-release

Apr 14, 2026

0.1.2.dev38 pre-release

Apr 14, 2026

This version

0.1.2.dev37 pre-release

Apr 14, 2026

0.1.2.dev33 pre-release

Apr 13, 2026

0.1.2.dev27 pre-release

Apr 13, 2026

0.1.2.dev21 pre-release

Apr 11, 2026

0.1.2.dev20 pre-release

Apr 11, 2026

0.1.2.dev18 pre-release

Apr 10, 2026

0.1.2.dev14 pre-release

Apr 10, 2026

0.1.1

Mar 18, 2026

0.1.0

Feb 8, 2026

Download files

Source Distribution

vektori-0.1.2.dev37.tar.gz (299.8 kB view details)

Uploaded Apr 14, 2026 Source

Built Distribution

vektori-0.1.2.dev37-py3-none-any.whl (132.8 kB view details)

Uploaded Apr 14, 2026 Python 3

File details

Details for the file vektori-0.1.2.dev37.tar.gz.

File metadata

Download URL: vektori-0.1.2.dev37.tar.gz
Upload date: Apr 14, 2026
Size: 299.8 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for vektori-0.1.2.dev37.tar.gz
Algorithm	Hash digest
SHA256	`d6d4c125c053013e821164ff94bbefb11ad339619fccd62cc74b52e562bbc2b0`
MD5	`63787b7783e232f165edbe5f83a2ecd9`
BLAKE2b-256	`af0e003c74106576e948be6b8068c0d6d3028ade1af2a559fd74bb538862f3a1`

Provenance

The following attestation bundles were made for vektori-0.1.2.dev37.tar.gz:

Publisher: publish.yml on vektori-ai/vektori

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: vektori-0.1.2.dev37.tar.gz
- Subject digest: d6d4c125c053013e821164ff94bbefb11ad339619fccd62cc74b52e562bbc2b0
- Sigstore transparency entry: 1293528343
- Sigstore integration time: Apr 14, 2026
Source repository:
- Permalink: vektori-ai/vektori@1692c98643c5153804e228e90dbffc6c89f6f68b
- Branch / Tag: refs/heads/main
- Owner: https://github.com/vektori-ai
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@1692c98643c5153804e228e90dbffc6c89f6f68b
- Trigger Event: push

File details

Details for the file vektori-0.1.2.dev37-py3-none-any.whl.

File metadata

Download URL: vektori-0.1.2.dev37-py3-none-any.whl
Upload date: Apr 14, 2026
Size: 132.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for vektori-0.1.2.dev37-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5fd4b40b84674c0b2d20e37a4c8571b9e1e2f38fecce37e8836ed894d038b4ea`
MD5	`15d9540597ef6c4e68ce9822a4539870`
BLAKE2b-256	`eb42ae8b7496a3e6ff64f27f5a415eef8da2bc4ac3de4eb3cbcad315841c5c80`

Provenance

The following attestation bundles were made for vektori-0.1.2.dev37-py3-none-any.whl:

Publisher: publish.yml on vektori-ai/vektori

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: vektori-0.1.2.dev37-py3-none-any.whl
- Subject digest: 5fd4b40b84674c0b2d20e37a4c8571b9e1e2f38fecce37e8836ed894d038b4ea
- Sigstore transparency entry: 1293528373
- Sigstore integration time: Apr 14, 2026
Source repository:
- Permalink: vektori-ai/vektori@1692c98643c5153804e228e90dbffc6c89f6f68b
- Branch / Tag: refs/heads/main
- Owner: https://github.com/vektori-ai
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@1692c98643c5153804e228e90dbffc6c89f6f68b
- Trigger Event: push

vektori 0.1.2.dev37
