ZettelForge: Agentic Memory System with vector search, knowledge graph, and synthesis

Requires Python 3.10+. License: MIT.

A production-grade memory system for AI agents, purpose-built for cyber threat intelligence (CTI). Combines LanceDB vector search with a knowledge graph, blended retrieval, and intent-based query routing to give agents persistent, structured memory across sessions.

Built by Threatengram.

Quick Start

pip install -e .

from zettelforge import MemoryManager

mm = MemoryManager()

# Store a memory
note, status = mm.remember(
    "APT28 uses Cobalt Strike for lateral movement via CVE-2024-1111",
    domain="cti"
)

# Two-phase extraction — LLM extracts facts, decides ADD/UPDATE/DELETE
results = mm.remember_with_extraction(
    "APT28 has shifted tactics. They dropped DROPBEAR and now exploit edge devices.",
    domain="cti"
)

# Retrieve — blends vector similarity + knowledge graph traversal
results = mm.recall("What tools does APT28 use?", k=10)

# Alias resolution works automatically
results = mm.recall_actor("Fancy Bear")  # resolves to APT28

# Synthesize answers from memory
result = mm.synthesize("Summarize APT28 activity")

No external services are required: embeddings run in-process via fastembed (ONNX), the knowledge graph persists to JSONL files, and the LLM runs locally through llama-cpp-python, so everything can run on a single laptop.
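
To make the alias step concrete, here is a minimal, standalone sketch of how a name such as "Fancy Bear" could be normalized to a canonical actor before lookup. The alias table and the `resolve_actor` helper are hypothetical; they do not reflect ZettelForge's internal data or API.

```python
# Hypothetical alias table: lowercase alias -> canonical actor name.
ACTOR_ALIASES = {
    "fancy bear": "APT28",
    "sofacy": "APT28",
    "cozy bear": "APT29",
}


def resolve_actor(name: str) -> str:
    """Return the canonical actor for a known alias, else the cleaned input."""
    cleaned = " ".join(name.split())  # trim and collapse repeated whitespace
    return ACTOR_ALIASES.get(cleaned.lower(), cleaned)


print(resolve_actor("Fancy  Bear"))  # APT28
```

Normalizing case and whitespace before the lookup keeps the table small; unknown names pass through unchanged so a later lookup can still try them as-is.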

Features

  • Two-Phase Extraction Pipeline: Mem0-style selective ingestion. The LLM extracts salient facts with importance scores, then decides ADD/UPDATE/DELETE/NOOP for each fact
  • Blended Retrieval: Combines vector similarity + knowledge graph traversal, weighted by query intent (factual, temporal, relational, causal, exploratory)
  • Cross-Encoder Reranking: ms-marco-MiniLM reranks results by query-document relevance
  • Entity Extraction: Automatic indexing of 10 entity types, including CVEs, threat actors, tools, campaigns, people, locations, organizations, and events (regex + LLM NER)
  • Causal Triple Extraction: LLM infers relationships ("APT28 uses DROPBEAR") and stores them as graph edges
  • Knowledge Graph: Entity nodes, relationship edges, JSONL persistence with append-only writes
  • RAG Synthesis: Answer generation from retrieved memories
  • Intent Classification: Adaptive query routing with per-intent retrieval weights
  • Zero-Server Embeddings: 768-dim vectors generated in-process via fastembed (ONNX, 7ms/embed)
  • Local-First: Runs entirely on local hardware, with no cloud dependencies and no API keys
  • MCP Server: Expose memory as tools for Claude Code, OpenClaw, or any MCP-compatible agent
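
The blended-retrieval and intent-classification bullets above can be illustrated with a small, self-contained sketch. It is not ZettelForge's implementation: the weight table, the `blend_scores` helper, and the score values are all hypothetical, chosen only to show how per-intent weights could combine a vector-similarity score with a graph-traversal score.

```python
# Hypothetical per-intent weights (vector_weight, graph_weight); illustrative only.
INTENT_WEIGHTS = {
    "factual": (0.7, 0.3),
    "temporal": (0.6, 0.4),
    "relational": (0.3, 0.7),
    "causal": (0.4, 0.6),
    "exploratory": (0.5, 0.5),
}


def blend_scores(vector_scores, graph_scores, intent, k=10):
    """Combine two {note_id: score} maps into a ranked list of (note_id, score)."""
    w_vec, w_graph = INTENT_WEIGHTS.get(intent, (0.5, 0.5))
    note_ids = set(vector_scores) | set(graph_scores)
    blended = {
        nid: w_vec * vector_scores.get(nid, 0.0) + w_graph * graph_scores.get(nid, 0.0)
        for nid in note_ids
    }
    # Highest score first; ties broken by note id so the order is deterministic.
    return sorted(blended.items(), key=lambda item: (-item[1], item[0]))[:k]


vector_scores = {"note-1": 0.9, "note-2": 0.4}
graph_scores = {"note-2": 1.0, "note-3": 0.8}
ranked = blend_scores(vector_scores, graph_scores, intent="relational")
print([note_id for note_id, _ in ranked])  # ['note-2', 'note-3', 'note-1']
```

With these made-up numbers, a relational query favors the graph hits, while a factual query would put `note-1` first because the vector score dominates.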

Community vs Enterprise

ZettelForge Community (MIT) includes everything above. It is a complete, production-ready memory system.

ThreatRecall Enterprise (BSL-1.1) adds scale, analyst workflows, and platform integrations for teams running ZettelForge in production:

| Enterprise Feature | What it adds |
| --- | --- |
| TypeDB STIX 2.1 ontology | Replaces JSONL graph with TypeDB: inference rules, 9 entity types, 8 relation types, 36 seeded CTI aliases |
| Temporal KG queries | "What changed since Tuesday?": get_changes_since(), get_entity_timeline() |
| Multi-hop graph traversal | traverse_graph() with BFS across relationship chains |
| Advanced synthesis formats | synthesized_brief, timeline_analysis, relationship_map |
| Report ingestion | remember_report() with auto-chunking for long threat reports |
| OpenCTI integration | Bi-directional sync with OpenCTI platform |
| Sigma rule generation | Generate Sigma YAML detection rules from IOCs |
| Multi-tenant auth | OAuth/JWT with per-tenant data isolation |
| Proactive context injection | Auto-load relevant context before agent tasks |

# Enterprise install (requires license key)
pip install -e ".[enterprise]"
export THREATENGRAM_LICENSE_KEY="TG-xxxx-xxxx-xxxx-xxxx"
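
The multi-hop traversal row in the table above mentions BFS across relationship chains. The following is a generic breadth-first sketch over an in-memory adjacency list, not the Enterprise `traverse_graph()` code; the `GRAPH` data, the `traverse` helper, and the tuple layout are assumptions made for illustration.

```python
from collections import deque

# Hypothetical adjacency list: entity -> list of (relation, neighbor) edges.
# The relationships are made-up sample data, not threat intelligence.
GRAPH = {
    "apt28": [("uses", "cobalt-strike"), ("exploits", "CVE-2024-1111")],
    "cobalt-strike": [("used_by", "apt29")],
    "apt29": [("targets", "nato")],
}


def traverse(graph, start, max_depth=2):
    """Breadth-first walk returning (source, relation, target, depth) tuples."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand nodes that sit at the depth limit
        for relation, neighbor in graph.get(node, []):
            found.append((node, relation, neighbor, depth + 1))
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return found


for edge in traverse(GRAPH, "apt28", max_depth=2):
    print(edge)
```

The `seen` set prevents revisiting entities when the graph contains cycles, and the depth check bounds how far a query fans out.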

Architecture

┌──────────────────────────────────────────────────────────────────────┐
│                           MemoryManager                              │
│  remember()  remember_with_extraction()  recall()  synthesize()      │
├───────────┬───────────┬─────────────┬───────────┬────────────────────┤
│   Note    │   Fact    │   Memory    │  Blended  │   Synthesis        │
│Constructor│ Extractor │   Updater   │ Retriever │   Generator        │
│ (enrich)  │ (Phase 1) │  (Phase 2)  │(vec+graph)│   (RAG)            │
├───────────┴───────────┴─────────────┼───────────┴────────────────────┤
│       Entity Indexer + Alias        │  Intent Classifier             │
│       Resolver                      │  (factual/temporal/causal)     │
├─────────────────────────────────────┼────────────────────────────────┤
│   Knowledge Graph (JSONL)           │  LanceDB (Vectors)             │
│   Entity nodes + relationship edges │  768-dim fastembed embeddings  │
│   Temporal indexing                 │  Zettelkasten notes            │
│   [Enterprise: TypeDB STIX 2.1]     │  IVF_PQ index                  │
└─────────────────────────────────────┴────────────────────────────────┘
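
The bottom-left box of the diagram describes a knowledge graph persisted as JSONL with append-only writes. As a rough sketch of that storage pattern (the file name, record fields, and helper names are assumptions, not ZettelForge's actual schema), each edge can be written as one JSON object per line:

```python
import json
import tempfile
from pathlib import Path


def append_edge(path: Path, subject: str, relation: str, obj: str) -> None:
    """Append one relationship edge as a single JSON line; never rewrite the file."""
    record = {"subject": subject, "relation": relation, "object": obj}
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")


def load_edges(path: Path) -> list[dict]:
    """Read every edge back in write order, skipping blank lines."""
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    edges_file = Path(tmp) / "edges.jsonl"  # hypothetical file name
    append_edge(edges_file, "APT28", "uses", "DROPBEAR")
    append_edge(edges_file, "APT28", "exploits", "CVE-2024-1111")
    edges = load_edges(edges_file)

print(len(edges))  # 2
```

Appending keeps writes cheap and preserves history; the trade-off is that readers must replay the file to reconstruct current state.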

API Reference

Store

# Direct storage
note, status = mm.remember("APT28 targets NATO", domain="cti")

# Two-phase extraction (LLM-powered)
results = mm.remember_with_extraction(content, domain="cti", min_importance=3)

# Report ingestion [Enterprise]
results = mm.remember_report(content, source_url="...", published_date="2026-04-09")
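
To show what the `min_importance` threshold and the ADD/UPDATE/DELETE/NOOP decision could look like, here is a rule-based stand-in for the second phase. In ZettelForge an LLM makes these decisions; the `Fact` class, its fields, the importance scale, and the `decide` and `apply_facts` helpers below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Fact:
    key: str          # hypothetical identity of the fact, e.g. "apt28:malware"
    text: str
    importance: int   # hypothetical score; higher means more salient
    retracted: bool = False


def decide(fact: Fact, memory: dict[str, str], min_importance: int = 3) -> str:
    """Rule-based stand-in for the LLM's Phase 2 decision."""
    if fact.importance < min_importance:
        return "NOOP"
    if fact.retracted:
        return "DELETE" if fact.key in memory else "NOOP"
    if fact.key not in memory:
        return "ADD"
    return "NOOP" if memory[fact.key] == fact.text else "UPDATE"


def apply_facts(facts, memory, min_importance=3):
    """Apply each decision to the memory dict and return the operations taken."""
    ops = []
    for fact in facts:
        op = decide(fact, memory, min_importance)
        if op in ("ADD", "UPDATE"):
            memory[fact.key] = fact.text
        elif op == "DELETE":
            del memory[fact.key]
        ops.append((op, fact.key))
    return ops


memory = {"apt28:malware": "APT28 uses DROPBEAR"}
facts = [
    Fact("apt28:malware", "APT28 uses DROPBEAR", 8, retracted=True),
    Fact("apt28:targeting", "APT28 exploits edge devices", 7),
    Fact("apt28:misc", "The report was long", 1),
]
ops = apply_facts(facts, memory)
print(ops)
```

Filtering on importance before any other check means low-value facts never touch stored memory, which is the point of selective ingestion.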

Retrieve

# Blended recall (vector + graph, intent-weighted)
results = mm.recall("What tools does APT28 use?", k=10)

# Entity lookups
mm.recall_cve("CVE-2024-3094")
mm.recall_actor("Fancy Bear")       # alias -> APT28
mm.recall_tool("cobalt-strike")

# Knowledge graph
mm.get_entity_relationships("actor", "apt28")
mm.traverse_graph("actor", "apt28", max_depth=2)  # [Enterprise]
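
Entity lookups such as `recall_cve` presumably rely on identifiers having been indexed when a note is stored. A minimal sketch of the regex side of that indexing, assuming a hypothetical `extract_cves` helper (ZettelForge's own patterns may differ), looks like this:

```python
import re

# A CVE ID is "CVE-", a four-digit year, "-", then at least four digits.
CVE_PATTERN = re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE)


def extract_cves(text: str) -> list[str]:
    """Return unique CVE IDs in first-seen order, upper-cased."""
    seen: dict[str, None] = {}
    for match in CVE_PATTERN.finditer(text):
        seen.setdefault(match.group(0).upper(), None)
    return list(seen)


# Made-up sample sentence for illustration.
text = "The report cites cve-2024-1111 and CVE-2024-3094; CVE-2024-1111 appears twice."
print(extract_cves(text))  # ['CVE-2024-1111', 'CVE-2024-3094']
```

Upper-casing and de-duplicating at extraction time means a later lookup can use exact matching on the canonical form.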

Synthesize

result = mm.synthesize("Summarize APT28 activity")
# Enterprise formats: "synthesized_brief", "timeline_analysis", "relationship_map"

Edition Detection

from zettelforge import is_enterprise, edition_name

print(edition_name())  # "ZettelForge Community" or "ThreatRecall Enterprise by Threatengram"

Deployment

Local Development (recommended to start)

git clone https://github.com/rolandpg/zettelforge.git
cd zettelforge
pip install -e ".[dev]"

# Run tests (no external services needed)
ZETTELFORGE_BACKEND=jsonl pytest tests/ -v --ignore=tests/test_typedb_client.py

# Quick smoke test
python3 -c "
from zettelforge import MemoryManager
mm = MemoryManager()
note, _ = mm.remember('APT28 uses Cobalt Strike', domain='cti')
print(f'Stored: {note.id}')
results = mm.recall('APT28 tools', k=3)
print(f'Recalled: {len(results)} results')
"

With Ollama (better LLM quality)

# Install Ollama, start the server, then pull a model
# (skip `ollama serve` if Ollama already runs as a background service)
ollama serve &
ollama pull qwen2.5:3b

# ZettelForge auto-detects Ollama for extraction/synthesis

With TypeDB [Enterprise]

# Start TypeDB
docker compose -f docker/docker-compose.yml up -d

# Set backend
export ZETTELFORGE_BACKEND=typedb

# Seed CTI aliases (one-time)
python3 -c "from zettelforge.schema.seed_aliases import seed_aliases; seed_aliases()"

Configuration

| Variable | Default | Description |
| --- | --- | --- |
| AMEM_DATA_DIR | ~/.amem | Data directory (LanceDB vectors + JSONL notes) |
| ZETTELFORGE_BACKEND | typedb | typedb [Enterprise] or jsonl |
| ZETTELFORGE_EMBEDDING_PROVIDER | fastembed | fastembed (in-process) or ollama |
| ZETTELFORGE_LLM_PROVIDER | local | local (llama-cpp) or ollama |
| THREATENGRAM_LICENSE_KEY | (unset) | Enterprise license key (TG-xxxx-xxxx-xxxx-xxxx) |

See config.default.yaml for all options.

Benchmarks

| Benchmark | What it measures | Score |
| --- | --- | --- |
| CTI Retrieval | Attribution, CVE linkage, multi-hop | 75.0% |
| LOCOMO (ACL 2024) | Conversational memory recall | 18.0% |
| RAGAS | Retrieval quality (keyword presence) | 78.1% |

See the full benchmark report for methodology and analysis.
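
The RAGAS row is described as measuring keyword presence. One plausible reading of such a metric, sketched here under that assumption and not taken from the project's benchmark harness, is the fraction of expected keywords that appear anywhere in the retrieved text. The `keyword_presence` function and the sample data are hypothetical.

```python
def keyword_presence(retrieved: list[str], expected_keywords: list[str]) -> float:
    """Fraction of expected keywords found (case-insensitively) in any retrieved text."""
    if not expected_keywords:
        return 0.0
    haystack = " ".join(retrieved).lower()
    hits = sum(1 for keyword in expected_keywords if keyword.lower() in haystack)
    return hits / len(expected_keywords)


retrieved = ["APT28 uses Cobalt Strike for lateral movement via CVE-2024-1111"]
score = keyword_presence(retrieved, ["APT28", "Cobalt Strike", "DROPBEAR"])
print(f"{score:.1%}")  # 66.7%
```

A substring check like this is cheap but coarse: it rewards mentioning a term, not answering the question correctly.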

Contributing

See CONTRIBUTING.md for development setup, code style, and the Community/Enterprise boundary.

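License

ZettelForge Community is released under the MIT license; ThreatRecall Enterprise is licensed under BSL-1.1.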
