Living Collaborative Memory for Forge — hybrid vector + graph + symbolic knowledge base with versioning and provenance.
forge-memory
Living Collaborative Memory for Forge: a hybrid memory layer that combines semantic retrieval, graph traversal, and deterministic rules in one package.
forge-memory is the package that gives Forge cross-run memory. It stores MemoryEntry objects, versions every write, preserves provenance, and retrieves relevant context through a three-layer strategy:
- Vector search with ChromaDB
- Graph retrieval with NetworkX plus SQLite WAL persistence
- Symbolic filtering and reranking with a rule engine
The result is a memory system that can ingest agent output, survive restarts, handle concurrent writers, and inject relevant context back into future runs.
What ships today
- HybridMemory as the main async API for store, query, ingest, delete, health checks, and stats
- IngestionPipeline for chunking text and extracting entities
- MemoryContextInjector for enriching TaskEnvelope instances before a run
- MemoryVersionStore with append-only version history and lineage queries
- Vector-clock-based conflict detection for concurrent writes
- Default backends:
  - ChromaDBBackend for semantic retrieval
  - NetworkXBackend for entity and relation traversal with SQLite-backed persistence
  - RuleEngine for deterministic post-retrieval control
Installation
Within the Forge workspace:
uv sync
As a standalone package:
pip install forge-memory
Optional extras declared by the package:
pip install 'forge-memory[neo4j]'
pip install 'forge-memory[qdrant]'
Requirements:
- Python 3.11+
- forge-core==0.1.0
Quickstart
import asyncio

from forge_core.types import MemoryQuery
from forge_memory import HybridMemory


async def main() -> None:
    memory = HybridMemory()

    # Store a piece of agent output together with its provenance and tags.
    await memory.ingest_text(
        content=(
            "The research agent found that retrieval caching reduced latency "
            "and that verified prompts improved answer quality."
        ),
        source_run_id="run-123",
        source_agent_id="researcher",
        tags={"topic": "retrieval", "verified": "true"},
        relations=[("retrieval caching", "improves", "latency")],
    )

    # Retrieve the five most relevant entries for a natural-language question.
    results = await memory.query(
        MemoryQuery(
            text="what improved retrieval latency?",
            top_k=5,
        )
    )

    # Each result pairs a MemoryEntry with its relevance score.
    for entry, score in results:
        print(f"{score:.2f} {entry.id} {entry.content}")


asyncio.run(main())
Retrieval model
HybridMemory.query() runs three stages:
- Query the vector backend and graph backend concurrently
- Fuse their ranked results with Reciprocal Rank Fusion
- Apply symbolic rules for vetoes, boosts, demotions, and tag enrichment
This makes the package useful in cases where semantic similarity alone is not enough. Graph hops can surface structurally related entries, and rules can enforce deterministic business constraints after retrieval.
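To make the fusion and rule stages concrete, here is a minimal, self-contained sketch. It is not the package's implementation: the smoothing constant `k=60` is only the value commonly used with Reciprocal Rank Fusion, and `apply_rules`, the `deprecated` tag, and the boost factor are hypothetical names chosen for illustration.

```python
from collections import defaultdict

RRF_K = 60  # assumed smoothing constant; the package's actual value is not stated


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = RRF_K
) -> list[tuple[str, float]]:
    """Fuse ranked ID lists: each list contributes 1 / (k + rank) per ID."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, entry_id in enumerate(ranking, start=1):
            scores[entry_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so the order is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def apply_rules(
    fused: list[tuple[str, float]],
    tags_by_id: dict[str, dict[str, str]],
    boost: float = 1.5,
) -> list[tuple[str, float]]:
    """Hypothetical rule pass: veto deprecated entries, boost verified ones."""
    kept: list[tuple[str, float]] = []
    for entry_id, score in fused:
        tags = tags_by_id.get(entry_id, {})
        if tags.get("deprecated") == "true":
            continue  # veto: the entry never reaches the caller
        if tags.get("verified") == "true":
            score *= boost  # boost: verified entries move up
        kept.append((entry_id, score))
    return sorted(kept, key=lambda item: (-item[1], item[0]))


vector_hits = ["a", "b", "c"]  # ranked IDs from the vector backend
graph_hits = ["b", "d", "a"]   # ranked IDs from the graph backend
fused = reciprocal_rank_fusion([vector_hits, graph_hits])
reranked = apply_rules(
    fused, {"c": {"deprecated": "true"}, "d": {"verified": "true"}}
)
```

Because RRF only looks at ranks, the two backends do not need comparable score scales, which is one reason it suits a vector-plus-graph setup.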
Persistence and durability
The default graph backend uses SQLite in WAL mode and keeps an in-memory networkx.DiGraph synchronized to disk. The current implementation is designed to preserve state across process restarts and to avoid the fragility of the older pickle-only approach.
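As a rough illustration of the WAL-backed persistence idea, the sketch below uses only the standard-library `sqlite3` module. The `edges` table layout and the `open_graph_db` helper are assumptions made for this example and are not taken from the package; the real backend additionally mirrors the stored rows into a networkx.DiGraph.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_graph_db(path: Path) -> sqlite3.Connection:
    """Open (or create) a graph database with write-ahead logging enabled."""
    conn = sqlite3.connect(path)
    # WAL mode is stored in the database file, so it persists across reopens.
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS edges ("
        "src TEXT NOT NULL, relation TEXT NOT NULL, dst TEXT NOT NULL, "
        "PRIMARY KEY (src, relation, dst))"
    )
    return conn


with tempfile.TemporaryDirectory() as tmp:
    db_path = Path(tmp) / "graph.db"

    # First "process": write one relation and close the connection.
    conn = open_graph_db(db_path)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO edges VALUES (?, ?, ?)",
            ("retrieval caching", "improves", "latency"),
        )
    conn.close()

    # Second "process": reopen the file and read the state back.
    conn = open_graph_db(db_path)
    journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
    restored_edges = conn.execute("SELECT src, relation, dst FROM edges").fetchall()
    conn.close()
```

On startup, a backend built this way could replay the restored rows into its in-memory graph before serving queries.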
By default, package data is persisted under the user's home directory:
- Vector store: ~/.forge/memory/chroma
- Graph store: ~/.forge/memory/graph.db
- Version log: ~/.forge/memory/versions.jsonl
You can override persistence paths by constructing the backends directly and passing them into HybridMemory.
Versioning and concurrent writers
Every stored MemoryEntry is versioned and recorded in an append-only log with provenance fields such as:
- version
- supersedes
- source_run_id
- source_agent_id
- evidence
For multi-writer scenarios, MemoryVersionStore.submit_write() uses vector clocks to classify incoming writes as:
- before
- after
- equal
- concurrent
Concurrent writes are resolved through a pluggable ConflictResolver. The default resolver is last-writer-wins with a deterministic writer-id tiebreak.
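The classification and the default resolution strategy can be sketched as follows. This is an illustration under stated assumptions, not the package's code: `compare_clocks`, `Write`, and `last_writer_wins` are hypothetical names, clocks are modelled as plain writer-to-counter mappings, and the tiebreak direction (the lexicographically larger writer id wins) is a guess.

```python
from dataclasses import dataclass
from typing import Mapping


def compare_clocks(a: Mapping[str, int], b: Mapping[str, int]) -> str:
    """Classify clock `a` relative to clock `b`; missing writers count as 0."""
    writers = set(a) | set(b)
    a_le_b = all(a.get(w, 0) <= b.get(w, 0) for w in writers)
    b_le_a = all(b.get(w, 0) <= a.get(w, 0) for w in writers)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"  # a happened before b
    if b_le_a:
        return "after"  # a happened after b
    return "concurrent"  # neither clock dominates the other


@dataclass(frozen=True)
class Write:
    writer_id: str
    timestamp: float
    content: str


def last_writer_wins(left: Write, right: Write) -> Write:
    """Pick the later write; equal timestamps fall back to the writer id."""
    return max(left, right, key=lambda w: (w.timestamp, w.writer_id))


relation = compare_clocks({"agent-a": 2}, {"agent-b": 1})
winner = last_writer_wins(
    Write("agent-a", 10.0, "draft from a"),
    Write("agent-b", 10.0, "draft from b"),
)
```

Only writes classified as concurrent need a resolver; the other three outcomes already have an unambiguous order.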
Text ingestion
IngestionPipeline converts raw text into MemoryEntry objects by:
- Splitting long text into overlapping chunks
- Extracting entities
- Copying metadata such as tags, evidence, relations, and source identifiers
For short content, ingestion yields a single entry. For longer content, it emits multiple chunks up to a configurable maximum.
Example:
from forge_memory.ingestion import IngestionPipeline

pipeline = IngestionPipeline(chunk_size=512, chunk_overlap=64)

# ingest() is awaited, so run this inside an async function.
entries = await pipeline.ingest(
    content="Long-form report text goes here...",
    source_run_id="run-456",
    source_agent_id="writer",
    tags={"domain": "ops"},
)
Context injection
MemoryContextInjector is the read path that closes the loop between stored memory and future runs. It queries memory, formats relevant hits, and enriches a TaskEnvelope with both:
- Prompt-friendly text in memory_context
- Structured hits in context["forge_memory"]
Example:
import os

from forge_core.types import MemoryConfig, TaskEnvelope
from forge_memory import HybridMemory, MemoryContextInjector

# Injection is gated by this feature flag.
os.environ["FORGE_ENABLE_MEMORY_INJECTION"] = "1"

memory = HybridMemory()
injector = MemoryContextInjector(
    memory=memory,
    config=MemoryConfig(enable_context_injection=True, top_k=3, min_relevance=0.3),
)

envelope = TaskEnvelope(input={"query": "Summarize the latest retrieval findings"})
# inject() is awaited, so run this inside an async function.
enriched = await injector.inject(envelope)
Design contract:
- Injection is best-effort
- Memory failures should not abort the run
- The original envelope passes through unchanged when injection is disabled or yields no relevant hits
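The contract above can be illustrated with a small stand-alone sketch. Everything here is hypothetical scaffolding: `Envelope` is a stand-in for TaskEnvelope, and `inject_best_effort` and its `search` callback are invented for the example rather than taken from MemoryContextInjector.

```python
import asyncio
from dataclasses import dataclass, field, replace
from typing import Awaitable, Callable

Hit = tuple[str, float]  # (content, relevance score)


@dataclass(frozen=True)
class Envelope:  # hypothetical stand-in for TaskEnvelope
    input: dict
    memory_context: str = ""
    context: dict = field(default_factory=dict)


async def inject_best_effort(
    envelope: Envelope,
    search: Callable[[str], Awaitable[list[Hit]]],
    *,
    enabled: bool,
    top_k: int = 3,
    min_relevance: float = 0.3,
) -> Envelope:
    """Enrich the envelope if possible; otherwise return it untouched."""
    if not enabled:
        return envelope
    try:
        hits = await search(str(envelope.input.get("query", "")))
    except Exception:
        return envelope  # memory failures must not abort the run
    relevant = sorted(
        (hit for hit in hits if hit[1] >= min_relevance),
        key=lambda hit: -hit[1],
    )[:top_k]
    if not relevant:
        return envelope
    text = "\n".join(f"- {content}" for content, _ in relevant)
    return replace(
        envelope,
        memory_context=text,
        context={**envelope.context, "forge_memory": relevant},
    )


async def failing_search(query: str) -> list[Hit]:
    raise ConnectionError("backend down")


async def working_search(query: str) -> list[Hit]:
    return [("caching reduced latency", 0.8), ("unrelated note", 0.1)]


original = Envelope(input={"query": "retrieval findings"})
after_failure = asyncio.run(inject_best_effort(original, failing_search, enabled=True))
after_disabled = asyncio.run(inject_best_effort(original, working_search, enabled=False))
after_success = asyncio.run(inject_best_effort(original, working_search, enabled=True))
```

Returning the same envelope object on every failure path keeps the caller's code free of memory-specific error handling.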
Feature flags
Two optional behaviors are gated by environment flags:
- FORGE_ENABLE_MEMORY_INJECTION=1
  - Enables MemoryContextInjector to enrich envelopes before a run
- FORGE_ENABLE_LLM_ENTITY_EXTRACTION=1
  - Enables LLM-based entity extraction when an llm_client is supplied to IngestionPipeline
Without those flags, the package stays on the default deterministic paths:
- No context injection
- Regex-based entity extraction
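The gating logic might look roughly like the sketch below. It is an assumption-laden illustration: `flag_enabled` and `extract_entities` are hypothetical helpers, the capitalized-word regex is only a stand-in for whatever pattern the package uses, and treating exactly "1" as enabled is inferred from the flag values shown above.

```python
import os
import re
from typing import Callable, Mapping, Optional

LLM_FLAG = "FORGE_ENABLE_LLM_ENTITY_EXTRACTION"

# Stand-in pattern: runs of capitalized words. The package's real regex may differ.
_ENTITY_RE = re.compile(r"\b[A-Z][a-zA-Z0-9]+(?:\s+[A-Z][a-zA-Z0-9]+)*\b")


def flag_enabled(name: str, environ: Optional[Mapping[str, str]] = None) -> bool:
    """Treat a flag as on only when it is set to exactly "1" (an assumption)."""
    env = os.environ if environ is None else environ
    return env.get(name, "") == "1"


def extract_entities(
    text: str,
    llm_client: Optional[Callable[[str], list[str]]] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> list[str]:
    """Use the LLM path only when a client is supplied and the flag is set."""
    if llm_client is not None and flag_enabled(LLM_FLAG, environ):
        return llm_client(text)
    # Default deterministic path: regex extraction, sorted for stable output.
    return sorted(set(_ENTITY_RE.findall(text)))


def fake_llm(text: str) -> list[str]:
    return ["llm-entity"]  # placeholder for a real client call


sentence = "Forge stores data in ChromaDB and SQLite."
regex_entities = extract_entities(sentence, environ={})
flag_off = extract_entities(sentence, llm_client=fake_llm, environ={})
flag_on = extract_entities(sentence, llm_client=fake_llm, environ={LLM_FLAG: "1"})
```

Requiring both the client and the flag keeps the default path deterministic even when a client happens to be configured.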
Public API
Top-level exports:
- HybridMemory
- MemoryContextInjector
- MemoryVersionStore
- VectorClock
- ConflictResolver
- ConflictContext
- LastWriterWinsResolver
Primary modules:
- forge_memory.hybrid
- forge_memory.ingestion
- forge_memory.injection
- forge_memory.versioning
- forge_memory.vector.chromadb_backend
- forge_memory.graph.networkx_backend
- forge_memory.symbolic.rules
Notes for backend swaps
HybridMemory is built around structural typing rather than a hard inheritance tree. If a backend implements the expected async methods such as store(), query(), delete(), and health_check(), it can be plugged in directly.
That makes it straightforward to replace the defaults with external systems such as Qdrant or Neo4j while keeping the same high-level memory API.
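Structural typing of this kind is usually expressed in Python with `typing.Protocol`. The sketch below shows the idea with a toy backend; the `MemoryBackend` protocol, its method signatures, and `InMemoryBackend` are assumptions for illustration, since the exact signatures HybridMemory expects are not listed here.

```python
import asyncio
from typing import Protocol, runtime_checkable


@runtime_checkable
class MemoryBackend(Protocol):
    """Hypothetical shape of a pluggable backend (signatures are assumed)."""

    async def store(self, entry_id: str, content: str) -> None: ...
    async def query(self, text: str, top_k: int) -> list[tuple[str, float]]: ...
    async def delete(self, entry_id: str) -> bool: ...
    async def health_check(self) -> bool: ...


class InMemoryBackend:
    """Toy backend: no inheritance from MemoryBackend, only matching methods."""

    def __init__(self) -> None:
        self._entries: dict[str, str] = {}

    async def store(self, entry_id: str, content: str) -> None:
        self._entries[entry_id] = content

    async def query(self, text: str, top_k: int) -> list[tuple[str, float]]:
        words = set(text.lower().split())
        scored = []
        for entry_id, content in self._entries.items():
            # Score = fraction of query words that appear in the entry.
            overlap = len(words & set(content.lower().split()))
            if overlap:
                scored.append((entry_id, overlap / max(len(words), 1)))
        return sorted(scored, key=lambda hit: (-hit[1], hit[0]))[:top_k]

    async def delete(self, entry_id: str) -> bool:
        return self._entries.pop(entry_id, None) is not None

    async def health_check(self) -> bool:
        return True


async def demo() -> list[tuple[str, float]]:
    backend = InMemoryBackend()
    await backend.store("m1", "retrieval caching reduced latency")
    await backend.store("m2", "verified prompts improved answer quality")
    return await backend.query("what reduced latency", top_k=1)


top_hit = asyncio.run(demo())
# runtime_checkable only verifies that the methods exist, not their signatures.
is_backend = isinstance(InMemoryBackend(), MemoryBackend)
```

A Qdrant or Neo4j adapter would follow the same pattern: implement the async methods and pass the instance in, with no base class to import.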
Testing
From the repository root:
pytest packages/forge-memory/tests
The test suite covers:
- reciprocal-rank fusion
- symbolic rules
- ingestion behavior
- version history and lineage
- vector-clock conflict handling
- graph persistence and legacy migration
- circuit-breaker behavior during backend failure
- context injection contracts
License
Apache-2.0