Skip to main content

Build system for agent memory

Project description

Synix logo

 ███████╗██╗   ██╗███╗   ██╗██╗██╗  ██╗
 ██╔════╝╚██╗ ██╔╝████╗  ██║██║╚██╗██╔╝
 ███████╗ ╚████╔╝ ██╔██╗ ██║██║ ╚███╔╝
 ╚════██║  ╚██╔╝  ██║╚██╗██║██║ ██╔██╗
 ███████║   ██║   ██║ ╚████║██║██╔╝ ██╗
 ╚══════╝   ╚═╝   ╚═╝  ╚═══╝╚═╝╚═╝  ╚═╝

A build system for agent memory.

The Problem

Agent memory hasn't converged. Mem0, Letta, Zep, LangMem — each bakes in a different architecture because the right one depends on your domain and changes as your agent evolves. Most systems force you to commit to a schema early. Changing your approach means migrations or starting over.

What Synix Does

Conversations are sources. Prompts are build rules. Summaries and world models are artifacts. Declare your memory architecture in Python, build it, then change it — only affected layers rebuild. Trace any artifact back through the dependency graph to its source conversation.

uvx synix build pipeline.py
uvx synix search "return policy"
uvx synix validate                # experimental

Quick Start

uvx synix init scaffolds a working project with source files, a multi-layer pipeline, and a validator.

uvx synix init my-project
cd my-project

Build the pipeline (requires an LLM API key — see pipeline.py for config):

uvx synix build

Browse what was built:

uvx synix list                    # all artifacts, grouped by layer
uvx synix show final-report       # render an artifact as markdown
uvx synix show final-report --raw # full JSON with metadata and artifact IDs

Search and validate:

uvx synix search "hiking"         # full-text search across all indexed layers
uvx synix validate                # run declared validators (experimental)

Using Your Build Output

After a build, Synix gives you two things: a search index and flat artifact files.

Search via CLI:

uvx synix search "return policy"
uvx synix search "warranty terms" --top-k 5 --trace

Use the artifacts directly:

Build output lives in ./build/ — JSON files per artifact, a manifest.json index, and a SQLite FTS5 database. Read them, copy them, or point any tool that speaks SQLite at search.db.

ls build/layer2-cs_product_brief/
sqlite3 build/search.db "SELECT label, layer_name FROM search_index LIMIT 5"

Entity Model

Synix has two kinds of identity. Understanding the difference explains how caching, provenance, and search work.

Label — a human-readable semantic name like ep-conv-123 or monthly-2024-03. Labels are stable across rebuilds; they identify what an artifact represents. You use labels in synix show, synix lineage, and search results.

Artifact ID — a SHA256 content hash like sha256:a1b2c3.... Artifact IDs change whenever the content changes. They are the true identity for caching and provenance — if the hash matches, the artifact hasn't changed.

Artifact
├── label           "ep-conv-123"          # semantic name (stable)
├── artifact_id     "sha256:a1b2c3..."     # content hash (changes on rebuild)
├── artifact_type   "episode"              # transcript, episode, rollup, core_memory
├── content         "Summary of..."        # the actual text
├── input_ids       ["sha256:x...", ...]   # artifact IDs of inputs that produced this
├── prompt_id       "episode_summary_v3"   # prompt template version (LLM-derived only)
└── metadata        {date, title, ...}     # flexible key-value metadata

Provenance traces every artifact back to its inputs. Each provenance record stores the artifact ID (hash), the labels of its parent artifacts, the prompt used, and the model config. This is how synix lineage reconstructs the full dependency chain from a core memory document back to the original conversations.

Layers are typed Python objects in a DAG — Source("transcripts") → EpisodeSummary("episodes") → MonthlyRollup("monthly") → CoreSynthesis("core"). Dependencies are expressed as object references via depends_on. The pipeline is the full declared architecture: layers, projections, validators.

Defining a Pipeline

A pipeline is a Python file that declares your memory architecture. Layers are real Python objects — Source for inputs, transform classes for LLM steps, SearchIndex and FlatFile for outputs.

# pipeline.py
from synix import Pipeline, Source, SearchIndex, FlatFile
from synix.transforms import EpisodeSummary, MonthlyRollup, CoreSynthesis

pipeline = Pipeline("my-memory")
pipeline.source_dir = "./sources"
pipeline.build_dir = "./build"
pipeline.llm_config = {
    "model": "claude-sonnet-4-20250514",
    "temperature": 0.3,
    "max_tokens": 1024,
}

# Layer 0: auto-detect and parse source files
transcripts = Source("transcripts")

# Layer 1: one summary per conversation
episodes = EpisodeSummary("episodes", depends_on=[transcripts])

# Layer 2: group episodes by month
monthly = MonthlyRollup("monthly", depends_on=[episodes])

# Layer 3: synthesize everything into core memory
core = CoreSynthesis("core", depends_on=[monthly], context_budget=10000)

pipeline.add(transcripts, episodes, monthly, core)

# Projections — how artifacts become usable
pipeline.add(
    SearchIndex(
        "memory-index",
        sources=[episodes, monthly, core],
        search=["fulltext", "semantic"],
        embedding_config={
            "provider": "fastembed",
            "model": "BAAI/bge-small-en-v1.5",
        },
    )
)
pipeline.add(
    FlatFile("context-doc", sources=[core], output_path="./build/context.md")
)

# Optional: validators and fixers (experimental — APIs may change)
from synix.validators import PII, SemanticConflict
from synix.fixers import SemanticEnrichment

pipeline.add_validator(PII(severity="warning"))
pipeline.add_validator(SemanticConflict())
pipeline.add_fixer(SemanticEnrichment())

Because pipelines are Python, you can generate layers dynamically:

from synix.transforms import TopicalRollup

for topic in ["career", "projects", "health"]:
    pipeline.add(TopicalRollup(
        f"topic-{topic}", depends_on=[episodes],
        config={"topics": [topic]},
    ))

Built-in Components

Sources

Drop files into source_dir — the parse transform auto-detects format by file structure.

Format Extensions Notes
ChatGPT .json conversations.json exports. Handles regeneration branches via current_node.
Claude .json Claude conversation exports with chat_messages arrays.
Text / Markdown .txt, .md YAML frontmatter support. Auto-detects conversation turns (User: / Assistant: prefixes).

Transforms

Import from synix.transforms:

Class What it does
EpisodeSummary 1 transcript → 1 episode summary via LLM.
MonthlyRollup Groups episodes by calendar month, synthesizes each via LLM.
TopicalRollup Groups episodes by user-declared topics. Requires config={"topics": [...]}.
CoreSynthesis All rollups → single core memory document. Respects context_budget.
Merge Groups artifacts by content similarity (Jaccard), merges above threshold.

Projections

Import from synix:

Class Output Purpose
SearchIndex build/search.db SQLite FTS5 index across selected layers. Optional embedding support for semantic/hybrid search.
FlatFile build/context.md Renders artifacts as markdown. Ready to paste into an LLM system prompt.

Validators (Experimental)

Note: The validate/fix workflow is experimental. APIs and output formats may change in future releases.

Import from synix.validators:

Class What it checks
MutualExclusion Merged artifacts don't mix values of a metadata field (e.g., customer_id).
RequiredField Artifacts in specified layers have a required metadata field.
PII Detects credit cards, SSNs, emails, phone numbers in content.
SemanticConflict LLM-based detection of contradictions across synthesized artifacts.
Citation Verifies artifacts cite their source artifacts with valid URIs.

Fixers (Experimental)

Import from synix.fixers:

Class What it fixes
SemanticEnrichment Resolves semantic conflicts by rewriting with source episode context. Interactive approval.
CitationEnrichment Adds missing citation references to artifacts.

CLI Reference

Command What it does
uvx synix init <name> Scaffold a new project with sources, pipeline, and README.
uvx synix build Run the pipeline. Only rebuilds what changed.
uvx synix plan Dry-run — show what would build without running transforms.
uvx synix plan --explain-cache Plan with inline cache decision reasons per artifact.
uvx synix list [layer] List all artifacts with short artifact IDs, optionally filtered by layer.
uvx synix show <id> Display an artifact's content. Resolves by label or artifact ID prefix. --raw for JSON.
uvx synix search <query> Full-text search across indexed layers. --mode hybrid for semantic.
uvx synix validate (Experimental) Run declared validators against build artifacts.
uvx synix fix (Experimental) LLM-assisted repair of validation violations.
uvx synix verify Check build integrity (hashes, provenance).
uvx synix lineage <id> Show the full provenance chain for an artifact.
uvx synix clean Delete the build directory.

Commands that take a pipeline path (build, plan, validate, fix, clean) default to ./pipeline.py in the current directory.

Key Capabilities

Incremental rebuilds — Change a prompt or add new conversations. Only downstream artifacts reprocess.

Fingerprint-based caching — Every artifact stores a build fingerprint capturing inputs, prompt, model config, transform config, and transform source code. Change any component and only affected artifacts rebuild. See docs/cache-semantics.md for the full rebuild trigger matrix.

Cache explainabilityuvx synix plan --explain-cache shows inline reasons for every cache hit or miss directly in the plan tree, so you can see exactly which fingerprint component caused a rebuild.

Altitude-aware search — Query episode summaries, monthly rollups, or core memory. Drill into provenance from any result.

Full provenance — Every artifact chains back to the source conversations that produced it, through every transform in between.

Git-like artifact resolutionuvx synix show resolves artifacts by unique prefix of label or artifact ID, just like git show resolves commits.

Validation and repair — Detect semantic contradictions and PII leaks across artifacts, then fix them with LLM-assisted rewrites.

Architecture evolution — Swap monthly rollups for topic-based clustering. Transcripts and episodes stay cached. No migration scripts.

Where Synix Fits

Mem0 Letta Zep LangMem Synix
Approach API-first memory store Agent-managed memory Temporal knowledge graph Taxonomy-driven memory Build system with pipelines
Incremental rebuilds Yes
Provenance tracking Full chain to source
Architecture changes Migration Migration Migration Migration Rebuild
Schema Fixed Fixed Fixed Fixed You define it

Synix is not a memory store. It's the build system that produces one.

Known Limitations

These are the highest-priority open issues. See the issue tracker for the full backlog.

Issue Priority Description
#53 P0 Parser metadata passthrough — YAML frontmatter fields in source files are not propagated to artifact metadata. Custom fields like author or date are silently dropped.
#52 P0 Validate/verify and trace artifacts — Trace artifacts from provenance tracking can trigger false positives in validators that expect only content artifacts.
#57 P1 Rich search output — Search results show artifact labels but not inline content snippets or provenance context. Requires multiple commands to get the full picture.
#56 P1 Provenance summarization — Lineage output is raw dependency chains. No summarized view or filtering for large graphs.
#55 P1 Pipeline-relative imports — Custom transforms using relative imports fail when the pipeline file is outside the project root.
#54 P1 Non-interactive automation mode — No --quiet / --json output mode for CI or scripted usage. Rich formatting assumes a TTY.
#33 Embedding failures are silent — If embedding generation fails, search indexing silently falls back to keyword-only instead of erroring.

Removed source files are not cleaned up — Deleting a source file does not remove its downstream artifacts. Run uvx synix clean and rebuild to purge orphans.

Links

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

synix-0.11.0.tar.gz (3.7 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

synix-0.11.0-py3-none-any.whl (222.9 kB view details)

Uploaded Python 3

File details

Details for the file synix-0.11.0.tar.gz.

File metadata

  • Download URL: synix-0.11.0.tar.gz
  • Upload date:
  • Size: 3.7 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for synix-0.11.0.tar.gz
Algorithm Hash digest
SHA256 58117b13f2a378fb13c63a05f33bf6fbcc799d3be7180bcc7dbafee3143285eb
MD5 0c944e4f554199ee4f216dc008938696
BLAKE2b-256 8c2faa33798edeb1061fe6fa17412e9a3852c7cd5cc093cfad269a82f2348af0

See more details on using hashes here.

Provenance

The following attestation bundles were made for synix-0.11.0.tar.gz:

Publisher: release.yml on marklubin/synix

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file synix-0.11.0-py3-none-any.whl.

File metadata

  • Download URL: synix-0.11.0-py3-none-any.whl
  • Upload date:
  • Size: 222.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for synix-0.11.0-py3-none-any.whl
Algorithm Hash digest
SHA256 8fc75214b9e0e4fb51503f57fdf0ae55fb9c9dfd82484636488547e7d1af5dae
MD5 2efde2fccadd18bc5fb50709cfba1c94
BLAKE2b-256 00f536f22e121be546f4996cc9f60c303ee04967f14ea750e913c29ff82a7321

See more details on using hashes here.

Provenance

The following attestation bundles were made for synix-0.11.0-py3-none-any.whl:

Publisher: release.yml on marklubin/synix

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page