Skip to main content

Build system for agent memory

Project description

Synix logo

 ███████╗██╗   ██╗███╗   ██╗██╗██╗  ██╗
 ██╔════╝╚██╗ ██╔╝████╗  ██║██║╚██╗██╔╝
 ███████╗ ╚████╔╝ ██╔██╗ ██║██║ ╚███╔╝
 ╚════██║  ╚██╔╝  ██║╚██╗██║██║ ██╔██╗
 ███████║   ██║   ██║ ╚████║██║██╔╝ ██╗
 ╚══════╝   ╚═╝   ╚═╝  ╚═══╝╚═╝╚═╝  ╚═╝

A build system for agent memory.

The Problem

Agent memory hasn't converged. Mem0, Letta, Zep, LangMem — each bakes in a different architecture because the right one depends on your domain and changes as your agent evolves. Most systems force you to commit to a schema early. Changing your approach means migrations or starting over.

What Synix Does

Conversations are sources. Prompts are build rules. Summaries and world models are artifacts. Declare your memory architecture in Python, build it, then change it — only affected layers rebuild. Trace any artifact back through the dependency graph to its source conversation.

uvx synix build pipeline.py
uvx synix validate
uvx synix search "return policy"

Quick Start

uvx synix init scaffolds a working project with source files, a multi-layer pipeline, and a validator.

uvx synix init my-project
cd my-project

Build the pipeline (requires an LLM API key — see pipeline.py for config):

uvx synix build

Browse what was built:

uvx synix list                    # all artifacts, grouped by layer
uvx synix show final-report       # render an artifact as markdown
uvx synix show final-report --raw # full JSON with metadata and artifact IDs

Validate and search:

uvx synix validate                # run declared validators
uvx synix search "hiking"         # full-text search across all indexed layers

Using Your Build Output

After a build, Synix gives you two things: a search index and flat artifact files.

Search via CLI:

uvx synix search "return policy"
uvx synix search "warranty terms" --top-k 5 --trace

Use the artifacts directly:

Build output lives in ./build/ — JSON files per artifact, a manifest.json index, and a SQLite FTS5 database. Read them, copy them, or point any tool that speaks SQLite at search.db.

ls build/layer2-cs_product_brief/
sqlite3 build/search.db "SELECT label, layer_name FROM search_index LIMIT 5"

Entity Model

Synix has two kinds of identity. Understanding the difference explains how caching, provenance, and search work.

Label — a human-readable semantic name like ep-conv-123 or monthly-2024-03. Labels are stable across rebuilds; they identify what an artifact represents. You use labels in synix show, synix lineage, and search results.

Artifact ID — a SHA256 content hash like sha256:a1b2c3.... Artifact IDs change whenever the content changes. They are the true identity for caching and provenance — if the hash matches, the artifact hasn't changed.

Artifact
├── label           "ep-conv-123"          # semantic name (stable)
├── artifact_id     "sha256:a1b2c3..."     # content hash (changes on rebuild)
├── artifact_type   "episode"              # transcript, episode, rollup, core_memory
├── content         "Summary of..."        # the actual text
├── input_ids       ["sha256:x...", ...]   # artifact IDs of inputs that produced this
├── prompt_id       "episode_summary_v3"   # prompt template version (LLM-derived only)
└── metadata        {date, title, ...}     # flexible key-value metadata

Provenance traces every artifact back to its inputs. Each provenance record stores the artifact ID (hash), the labels of its parent artifacts, the prompt used, and the model config. This is how synix lineage reconstructs the full dependency chain from a core memory document back to the original conversations.

Layers are named levels in the build DAG — transcripts → episodes → rollups → core. Each layer declares a transform and a grouping strategy. The pipeline is the full declared architecture: layers, projections, validators.

Defining a Pipeline

A pipeline is a Python file that declares your memory architecture: sources, transforms, projections, and validators.

# pipeline.py
from synix import Pipeline, Layer, Projection, ValidatorDecl, FixerDecl

pipeline = Pipeline("my-memory")
pipeline.source_dir = "./exports"
pipeline.build_dir = "./build"
pipeline.llm_config = {
    "model": "claude-sonnet-4-20250514",
    "temperature": 0.3,
    "max_tokens": 1024,
}

# Layer 0: auto-detect and parse source files
pipeline.add_layer(Layer(name="transcripts", level=0, transform="parse"))

# Layer 1: one summary per conversation
pipeline.add_layer(Layer(
    name="episodes", level=1, depends_on=["transcripts"],
    transform="episode_summary", grouping="by_conversation",
))

# Layer 2: group episodes by month
pipeline.add_layer(Layer(
    name="monthly", level=2, depends_on=["episodes"],
    transform="monthly_rollup", grouping="by_month",
))

# Layer 3: synthesize everything into core memory
pipeline.add_layer(Layer(
    name="core", level=3, depends_on=["monthly"],
    transform="core_synthesis", grouping="single",
    context_budget=10000,
))

# Projections — how artifacts become usable
pipeline.add_projection(Projection(
    name="memory-index", projection_type="search_index",
    sources=[
        {"layer": "episodes", "search": ["fulltext"]},
        {"layer": "monthly", "search": ["fulltext"]},
        {"layer": "core", "search": ["fulltext"]},
    ],
))
pipeline.add_projection(Projection(
    name="context-doc", projection_type="flat_file",
    sources=[{"layer": "core"}],
    config={"output_path": "./build/context.md"},
))

# Optional: validators and fixers
pipeline.add_validator(ValidatorDecl(name="pii", config={"severity": "warning"}))
pipeline.add_validator(ValidatorDecl(name="semantic_conflict", config={
    "llm_config": pipeline.llm_config,
}))
pipeline.add_fixer(FixerDecl(name="semantic_enrichment"))

Because pipelines are Python, you can generate layers dynamically:

for topic in ["career", "projects", "health"]:
    pipeline.add_layer(Layer(
        name=f"topic-{topic}", level=2, depends_on=["episodes"],
        transform="topical_rollup", grouping="by_topic",
        config={"topics": [topic]},
    ))

Built-in Components

Sources

Drop files into source_dir — the parse transform auto-detects format by file structure.

Format Extensions Notes
ChatGPT .json conversations.json exports. Handles regeneration branches via current_node.
Claude .json Claude conversation exports with chat_messages arrays.
Text / Markdown .txt, .md YAML frontmatter support. Auto-detects conversation turns (User: / Assistant: prefixes).

Transforms

Name Grouping What it does
parse Auto-discovers and parses all source files into transcript artifacts.
episode_summary by_conversation 1 transcript → 1 episode summary via LLM.
monthly_rollup by_month Groups episodes by calendar month, synthesizes each via LLM.
topical_rollup by_topic Groups episodes by user-declared topics. Requires config={"topics": [...]}.
core_synthesis single All rollups → single core memory document. Respects context_budget.
merge Groups artifacts by content similarity (Jaccard), merges above threshold.

Projections

Type Output Purpose
search_index build/search.db SQLite FTS5 index across selected layers. Optional embedding support for semantic/hybrid search.
flat_file build/context.md Renders artifacts as markdown. Ready to paste into an LLM system prompt.

Validators

Name What it checks
mutual_exclusion Merged artifacts don't mix values of a metadata field (e.g., customer_id).
required_field Artifacts in specified layers have a required metadata field.
pii Detects credit cards, SSNs, emails, phone numbers in content.
semantic_conflict LLM-based detection of contradictions across synthesized artifacts.

Fixers

Name What it fixes
semantic_enrichment Resolves semantic conflicts by rewriting with source episode context. Interactive approval.

CLI Reference

Command What it does
uvx synix init <name> Scaffold a new project with sources, pipeline, and README.
uvx synix build Run the pipeline. Only rebuilds what changed.
uvx synix plan Dry-run — show what would build without running transforms.
uvx synix plan --explain-cache Plan with inline cache decision reasons per artifact.
uvx synix list [layer] List all artifacts with short artifact IDs, optionally filtered by layer.
uvx synix show <id> Display an artifact's content. Resolves by label or artifact ID prefix. --raw for JSON.
uvx synix search <query> Full-text search across indexed layers. --mode hybrid for semantic.
uvx synix validate Run declared validators against build artifacts.
uvx synix fix LLM-assisted repair of validation violations.
uvx synix verify Check build integrity (hashes, provenance).
uvx synix lineage <id> Show the full provenance chain for an artifact.
uvx synix clean Delete the build directory.

Commands that take a pipeline path (build, plan, validate, fix, clean) default to ./pipeline.py in the current directory.

Key Capabilities

Incremental rebuilds — Change a prompt or add new conversations. Only downstream artifacts reprocess.

Fingerprint-based caching — Every artifact stores a build fingerprint capturing inputs, prompt, model config, transform config, and transform source code. Change any component and only affected artifacts rebuild. See docs/cache-semantics.md for the full rebuild trigger matrix.

Cache explainabilityuvx synix plan --explain-cache shows inline reasons for every cache hit or miss directly in the plan tree, so you can see exactly which fingerprint component caused a rebuild.

Altitude-aware search — Query episode summaries, monthly rollups, or core memory. Drill into provenance from any result.

Full provenance — Every artifact chains back to the source conversations that produced it, through every transform in between.

Git-like artifact resolutionuvx synix show resolves artifacts by unique prefix of label or artifact ID, just like git show resolves commits.

Validation and repair — Detect semantic contradictions and PII leaks across artifacts, then fix them with LLM-assisted rewrites.

Architecture evolution — Swap monthly rollups for topic-based clustering. Transcripts and episodes stay cached. No migration scripts.

Where Synix Fits

Mem0 Letta Zep LangMem Synix
Approach API-first memory store Agent-managed memory Temporal knowledge graph Taxonomy-driven memory Build system with pipelines
Incremental rebuilds Yes
Provenance tracking Full chain to source
Architecture changes Migration Migration Migration Migration Rebuild
Schema Fixed Fixed Fixed Fixed You define it

Synix is not a memory store. It's the build system that produces one.

Known Limitations

These are the highest-priority open issues. See the issue tracker for the full backlog.

Issue Priority Description
#53 P0 Parser metadata passthrough — YAML frontmatter fields in source files are not propagated to artifact metadata. Custom fields like author or date are silently dropped.
#52 P0 Validate/verify and trace artifacts — Trace artifacts from provenance tracking can trigger false positives in validators that expect only content artifacts.
#57 P1 Rich search output — Search results show artifact labels but not inline content snippets or provenance context. Requires multiple commands to get the full picture.
#56 P1 Provenance summarization — Lineage output is raw dependency chains. No summarized view or filtering for large graphs.
#55 P1 Pipeline-relative imports — Custom transforms using relative imports fail when the pipeline file is outside the project root.
#54 P1 Non-interactive automation mode — No --quiet / --json output mode for CI or scripted usage. Rich formatting assumes a TTY.
#33 Embedding failures are silent — If embedding generation fails, search indexing silently falls back to keyword-only instead of erroring.

Removed source files are not cleaned up — Deleting a source file does not remove its downstream artifacts. Run uvx synix clean and rebuild to purge orphans.

Links

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

synix-0.10.3.tar.gz (3.5 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

synix-0.10.3-py3-none-any.whl (185.8 kB view details)

Uploaded Python 3

File details

Details for the file synix-0.10.3.tar.gz.

File metadata

  • Download URL: synix-0.10.3.tar.gz
  • Upload date:
  • Size: 3.5 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for synix-0.10.3.tar.gz
Algorithm Hash digest
SHA256 463021d1b0d63fec5fa2e65edc7fc66a7a131d21b1639e57ef70cef72965e435
MD5 4026cbbb659f8082e37f16bf96de4262
BLAKE2b-256 3562eefaa895b85861553d1923505ec736c87de5c21cb041e28f5111a737812a

See more details on using hashes here.

Provenance

The following attestation bundles were made for synix-0.10.3.tar.gz:

Publisher: release.yml on marklubin/synix

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file synix-0.10.3-py3-none-any.whl.

File metadata

  • Download URL: synix-0.10.3-py3-none-any.whl
  • Upload date:
  • Size: 185.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for synix-0.10.3-py3-none-any.whl
Algorithm Hash digest
SHA256 f4c86b5bd5349cb1a2539c5e6ce22e0e19437684c20f9f69135958705c73d0c3
MD5 5665794608c1104ee3615cfbd933d6ab
BLAKE2b-256 a132c221175d2f87c71f80837b69f936442d702cbc33d99afbf097fdea48f2c2

See more details on using hashes here.

Provenance

The following attestation bundles were made for synix-0.10.3-py3-none-any.whl:

Publisher: release.yml on marklubin/synix

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page