LLM-native knowledge framework for agent-maintained Markdown wikis

Syntheca

English | 한국어

LLM-native knowledge framework for agent-maintained wikis

This guide adapts ideas from Andrej Karpathy's "LLM Wiki" pattern and the "LLM Wiki v2" / agentmemory production lessons. See REFERENCES.md and WIKI-ATTRIBUTION.md for full provenance.

Syntheca is an open-source framework for synthesizing scattered documents and source material into a living Markdown knowledge system that LLM agents can understand, maintain, and reuse.

It is not just a document store. Syntheca preserves raw source material, structures it into typed wiki pages, connects concepts and evidence, and keeps the operating rules that agents need to maintain durable context over time. Documents should not remain static archives; they should become executable context that can be queried, validated, updated, and crystallized back into future work.


What is Syntheca?

Syntheca is a framework with early alpha CLI tooling. It provides:

  • Schema: Canonical templates, page types, and frontmatter specifications
  • Workflows: Step-by-step procedures for ingest, query, lint, crystallization, and migration
  • Runtime adapter notes: Instructions for applying the schema with Claude Code, Codex, OpenCode, and MCP-compatible agents
  • Starter examples: A sample wiki that demonstrates the pattern
  • Migration workflow: Agent-guided preservation rules plus alpha dry-run reporting for existing knowledge bases
  • Alpha CLI: syntheca init, syntheca init --dry-run, syntheca init --example, syntheca doctor --fix, syntheca inspect, syntheca lint, syntheca lint --fix, syntheca upgrade --dry-run, and syntheca migrate --mode dry-run for scaffolded workspace experiments

The current alpha focuses on Markdown source material and generated wiki workspaces. Broader sources such as code, conversations, and operational knowledge can be represented as source material today, but built-in connectors and automated extraction are future scope.

What Syntheca is NOT:

  • ❌ A GUI application
  • ❌ A vector database or search engine
  • ❌ A replacement for Obsidian, Notion, or other note-taking apps
  • ❌ A RAG system

What Syntheca IS:

  • ✅ A schema and workflow specification
  • ✅ A set of templates and conventions
  • ✅ Documentation for agents to follow
  • ✅ An agent-guided migration specification and deterministic dry-run report for unstructured knowledge bases
  • ✅ An alpha CLI slice for workspace scaffolding, scaffold previews, example workspaces, structural validation and repair, deterministic inspection, deterministic lint, narrow mechanical lint fixes, upgrade dry-run planning, and migration dry-run

Responsibility Boundary

Syntheca deliberately splits deterministic tooling from semantic wiki work.

| Layer | Owns | Does Not Own |
|-------|------|--------------|
| CLI | Workspace scaffold, scaffold previews, example workspace generation, structural validation and missing-file repair, deterministic inspection, deterministic lint, narrow mechanical lint fixes, upgrade dry-run plans, migration dry-run reports. | Source interpretation, page synthesis, contradiction resolution, broken-link intent, stale-claim judgment, destructive migration. |
| Agent | Ingest, query, crystallization, synthesis, link/index/log maintenance following schema/. | Destructive migration, unreviewed raw source changes, final authority on ambiguous claims. |
| Human | Review, approval, curation, destructive operation decisions, release decisions. | Repeating mechanical checks that the CLI can perform. |

Core Concept: Compilation Over Retrieval

Retrieval-first workflow: Index sources → Retrieve relevant passages at query time → Generate an answer

Persistent wiki workflow: Ingest source → Extract and compile into structured pages → Query maintained wiki → Crystallize valuable answers back into wiki → Knowledge can compound

The wiki is a persistent, compounding artifact. The documented workflows guide agents through cross-reference maintenance, contradiction review, and synthesis updates while humans retain responsibility for curation and quality review.


Quick Start

Option A: Scaffold a workspace with the alpha CLI

From a checkout of this repository:

python -m pip install -e .
syntheca init my-wiki
syntheca init my-wiki-preview --runtimes codex --dry-run
syntheca init my-example-wiki --runtimes codex --example
syntheca doctor my-wiki
syntheca doctor my-wiki --fix
syntheca inspect my-wiki
syntheca lint my-wiki
syntheca lint my-wiki --fix
syntheca upgrade my-wiki --dry-run
syntheca migrate --source ./my-vault --mode dry-run \
  --output-report ./migration-report.md \
  --output-manifest ./migration-manifest.json

Run the test suite with coverage:

python -m pip install -e '.[dev]'
python -m coverage run -m unittest
python -m coverage report

Running syntheca init creates an LLM-ready workspace with README.md, AGENTS.md, optional runtime entrypoints and provider-native skills, raw/, wiki/, and .syntheca/syntheca.yaml.

Human-facing CLI output prioritizes status, summary, and next actions. Add --verbose to doctor, lint, or upgrade --dry-run when you need individual checks or planned file changes. Use --json for machine-readable automation output and --quiet when automation only needs an exit code.
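For automation, the flags above can be wrapped in a small script. The following is a minimal sketch, not part of Syntheca: the helper names build_command and passes are hypothetical, it assumes the flags are accepted after the workspace argument as in the Quick Start commands, and it assumes a zero exit code means the check passed. The structure of the --json output is not described here, so the sketch does not parse it.

```python
import subprocess
from typing import List


def build_command(subcommand: str, workspace: str, *, json_output: bool = False,
                  quiet: bool = False, verbose: bool = False) -> List[str]:
    """Assemble a syntheca command line from the flags named in this README."""
    cmd = ["syntheca", subcommand, workspace]
    if json_output:
        cmd.append("--json")
    if quiet:
        cmd.append("--quiet")
    if verbose:
        cmd.append("--verbose")
    return cmd


def passes(subcommand: str, workspace: str) -> bool:
    """Run a check with --quiet and report whether the exit code was zero.

    Assumption: exit code 0 means success. Raises FileNotFoundError if the
    syntheca executable is not on PATH.
    """
    result = subprocess.run(build_command(subcommand, workspace, quiet=True), check=False)
    return result.returncode == 0
```

For example, passes("lint", "my-wiki") could gate a CI step, while build_command("doctor", "my-wiki", verbose=True) yields the argument list for a more detailed manual run.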

Project status: Alpha CLI available (init, init --dry-run, init --example, doctor, doctor --fix, inspect, lint, lint --fix, upgrade --dry-run, migrate --mode dry-run). Write operations remain narrow: doctor --fix restores missing scaffold/runtime files without overwriting existing files, and lint --fix is limited to mechanical index/frontmatter fixes that do not require LLM judgment. Output/apply migration modes and MCP server work are planned (not implemented in the current alpha CLI).

Option B: Explore the schema-first starter

1. Clone the repository

git clone https://github.com/swj9707/syntheca.git
cd syntheca

Stay at the repository root so your agent can read schema/ and work with the example knowledge base under starter/.

Directory structure:

starter/
├── raw/                    # Your source material (immutable)
│   └── sample-source.md
└── wiki/                   # Agent-maintained wiki
    ├── sources/            # Processed source pages
    ├── entities/           # Named objects (people, projects, products)
    ├── concepts/           # Reusable ideas and patterns
    ├── syntheses/          # Cross-source analysis
    ├── decisions/          # Important choices (ADR pattern)
    ├── unclassified/       # Ambiguous or legacy content
    ├── index.md            # Content catalog
    └── log.md              # Chronological operation record
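
The layout above can be checked mechanically before handing the workspace to an agent. This is an illustrative, read-only sketch under the assumption that the tree shown is the complete expected layout; it is not how syntheca doctor is implemented, and missing_entries is a hypothetical helper name.

```python
from pathlib import Path
from typing import List

# Directories and files taken from the tree above, relative to wiki/.
EXPECTED_DIRS = ["sources", "entities", "concepts", "syntheses", "decisions", "unclassified"]
EXPECTED_FILES = ["index.md", "log.md"]


def missing_entries(root: Path) -> List[str]:
    """Return the expected paths that are absent under root (for example starter/)."""
    missing = []
    if not (root / "raw").is_dir():
        missing.append("raw/")
    wiki = root / "wiki"
    for name in EXPECTED_DIRS:
        if not (wiki / name).is_dir():
            missing.append(f"wiki/{name}/")
    for name in EXPECTED_FILES:
        if not (wiki / name).is_file():
            missing.append(f"wiki/{name}")
    return missing
```

Calling missing_entries(Path("starter")) from the repository root returns an empty list when every directory and file in the tree is present.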

2. Load the schema into your agent

Claude Code:

Read schema/AGENTS.md and schema/adapters/claude/CLAUDE.md

Codex:

Read schema/AGENTS.md and schema/adapters/codex/AGENTS.md

OpenCode:

Read schema/AGENTS.md and schema/adapters/opencode/AGENTS.md

If you maintain a local OpenCode skill for Syntheca, you can load it before reading the schema. The v0.1 repository does not ship a skill package.

3. Ingest a source

Add your source to starter/raw/, then tell the agent:

Use starter/ as the wiki root.
Ingest starter/raw/my-article.md into starter/wiki/.

Ask the agent to follow the documented ingest workflow:

  1. Read the source
  2. Create a source page with summary and key points
  3. Extract entities (people, projects, products) → create entity pages
  4. Extract concepts (ideas, patterns) → create concept pages
  5. Add evidence-grounded cross-links and review whether reciprocal links are appropriate
  6. Update index.md and log.md

4. Query the wiki

What does the wiki say about [topic]?

The documented query workflow instructs the agent to search the wiki, synthesize an answer with citations, and evaluate whether the answer should be crystallized back into the wiki as a synthesis page.

5. Explore the starter

Open starter/wiki/index.md to see:

  • 2 source pages demonstrating ingest and cross-source synthesis
  • 1 entity page (Syntheca framework)
  • 3 concept pages (persistent wiki, crystallization, and three-layer architecture)
  • 1 synthesis page (retrieval, persistent wiki, and hybrid comparison)

All pages are cross-referenced and follow the schema. Run the read-only v0.1 checklist in docs/guides/starter-lint.md after changing the starter wiki. See docs/concepts/framework.md for the framework overview and docs/project/release-checklist.md for the public release gate.


Architecture

Three Layers

  1. Raw Sources (raw/): Immutable source material. Agents read from here but never modify it.

  2. Wiki (wiki/): Agent-maintained structured pages. Agents create, update, and cross-link pages following the schema.

  3. Schema (schema/): Instructions defining how the wiki is maintained. Templates, workflows, and runtime adapters.

Page Types

| Type | Purpose | Example |
|------|---------|---------|
| Source | Summarize raw sources, identify extraction candidates | sources/article-2026-05-30.md |
| Entity | Named objects: people, projects, products, libraries | entities/syntheca.md |
| Concept | Reusable ideas, patterns, methods | concepts/persistent-wiki.md |
| Synthesis | Cross-source analysis, comparisons, crystallized insights | syntheses/wiki-vs-rag.md |
| Decision | Important choices following ADR pattern | decisions/use-markdown.md |
| Unclassified | Ambiguous or legacy content, safe preservation | unclassified/legacy-note.md |
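
The mapping from page type to directory in the table above can be expressed as a small lookup. This is a sketch for illustration only: PageType, TYPE_DIRS, and page_path are hypothetical names, and the directory names are inferred from the example paths in the table.

```python
from enum import Enum
from pathlib import PurePosixPath


class PageType(Enum):
    """The six page types, using the lowercase values seen in frontmatter."""
    SOURCE = "source"
    ENTITY = "entity"
    CONCEPT = "concept"
    SYNTHESIS = "synthesis"
    DECISION = "decision"
    UNCLASSIFIED = "unclassified"


# Directory per type, inferred from the example paths in the table.
TYPE_DIRS = {
    PageType.SOURCE: "sources",
    PageType.ENTITY: "entities",
    PageType.CONCEPT: "concepts",
    PageType.SYNTHESIS: "syntheses",
    PageType.DECISION: "decisions",
    PageType.UNCLASSIFIED: "unclassified",
}


def page_path(page_type: PageType, slug: str) -> str:
    """Build the wiki-relative path for a page of the given type."""
    return str(PurePosixPath(TYPE_DIRS[page_type]) / f"{slug}.md")
```

Under these assumptions, page_path(PageType.CONCEPT, "persistent-wiki") gives concepts/persistent-wiki.md, matching the example column.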

Core Workflows

Ingest (schema/workflows/ingest.md): Process raw sources → create/update pages → maintain cross-references

Query (schema/workflows/query.md): Search wiki → synthesize answer → evaluate crystallization

Lint (schema/workflows/lint.md): Health-check for orphans, broken links, contradictions, stale claims

Crystallization (schema/workflows/crystallization.md): File valuable query answers back into the wiki as synthesis/concept/decision pages

Migration (schema/workflows/migration.md): Guide an agent through safe imports with dry-run, classification heuristics, and unknown field preservation
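
The mechanical half of the lint workflow, broken links and orphans, can be sketched over an in-memory link map. This is an illustration rather than the logic of syntheca lint. It assumes links have already been extracted into wiki-relative paths, that an orphan is a page with no inbound link from another content page, and that index.md and log.md are catalog files which neither count as link sources nor as orphans. The function name lint_links is hypothetical. Contradictions and stale claims need judgment and are out of scope for a check like this.

```python
from typing import Dict, FrozenSet, List, Tuple

CATALOG_FILES: FrozenSet[str] = frozenset({"index.md", "log.md"})


def lint_links(
    pages: Dict[str, List[str]],
    catalog: FrozenSet[str] = CATALOG_FILES,
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Return (broken_links, orphans) for a map of page -> outgoing link targets."""
    broken: List[Tuple[str, str]] = []
    inbound = set()
    for page, targets in sorted(pages.items()):
        for target in targets:
            if target not in pages:
                # The target does not exist anywhere in the wiki.
                broken.append((page, target))
            elif page not in catalog and target != page:
                # Count only links from content pages, ignoring self-links.
                inbound.add(target)
    orphans = sorted(p for p in pages if p not in inbound and p not in catalog)
    return broken, orphans
```

Whether a broken link should be repaired, redirected, or removed is a question of intent, which the responsibility boundary above leaves to the agent and human reviewer.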


Migration from Existing Wikis

The v0.1 agent-guided migration workflow is designed for:

  • Obsidian vaults
  • Notion exports
  • Legacy markdown wikis
  • Flat collections of notes

Process:

  1. Dry-run first: Generate a report without modifying files
  2. Review classification: Check page type assignments (source/entity/concept/synthesis/decision/unclassified)
  3. Apply with explicit approval: Ask an agent to output to a new directory (safer) or modify files in place (destructive)
  4. Preserve unknowns: Custom frontmatter fields are never deleted

Alpha migration dry-run CLI:

syntheca migrate --source ./my-vault --mode dry-run \
  --output-report ./migration-report.md \
  --output-manifest ./migration-manifest.json

Output/apply/in-place migration modes are planned (not implemented in the current alpha CLI).
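
The "preserve unknowns" rule in the process above can be illustrated with a small planning function. This is a sketch under stated assumptions, not the migration implementation: plan_frontmatter is a hypothetical helper, KNOWN_FIELDS is taken from the frontmatter example under Schema Highlights (the canonical list lives in schema/frontmatter.md), and the proposed type would come from a classification step that is not shown.

```python
from typing import Any, Dict, List, Tuple

# Field names from the frontmatter example in this README (assumed, not canonical).
KNOWN_FIELDS = ("title", "type", "status", "created", "updated", "sources", "related", "tags")


def plan_frontmatter(
    existing: Dict[str, Any], proposed_type: str
) -> Tuple[Dict[str, Any], List[str]]:
    """Return (planned_frontmatter, preserved_unknown_keys) without mutating the input."""
    planned = dict(existing)  # start from a copy so every original key survives
    planned.setdefault("type", proposed_type)  # never overwrite an existing type
    unknown = sorted(key for key in existing if key not in KNOWN_FIELDS)
    return planned, unknown
```

Because the function only adds keys to a copy, it fits a dry-run step: the planned frontmatter and the list of preserved custom fields can go into a report for review before anything is written.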

See docs/guides/migration.md for the full guide with examples, docs/guides/cli-reference.md for command details, and docs/guides/configuration.md for generated workspace settings.


Runtime Compatibility

Syntheca is designed to be runtime-neutral. The canonical schema (schema/AGENTS.md) and adapter notes describe how to apply the framework across different agent systems; v0.1 does not claim tested, automated support for every runtime.

| Runtime | Adapter Focus | Relevant Runtime Capabilities |
|---------|---------------|-------------------------------|
| Claude Code | Conversational schema application | Runtime-specific capabilities vary by setup |
| Codex | Repository-oriented schema application | CLI, IDE, cloud/web, app; optional GitHub and SDK integration |
| OpenCode | Multi-agent schema application | Delegation, background execution, and skills depend on runtime configuration |
| MCP Generic | Optional external integration path | Capabilities depend on the selected MCP servers |

See docs/concepts/capability-matrix.md for detailed comparison.

Adapter Documents

  • Claude Code: schema/adapters/claude/CLAUDE.md
  • Codex: schema/adapters/codex/AGENTS.md
  • OpenCode: schema/adapters/opencode/AGENTS.md

Each adapter explains how to apply the canonical schema in that runtime without redefining policy.


Use Cases

Personal Research

Track papers, articles, and notes over weeks or months. Maintained pages can build a more comprehensive picture as sources accumulate and are reviewed.

Reading Companion

Use the documented workflow to file chapter summaries, character pages, and theme pages as you read. The wiki can grow into a cross-referenced reading companion with agent assistance and human review.

Team Knowledge Base

Use meeting transcripts, project documents, and customer notes as curated inputs. The documented workflow guides an agent to maintain project entities, recurring concepts, and reviewed decisions.

Competitive Analysis

Track competitors, products, and market trends. The documented workflow guides agents to consolidate sources and surface potential contradictions for review.

Course Notes, Trip Planning, Hobby Deep-Dives

Anything where you accumulate knowledge over time and want it organized, not scattered.


Schema Highlights

Frontmatter Standards

Every page has structured metadata:

---
title: "Page Title"
type: concept
status: active
created: "2026-05-30"
updated: "2026-05-30"
sources: ["sources/article.md"]
related: ["concepts/other-concept.md"]
tags: [tag1, tag2]
---

See schema/frontmatter.md for canonical field definitions per page type.
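
A baseline check on this metadata can be sketched with the standard library alone. The sketch below is illustrative and makes several assumptions: it reads only top-level key: value lines and keeps values as plain strings, so it is not a YAML parser; it treats type and status as required, following the baseline checks listed in the roadmap; and the names read_frontmatter and check_page are hypothetical. The canonical rules are in schema/frontmatter.md.

```python
from typing import Dict, List

PAGE_TYPES = {"source", "entity", "concept", "synthesis", "decision", "unclassified"}


def read_frontmatter(text: str) -> Dict[str, str]:
    """Extract top-level key: value pairs from a leading --- block as raw strings."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        # Skip indented or list-item lines; only simple top-level keys are read.
        if sep and key.strip() and not line.startswith((" ", "\t", "-")):
            fields[key.strip()] = value.strip().strip("\"'")
    return {}  # no closing delimiter, so treat the block as absent


def check_page(text: str) -> List[str]:
    """Return a list of problems found in a page's frontmatter."""
    fields = read_frontmatter(text)
    problems = [f"missing {name}" for name in ("type", "status") if not fields.get(name)]
    if fields.get("type") and fields["type"] not in PAGE_TYPES:
        problems.append(f"unknown type: {fields['type']}")
    return problems
```

A real validator would use a full YAML parser so that lists such as sources and related are read as lists rather than strings.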

Template Structure

All templates follow a consistent pattern:

  1. Frontmatter: Structured metadata
  2. Required sections: Definition, explanation, relationships, sources
  3. Optional sections: Examples, counterpoints, open questions, contradictions
  4. Authoring rules: Guidelines for agent behavior
  5. Minimum completion criteria: Validation checklist

Quality Standards

Adopted from LLM Wiki v2:

  • Confidence scoring groundwork: Synthesis pages may record a manually assigned page-level evidence score
  • Contradiction handling: Explicit sections for conflicting claims
  • Uncertainty preservation: Limitations and open questions are required
  • Evidence grounding: Claims must cite sources
  • Crystallization threshold: "Would I want to read this 6 months from now?"

Roadmap

v0.1 (Current)

  • ✅ Core schema and templates
  • ✅ Five workflows (ingest, query, lint, crystallization, migration)
  • ✅ Runtime adapter notes (Claude Code, Codex, OpenCode)
  • ✅ Starter examples
  • ✅ Agent-guided migration specification with dry-run

v0.2 alpha (In progress)

  • CLI package skeleton
  • syntheca init workspace scaffold, dry-run preview, and example workspace
  • syntheca doctor structural validation and missing-file repair
  • syntheca inspect deterministic workspace summary
  • syntheca lint deterministic wiki checks
  • syntheca lint --fix narrow mechanical fixes
  • syntheca migrate --mode dry-run report and manifest generation
  • syntheca upgrade --dry-run scaffold sync planning
  • Machine-readable baseline checks for required type/status, links, raw source paths, and index coverage
  • Extended frontmatter validation beyond the deterministic baseline
  • Migration output/apply modes
  • Initial MCP interface design

v0.3 (Future)

  • MCP server implementation
  • Vector and hybrid search integrations
  • Link graph visualization
  • Extended stale-page and supersession checks
  • Claim-level confidence and contradiction review experiments
  • Multi-wiki federation

See docs/project/roadmap.md for detailed feature plans.


Project Status

Current: Pre-release. The v0.1 schema and documentation framework is available together with the v0.2 alpha CLI commands: init, doctor, inspect, lint, lint --fix, upgrade --dry-run, and migrate --mode dry-run.

Not ready for:

  • Production team wikis without human review
  • Large wikis without separately evaluated search and review infrastructure
  • Automated workflows without supervision

Suitable for exploratory evaluation:

  • Small personal research wiki experiments
  • Trying the documented workflows with Claude Code, Codex, or OpenCode
  • Creating an experimental workspace with syntheca init
  • Agent-guided migration experiments (dry-run workflow)
  • Schema evaluation and feedback

Contributing

See CONTRIBUTING.md for guidelines.

High-priority contributions:

  • Runtime adapter notes and evaluation reports for other agents
  • Production experience reports (what works, what breaks)
  • Planned migration tooling design and implementation
  • Template refinements based on usage

Attribution

Syntheca is derived from:

  • Andrej Karpathy's LLM Wiki: Original persistent wiki pattern, three-layer architecture, core operations
  • agentmemory (rohitg00 et al.): Production lessons, confidence scoring, quality standards

See REFERENCES.md and WIKI-ATTRIBUTION.md for detailed attribution.

Syntheca's contribution: formalization, cross-runtime compatibility, an agent-guided migration specification, and concrete templates/workflows.


License

MIT License. See LICENSE for details.


Links

  • Documentation hub: docs/README.md
  • 한국어 문서: README.ko.md and docs/ko/
  • Schema: schema/
  • Starter examples: starter/
  • CLI reference: docs/guides/cli-reference.md
  • Configuration reference: docs/guides/configuration.md
  • Migration guide: docs/guides/migration.md
  • Capability matrix: docs/concepts/capability-matrix.md
  • References: REFERENCES.md

Questions?

  • How is this different from RAG? See starter/wiki/syntheses/persistent-wiki-vs-rag.md
  • How do I migrate my Obsidian vault? See docs/guides/migration.md and docs/ko/migration.md
  • Which runtime should I use? See docs/concepts/capability-matrix.md
  • What are the page types? See schema/page-types.md
  • What are the templates? See schema/templates/*.md

Start small, let it compound.

The schema guides agents through bookkeeping tasks while you curate sources and review quality. Over time, the wiki can become a richer, cross-referenced knowledge base.
