The confluence of information into intelligence - A DSL compiler that transforms qualitative research annotations into canonical knowledge structures
Synesis
The confluence of evidence into auditable knowledge.
A Domain-Specific Language and toolchain for transforming qualitative research annotations into structured, validated, and fully traceable knowledge artifacts.
Copyright (c) 2011–2026 Christian Maciel de Britto
https://github.com/synesis-lang · ORCID
What is Synesis?
Qualitative research — literature reviews, grounded theory, case studies, systematic reviews — generates enormous interpretive work that is typically scattered across unstructured notes, spreadsheets, or proprietary software locked to a single vendor.
Synesis is a semantic compiler for analytical thinking: you write your interpretations in plain-text files with a clean declarative syntax, and the toolchain validates, structures, and exports them as canonical knowledge artifacts. Every concept is traceable to its source file, line, and column. No silent errors. No orphaned codes. No ambiguous relations.
The name comes from the Greek σύνεσις — the convergence of evidence fragments into an intelligible whole. In its biblical sense (Colossians 1:9, synesei pneumatikei), it denotes a form of understanding that connects rather than merely accumulates.
The core differentiator: validation happens at compile time, not at retrieval or graph construction. If a code does not exist in the declared ontology, the compiler rejects the annotation at the source — before any output is produced. This is the architectural distinction that separates Synesis from all known CAQDAS alternatives and RAG pipeline tooling.
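To make the idea of compile-time rejection concrete, here is a minimal sketch. It is not the Synesis implementation: the `CodeRef` type, the `check_codes` function, the line and column numbers, and the diagnostic format are all hypothetical. It shows only the principle that each code reference carries its source location and that one unknown code yields diagnostics instead of output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeRef:
    """A code applied in an annotation, with its source location (hypothetical type)."""
    name: str
    file: str
    line: int
    column: int


def check_codes(refs: list[CodeRef], ontology: set[str]) -> list[str]:
    """Return one diagnostic per reference to a code missing from the ontology."""
    return [
        f"{r.file}:{r.line}:{r.column}: error: unknown code '{r.name}'"
        for r in refs
        if r.name not in ontology
    ]


ontology = {"Social_Cohesion", "Collective_Action"}
refs = [
    CodeRef("Social_Cohesion", "annotations.syn", 9, 11),
    CodeRef("Colective_Action", "annotations.syn", 9, 28),  # deliberate typo
]

diagnostics = check_codes(refs, ontology)
if diagnostics:
    # Reject at the source: report every problem and emit nothing.
    for message in diagnostics:
        print(message)
else:
    print("ok: all codes are declared")
```

Because the check runs before any exporter, a misspelled code never reaches a downstream dataset or graph.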
The Ecosystem
📚 Zotero                               🤖 synesis-coder
PDF annotations            ──►          AI-assisted annotation
(zotero-synesis-export)                 generates full .syn files
        │                                        │
        ▼                                        ▼
📄 .syn / .synt / .syno / .synp  (Synesis source files)
        │
        ▼
⚙️ Synesis Compiler (this package)
LALR(1) parser · AST validator · multi-artifact exporter
        │
   ┌────┴──────────────────────────────────┐
   ▼                                       ▼
🐍 Python API                      📊 Structured outputs
synesis.load()                     JSON · CSV · Excel
to_dataframe()                     REFI-QDA · DOCX · Alpaca JSONL
   │                                       │
   ▼                                       ▼
📓 Jupyter Notebooks               🕸️ Neo4j / Memgraph
data science · visualization       graph queries via MCP
        │
🧠 synesis-lsp ──► 🖥️ Synesis Explorer (VS Code)
Language Server    real-time diagnostics · graph viewer
| Repository | Language | Role |
|---|---|---|
| synesis ← this | Python | Compiler, parser, validator, exporters, Python API |
| synesis-lsp | Python | Language Server — diagnostics, hover, completion, semantic tokens |
| synesis-explorer | JS/TS | VS Code extension — tree views, graph viewer, themes |
| zotero-synesis-export | JavaScript | Zotero 7 plugin — exports PDF highlights as plain .syn |
| synesis-graph | Python | Import compiled knowledge into Neo4j / Memgraph |
| synesis-coder | Python | AI-assisted annotation — generates fully coded .syn files |
Installation
pip install synesis
Requires Python 3.10+.
A Complete Example
references.bib
@article{smith2024,
author = {Smith, Jane},
title = {Understanding Community Resilience},
journal = {Journal of Social Research},
year = {2024},
volume = {12},
pages = {45--67}
}
template.synt — field schema and validation rules
SOURCE FIELDS
OPTIONAL description
END SOURCE FIELDS
FIELD description TYPE TEXT
SCOPE SOURCE
DESCRIPTION General context or summary of the data source
GUIDELINES
Summarize the source purpose in 1-2 sentences.
Do not add analytical interpretation.
END GUIDELINES
END FIELD
ITEM FIELDS
REQUIRED citation, note, code
END ITEM FIELDS
FIELD citation TYPE QUOTATION
SCOPE ITEM
DESCRIPTION Direct quote or selected excerpt from the data source
GUIDELINES
Extract a complete, self-contained excerpt of 1-3 sentences.
Preserve the original wording. Do not paraphrase.
END GUIDELINES
END FIELD
FIELD note TYPE MEMO
SCOPE ITEM
DESCRIPTION Analytical memo recording interpretations or causal reasoning
GUIDELINES
Explain the analytical significance in 1-3 sentences.
Distinguish textual evidence from your interpretation.
END GUIDELINES
END FIELD
FIELD code TYPE CODE
SCOPE ITEM
DESCRIPTION Ontology codes applied to this excerpt
GUIDELINES
Apply only codes supported by the excerpt.
Every code must have a corresponding ONTOLOGY entry.
END GUIDELINES
END FIELD
ONTOLOGY FIELDS
REQUIRED definition, group
END ONTOLOGY FIELDS
FIELD definition TYPE TEXT
SCOPE ONTOLOGY
DESCRIPTION Clear definition of the code with inclusion/exclusion criteria
END FIELD
FIELD group TYPE TOPIC
SCOPE ONTOLOGY
DESCRIPTION Broader thematic domain that groups related codes
END FIELD
annotations.syn — your research data
SOURCE @smith2024
description: Qualitative study on community resilience strategies in urban contexts.
END SOURCE
ITEM @smith2024
citation: "People here look out for each other. When the flood came, nobody waited
for official help — neighbors just organized themselves."
note: Participant describes spontaneous collective action as a primary resilience
mechanism, bypassing formal institutions. Suggests strong bonding social capital.
code: Social_Cohesion, Collective_Action
END ITEM
ontology.syno — controlled vocabulary
ONTOLOGY Social_Cohesion
definition: The degree to which community members trust, support, and cooperate
with one another. Applies when participants describe solidarity or mutual aid.
group: Community_Resilience
END ONTOLOGY
ONTOLOGY Collective_Action
definition: Coordinated efforts by community members to address shared challenges
without formal institutional direction.
group: Community_Resilience
END ONTOLOGY
project.synp — the entry point
PROJECT demo
TEMPLATE "template.synt"
INCLUDE BIBLIOGRAPHY "references.bib"
INCLUDE ANNOTATIONS "annotations.syn"
INCLUDE ONTOLOGY "ontology.syno"
END PROJECT
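The project file ties the four inputs together, which is what lets references be checked across files. As a rough illustration of one such check (matching `@key` references against BibTeX entry keys), here is a regex-based sketch. It is an assumption-laden stand-in, not the compiler's parser: the patterns, the `undefined_references` function, and the key `smith2025` are invented for the example.

```python
import re

# Matches the key in entries such as "@article{smith2024," (simplified BibTeX).
BIB_ENTRY = re.compile(r"@\w+\s*\{\s*([^,\s}]+)\s*,")
# Matches "SOURCE @key" or "ITEM @key" at the start of a line.
ANNOTATION_REF = re.compile(r"^[ \t]*(?:SOURCE|ITEM)[ \t]+@(\S+)", re.MULTILINE)


def undefined_references(bib_text: str, syn_text: str) -> list[tuple[int, str]]:
    """Return (line number, key) for every @key that has no BibTeX entry."""
    known = set(BIB_ENTRY.findall(bib_text))
    problems = []
    for match in ANNOTATION_REF.finditer(syn_text):
        key = match.group(1)
        if key not in known:
            # Line numbers are 1-based: count the newlines before the key.
            line = syn_text.count("\n", 0, match.start(1)) + 1
            problems.append((line, key))
    return problems


bib_text = """@article{smith2024,
  author = {Smith, Jane},
  year = {2024}
}
"""

syn_text = """SOURCE @smith2024
END SOURCE

ITEM @smith2025
END ITEM
"""

problems = undefined_references(bib_text, syn_text)
for line, key in problems:
    print(f"annotations.syn:{line}: error: unknown reference @{key}")
```

A real parser would also report the column and handle BibTeX edge cases such as `@comment` and `@string` blocks, which this sketch ignores.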
CLI
# Compile a project and generate all output artifacts
synesis compile project.synp --output results/
# Validate syntax and integrity without generating output
synesis check annotations.syn
# Validate template structure and consistency
synesis validate-template template.synt
# Show version and authorship
synesis --version
# Show full intellectual genealogy
synesis --credits
Python API
The API takes file contents as strings, so compilation itself runs in memory. The example below reads the project files from disk only to obtain those strings:
from pathlib import Path

import synesis


def read(name):
    """Return the text of a project file."""
    return Path(name).read_text(encoding="utf-8")


result = synesis.load(
    project_content=read("project.synp"),
    template_content=read("template.synt"),
    annotation_contents={"annotations.syn": read("annotations.syn")},
    ontology_contents={"ontology.syno": read("ontology.syno")},
    bibliography_content=read("references.bib"),
)

if result.success:
    # Export as pandas DataFrames
    items_df = result.to_dataframe("items")
    codes_df = result.to_dataframe("codes")
    chains_df = result.to_dataframe("chains")
    # Export as JSON
    data = result.to_json_dict()
    # Compilation stats
    print(result.stats)
    # CompilationStats(source_count=1, item_count=1, ontology_count=2, code_count=2)
else:
    # Compilation failed: print every diagnostic
    for diagnostic in result.get_diagnostics():
        print(diagnostic)
Available tables: sources, items, ontologies, codes, chains.
Language Features
Sources & Items — Every annotation is traceable to a BibTeX reference. The compiler validates each @key against the bibliography at compile time.
Templates — Define field schemas with types (CODE, TEXT, CHAIN, SCALE, QUOTATION, MEMO...), validation rules (REQUIRED, OPTIONAL, FORBIDDEN), and constraints (ARITY, BUNDLE, VALUES). The template is the contract between the researcher and the compiler.
Ontologies — Controlled vocabularies validated at compile time. Every code must exist in the declared ontology — typos and orphaned concepts are caught immediately, at the source.
Chains — Causal or relational links: Trust -> ENABLES -> Acceptance. Validated against declared RELATIONS and ARITY constraints.
GUIDELINES — Instructional prose embedded in template field definitions, visible to human annotators and LLM coders, never parsed as code.
Deterministic multi-artifact emission — A single compilation pass produces JSON, CSV, Excel, REFI-QDA, DOCX, and Alpaca JSONL simultaneously. All-or-nothing: either every artifact is valid, or nothing is emitted.
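The all-or-nothing rule can be sketched in a few lines. This is an illustration of the general technique (render everything first, write only afterwards), not the exporter code shipped with Synesis. The `emit_all` function, the toy exporters, and the file names are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Callable

Exporter = Callable[[dict], str]


def emit_all(data: dict, exporters: dict[str, Exporter], out_dir: Path) -> list[Path]:
    """Render every artifact in memory; write files only if all renders succeed."""
    # Stage 1: render. Any exception here aborts before the disk is touched.
    rendered = {name: export(data) for name, export in exporters.items()}

    # Stage 2: write to a staging directory, then move each file into place.
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with tempfile.TemporaryDirectory(dir=out_dir) as staging:
        for name, text in rendered.items():
            (Path(staging) / name).write_text(text, encoding="utf-8")
        for name in rendered:
            target = out_dir / name
            os.replace(Path(staging) / name, target)
            written.append(target)
    return written


def to_json(data: dict) -> str:
    return json.dumps(data, indent=2, sort_keys=True)


def to_csv(data: dict) -> str:
    rows = ["code,group"] + [f"{c},{g}" for c, g in sorted(data["codes"].items())]
    return "\n".join(rows) + "\n"


def broken(data: dict) -> str:
    raise ValueError("exporter failed")


data = {"codes": {"Social_Cohesion": "Community_Resilience",
                  "Collective_Action": "Community_Resilience"}}

with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "results"
    try:
        emit_all(data, {"codes.json": to_json, "bad.txt": broken}, out)
    except ValueError:
        pass
    failed_run_left_output = out.exists()  # False: nothing was written

    files = emit_all(data, {"codes.json": to_json, "codes.csv": to_csv}, out)
    names = sorted(p.name for p in files)

print(failed_run_left_output, names)
```

Sorting keys and rows keeps the output byte-identical across runs, which is one simple way to get deterministic artifacts. The final moves are not atomic as a group, so a stricter variant would swap a whole directory into place instead.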
File Types
| Extension | Purpose |
|---|---|
| .syn | Annotation files: sources and items |
| .synp | Project file: declares template, bibliography, includes |
| .synt | Template file: field schema and validation rules |
| .syno | Ontology file: controlled vocabulary of codes |
| .bib | BibTeX bibliography (standard format) |
Potential Applications
| Domain | How Synesis helps |
|---|---|
| Systematic literature reviews | Annotate hundreds of papers with a shared template; export clean datasets for meta-analysis |
| Grounded Theory / Thematic Analysis | Build and validate code systems with ontological constraints; trace every code to its source |
| Mixed-methods research | Bridge qualitative interpretation with quantitative formats for R or Python workflows |
| Knowledge graphs | Compile research findings into Neo4j; model causal chains as graph edges |
| AI-augmented analysis | Feed structured annotations as context to LLMs via MCP; responses traceable to source evidence |
| Biblical / exegetical studies | Code canonical texts with relational chains; integrate classical and patristic corpora |
| Longitudinal projects | Template versioning and strict validation prevent concept drift across research phases |
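For the knowledge-graph use case, a chain such as `Trust -> ENABLES -> Acceptance` maps naturally onto a directed edge. The sketch below shows one way to do that mapping. It assumes that longer chains alternate code, relation, code, and it invents the `Code` node label, the `INCREASES` relation, and the `Adoption` code. The schema that synesis-graph actually uses may differ.

```python
def parse_chain(chain: str, relations: set[str]) -> list[tuple[str, str, str]]:
    """Split a chain into (source, relation, target) triples."""
    parts = [part.strip() for part in chain.split("->")]
    # Assumed shape: code, relation, code, ... so the length is odd and at least 3.
    if len(parts) < 3 or len(parts) % 2 == 0:
        raise ValueError(f"malformed chain: {chain!r}")
    edges = []
    for i in range(0, len(parts) - 2, 2):
        source, relation, target = parts[i], parts[i + 1], parts[i + 2]
        if relation not in relations:
            raise ValueError(f"undeclared relation: {relation!r}")
        edges.append((source, relation, target))
    return edges


def to_cypher(edge: tuple[str, str, str]) -> str:
    """Render one triple as a Cypher MERGE statement (hypothetical schema)."""
    source, relation, target = edge
    return (
        f"MERGE (a:Code {{name: '{source}'}}) "
        f"MERGE (b:Code {{name: '{target}'}}) "
        f"MERGE (a)-[:{relation}]->(b)"
    )


relations = {"ENABLES", "INCREASES"}
edges = parse_chain("Trust -> ENABLES -> Acceptance -> INCREASES -> Adoption", relations)
statements = [to_cypher(edge) for edge in edges]
for statement in statements:
    print(statement)
```

String interpolation is acceptable here only because the names were already validated. A production importer would pass values as query parameters instead.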
Architecture
synesis compile project.synp
│
Lark LALR(1) parser
│
AST Transformer
│
Semantic Validator ◄── ontology · bibliography · template contract
│
Exporters (single pass, all-or-nothing)
├── JSON
├── CSV
├── Excel
├── REFI-QDA
├── DOCX
└── Alpaca JSONL (fine-tuning datasets for open-weight LLMs)
The compiler exposes compile_string() for integration with synesis-lsp, enabling real-time diagnostics in the VS Code extension without spawning a subprocess.
VS Code Integration
The Synesis Explorer extension (requires synesis-lsp) provides:
- Real-time diagnostics — errors and warnings as you type
- Semantic syntax highlighting — AST-driven, not regex
- Tree explorers for References, Codes, Relations, and Ontology
- Go-to-definition, rename, and hover documentation
- Relation graph viewer (Mermaid → SVG)
- Abstract viewer with BibTeX highlights
- Synesis Dark and Light themes
Compatibility
| Package | Latest | Requires synesis | Python |
|---|---|---|---|
| synesis | 0.6.0 | — | ≥ 3.10 |
| synesis-coder | 0.4.1 | ≥ 0.5.5 | ≥ 3.10 |
| synesis-lsp | 0.16.0 | ≥ 0.5.5 | ≥ 3.10 |
| synesis-graph | 0.2.0 | ≥ 0.5.5 | ≥ 3.10 |
Intellectual Genealogy
Synesis is the formal culmination of a research and development trajectory spanning more than a decade. Its architecture, domain vocabulary, and methodological requirements emerged from successive implementations across qualitative research, professional consultancy, and biblical hermeneutics:
| Period | Work | Contribution |
|---|---|---|
| 2011–2013 | BDM — Banco de Dados Multimodal | First definition of: sources, items, factors, relations, ontology, knowledge graph as an integrated structure |
| 2016–2018 | SocioAtlas | CAQDAS ecosystem integrating annotations, audit trails, Zotero, and graph visualization |
| 2019–2020 | DSAP annotation pipeline | Professional validation of the corpus → item → summary → theme → score audit trail |
| 2022 | SocioAtlas para Google Sheets | Collaborative access; first attempt at systematic theological annotation in the same framework |
| 2024 | DGT7 | Text-file knowledge representation; exposed the need for formal, validatable syntax |
All prior works are authored by Christian Maciel de Britto. The NOTICE file in the repository records the formal copyright notices for each predecessor work.
License
MIT — see LICENSE.
The outputs generated by Synesis (compiled knowledge artifacts — JSON, CSV, Excel, REFI-QDA, DOCX, Alpaca JSONL) are not covered by this license. You retain full ownership of your research data and all compiled outputs.
A license change to AGPL-3.0-only (with Synesis Data Output Exception) is planned for an upcoming release. This will not affect existing users' right to use their compiled outputs.
Author
Dr. Christian Maciel de Britto
Researcher · Software author · Knowledge engineer
"True σύνεσις — the convergence of evidence fragments into an intelligible, auditable, and technically rigorous whole."