Ontology-aligned middleware between agents and graph databases
SEOCHO
Ontology-aligned middleware between your agents and your graph database.
You declare the ontology. You call add() and ask().
SEOCHO keeps graph writes, semantic artifacts, and agent behavior aligned
to that one schema contract across local SDK and runtime paths.
flowchart LR
D["Your docs"] --> E["Extraction"]
O{{"🧬 Ontology<br/>your schema"}} -.governs.-> E
O -.governs.-> V
E --> V["Validate<br/>+ readiness gate"]
V --> G[("Graph<br/>LadybugDB / DozerDB")]
G --> A["Ontology-grounded<br/>answers"]
style O fill:#fef3c7,stroke:#f59e0b,stroke-width:2px
SEOCHO is a fit when:
- you need extraction, Cypher generation, and answers to stay in-schema
- you want one ontology to drive SDK, runtime, and graph contracts together
- you need files, artifacts, and traces to stay visible instead of disappearing behind a managed memory black box
Start here:
| If you want to... | Go here |
|---|---|
| get a first local success path | Quickstart |
| follow a runnable notebook walkthrough | examples/quickstart.ipynb |
| understand SEOCHO with a guided beginner walkthrough | Beginner Guide |
| see a runnable usecase demo | Usecases |
| bring your own ontology and files | Apply Your Data |
| use the Python SDK directly | Python SDK Quickstart |
| declare graph-model-aware indexing in YAML | Indexing Design Specs |
| inspect files, artifacts, and traces | Files and Artifacts |
| understand the system design | Architecture Deep Dive |
| present the product and architecture | Overview Deep-Dive Deck |
Quick Start
uv pip install "seocho[local]" # zero-config local SDK, embedded LadybugDB by default
# or: uv pip install "seocho[embedded]" # minimal embedded graph path
from seocho import Seocho, Ontology, NodeDef, RelDef, Property
# 1. Define your schema
ontology = Ontology(
name="my_domain",
nodes={
"Person": NodeDef(properties={"name": Property(str, unique=True)}),
"Company": NodeDef(properties={"name": Property(str, unique=True)}),
},
relationships={
"WORKS_AT": RelDef(source="Person", target="Company"),
},
)
# 2. Zero-config local client: uses embedded LadybugDB, no server needed
s = Seocho.local(ontology)
# 3. Index
s.add("Marie Curie worked at the University of Paris.")
# 4. Query
print(s.ask("Where did Marie Curie work?"))
Remote runtime client:
from seocho import Seocho
client = Seocho.remote("http://localhost:8001")
print(client.ask("What do we know about ACME?"))
client.ask(...) above is the HTTP chat convenience surface. It is not the
same execution engine as runtime client.react(...) or client.advanced(...).
Run the local platform stack:
make setup-env
make up
Install Paths
| Path | Install | What else you need |
|---|---|---|
| HTTP client mode | pip install seocho | a running SEOCHO runtime (base_url=...) |
| Local SDK engine | pip install "seocho[local]" | provider credentials; Neo4j/DozerDB only if you pass a Bolt URI |
| Repository development | pip install -e ".[dev]" | local clone + test/tooling deps |
| Offline ontology governance | pip install "seocho[ontology]" | local ontology files only |
- pip install seocho is intentionally thin: enough for HTTP client mode.
- Seocho.local(ontology) defaults to embedded LadybugDB at .seocho/local.lbug.
- DozerDB/Neo4j is the production graph path: pass graph="bolt://..." or construct Neo4jGraphStore(...) explicitly.
- The fastest full local stack is make setup-env && make up.
- examples/quickstart.ipynb reads provider keys from .env, stays on LadybugDB by default, and switches to Bolt-backed Neo4j/DozerDB only when both NEO4J_URI and NEO4J_PASSWORD are set.
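The backend switch described for examples/quickstart.ipynb can be sketched in a few lines. This is an illustration only, not the notebook's actual code: the helper name choose_graph_target and the returned dictionary shape are assumptions. Only the NEO4J_URI / NEO4J_PASSWORD variables and the .seocho/local.lbug default path come from the notes above.

```python
import os
from typing import Mapping, Optional


def choose_graph_target(env: Optional[Mapping[str, str]] = None) -> dict:
    """Pick a graph target from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    uri = env.get("NEO4J_URI", "").strip()
    password = env.get("NEO4J_PASSWORD", "").strip()
    # Switch to Bolt only when BOTH variables are set and non-empty.
    if uri and password:
        return {"backend": "bolt", "uri": uri}
    # Otherwise stay on the embedded default path named in this README.
    return {"backend": "ladybug", "path": ".seocho/local.lbug"}


print(choose_graph_target({}))
print(choose_graph_target({"NEO4J_URI": "bolt://localhost:7687", "NEO4J_PASSWORD": "password"}))
```

Requiring both variables avoids a half-configured state where a URI is present but authentication would fail at the first write.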
Execution Surfaces
The same Seocho facade exposes different execution engines. This is the
single most important thing to understand before benchmarking or comparing
providers.
| Surface | Where it runs | What it actually does | Tool use |
|---|---|---|---|
| Seocho.local(...).ask(...) | in-process local SDK | ontology-aware local query + answer synthesis | no runtime agent loop |
| Seocho(base_url=...).ask(...) | HTTP runtime | /api/chat memory/chat convenience endpoint | not the explicit react/debate path |
| client.semantic(...) | HTTP runtime | deterministic semantic graph QA with optional bounded repair | no agentic tool loop |
| client.react(...) | HTTP runtime | router agent path backed by the Agents runtime | yes |
| client.advanced(...) / client.debate(...) | HTTP runtime | multi-agent debate with semantic preflight + supervisor synthesis | yes |
If you want provider-native reasoning and tool-use comparisons, use
client.react(...) or client.advanced(...) against a running runtime. Do not
use local ask() as that benchmark target.
Why SEOCHO
Built for graph-native teams that need a stronger contract between ontology, runtime, and agent behavior.
- ontology-first, not prompt-first
- graph-native, not vector-only
- schemaless property graph plus agent-visible semantic overlay
- governed artifacts, not ad hoc schema drift
- local SDK authoring and runtime consumption on one contract
Architecture Overview
Two planes share one ontology:
- Data Plane (seocho/index/): files → extraction → validation → graph write
- Control Plane (seocho/query/): ontology → prompt strategy → Cypher → answer synthesis
- Ontology (seocho/ontology.py): single source of truth for both planes, and for the runtime artifact contract
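To make the shared-ontology idea concrete, here is a minimal sketch of a validation gate that checks extracted triples against a declared schema before anything is written. It uses plain dictionaries and does not call SEOCHO. The names SCHEMA, validate_triple, and split_by_validity are hypothetical, and the real pipeline may validate very differently.

```python
from typing import Dict, List, Tuple

# Hypothetical schema: node labels plus the allowed (source, target) per relationship.
SCHEMA = {
    "nodes": {"Person", "Company"},
    "relationships": {"WORKS_AT": ("Person", "Company")},
}

# A triple is ((label, name), relationship, (label, name)).
Triple = Tuple[Tuple[str, str], str, Tuple[str, str]]


def validate_triple(triple: Triple, schema: Dict) -> List[str]:
    """Return schema violations; an empty list means the triple is in-schema."""
    (src_label, _), rel, (dst_label, _) = triple
    errors = []
    for label in (src_label, dst_label):
        if label not in schema["nodes"]:
            errors.append(f"unknown node label: {label}")
    expected = schema["relationships"].get(rel)
    if expected is None:
        errors.append(f"unknown relationship: {rel}")
    elif expected != (src_label, dst_label):
        errors.append(f"{rel} expects {expected[0]} -> {expected[1]}")
    return errors


def split_by_validity(triples: List[Triple], schema: Dict) -> Tuple[List[Triple], List[Triple]]:
    """Split triples into (accepted, rejected); only accepted ones would reach the graph."""
    accepted, rejected = [], []
    for triple in triples:
        (rejected if validate_triple(triple, schema) else accepted).append(triple)
    return accepted, rejected


good = (("Person", "Marie Curie"), "WORKS_AT", ("Company", "University of Paris"))
bad = (("Company", "University of Paris"), "WORKS_AT", ("Person", "Marie Curie"))
accepted, rejected = split_by_validity([good, bad], SCHEMA)
print(len(accepted), "accepted;", len(rejected), "rejected")
```

The same schema object could in principle also feed prompt generation on the query side, which is the point of keeping one source of truth for both planes.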
The Seocho class is a thin public facade. Canonical engine logic lives under
seocho/local_engine.py, seocho/client_remote.py, and seocho/client_bundle.py
so the facade stays small. Runtime transport is runtime/agent_server.py;
shared runtime composition lives in runtime/server_runtime.py.
For the full story, including control plane vs data plane, internal orchestration seams
(DomainEvent, IngestionFacade, QueryProxy, AgentFactory,
AgentStateMachine), and the staged extraction/ → runtime/ migration,
see docs/ARCHITECTURE.md and
docs/RUNTIME_PACKAGE_MIGRATION.md.
Choose Your Runtime Shape
| Mode | Constructor | Best for |
|---|---|---|
| HTTP client | Seocho(base_url="http://localhost:8001", workspace_id="default") | consume an existing runtime over HTTP |
| Embedded local | Seocho.local(ontology) | serverless hello world, SDK authoring, experiments |
| Explicit local engine | Seocho(ontology=..., graph_store=..., llm=...) | direct graph-store control |
| Local platform runtime | make up or seocho serve | UI + API + DozerDB on one machine |
Core parameters you will hit early:
- base_url: remote SEOCHO runtime root for HTTP client mode
- workspace_id: logical scope passed through runtime-facing requests
- graph_store: explicit graph store for local engine mode
- reasoning_mode + repair_budget: bounded semantic repair loop for hard questions
- max_steps: runtime agent turn limit for react/debate
- tool_budget: runtime tool-call budget for react/debate
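The idea of a bounded repair loop can be illustrated with a generic sketch. How SEOCHO implements reasoning_mode and repair_budget internally is not described here, so everything below (ask_with_repair, the run_query and repair callables, the demo functions) is a hypothetical stand-in showing only the budgeted retry pattern.

```python
from typing import Callable, Optional, Tuple


def ask_with_repair(
    question: str,
    run_query: Callable[[str], Optional[str]],
    repair: Callable[[str, int], str],
    repair_budget: int = 2,
) -> Tuple[Optional[str], int]:
    """Try a query, then retry with repaired variants at most repair_budget times."""
    attempt = question
    answer = run_query(attempt)
    repairs_used = 0
    while answer is None and repairs_used < repair_budget:
        repairs_used += 1
        attempt = repair(attempt, repairs_used)
        answer = run_query(attempt)
    return answer, repairs_used


def demo_run_query(query: str) -> Optional[str]:
    # Stand-in for a graph lookup that only understands the word "acquire".
    return "Beta" if "acquire" in query else None


def demo_repair(query: str, attempt_number: int) -> str:
    # Stand-in for a repair step that rewrites the query toward schema vocabulary.
    return query.replace("buy", "acquire")


print(ask_with_repair("Who did ACME buy?", demo_run_query, demo_repair, repair_budget=2))
```

A budget of zero degrades to a single attempt, which keeps latency and cost predictable for easy questions.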
For a production local engine, Neo4jGraphStore works against both Neo4j and DozerDB over Bolt:
from seocho.store import Neo4jGraphStore
store = Neo4jGraphStore("bolt://localhost:7687", "neo4j", "password")
Common Use Cases
1. Consume an existing SEOCHO runtime over HTTP
from seocho import Seocho
client = Seocho(base_url="http://localhost:8001", workspace_id="default")
print(client.ask("What do we know about ACME?"))
Use ask() here as a convenience chat surface. When you need explicit runtime
graph QA or agentic behavior, call client.semantic(...), client.react(...),
or client.advanced(...) directly.
2. Build locally against your own ontology with no graph server
from seocho import Seocho, Ontology
client = Seocho.local(Ontology.from_jsonld("schema.jsonld"))
client.add("ACME acquired Beta in 2024.")
print(client.ask("Who did ACME acquire?", reasoning_mode=True, repair_budget=2))
3. Build locally against a production graph server
from seocho import Seocho, Ontology
from seocho.store import Neo4jGraphStore, OpenAIBackend
client = Seocho(
ontology=Ontology.from_jsonld("schema.jsonld"),
graph_store=Neo4jGraphStore("bolt://localhost:7687", "neo4j", "password"),
llm=OpenAIBackend(model="gpt-4o-mini"),
workspace_id="default",
)
client.add("ACME acquired Beta in 2024.")
print(client.ask("Who did ACME acquire?", reasoning_mode=True, repair_budget=2))
4. Promote the same ontology into runtime artifacts
artifacts = client.approved_artifacts_from_ontology()
prompt_context = client.prompt_context_from_ontology(
instructions=["Prefer finance ontology labels and relationships."]
)
draft = client.artifact_draft_from_ontology(name="finance_core_v1")
5. Run the local platform stack with UI + API + graph DB
make setup-env
make up
- UI: http://localhost:8501
- API docs: http://localhost:8001/docs
- DozerDB browser: http://localhost:7474
See docs/FILES_AND_ARTIFACTS.md for where
schema.jsonld, graph data, rule profiles, semantic artifacts, and traces live.
What the Ontology Controls
| Stage | What happens |
|---|---|
| Extraction | Entity types + relationships in LLM prompt |
| Querying | Schema-aware Cypher generation and repair prompts |
| Validation | SHACL shapes derived → catches type/cardinality errors |
| Constraints | UNIQUE/INDEX generated from ontology, applied to Neo4j |
| Denormalization | Cardinality rules determine safe flattening |
| Glossary | SKOS-style vocabulary terms, aliases, and hidden labels compiled into the ontology context identity |
| Reasoning | Optional low-quality retry re-extracts with ontology guidance |
| Runtime parity | Same ontology can be converted into approved semantic artifacts and typed prompt context |
| Agent context | Stable ontology context hash follows indexing, graph writes, query traces, and agent hand-off metadata |
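As an illustration of the Constraints row, the sketch below derives uniqueness constraints from a plain-dict ontology. The dictionary layout, the function name unique_constraints, and the constraint naming scheme are assumptions. The emitted statements use the Neo4j 5 CREATE CONSTRAINT ... REQUIRE ... IS UNIQUE form, and SEOCHO's own generated statements may differ.

```python
from typing import Dict, List


def unique_constraints(ontology: Dict[str, Dict[str, dict]]) -> List[str]:
    """Build Neo4j 5-style uniqueness constraints from a plain-dict ontology."""
    statements = []
    for label in sorted(ontology):
        for prop, spec in sorted(ontology[label].items()):
            if spec.get("unique"):
                # Hypothetical naming scheme: <label>_<property>_unique
                name = f"{label.lower()}_{prop}_unique"
                statements.append(
                    f"CREATE CONSTRAINT {name} IF NOT EXISTS "
                    f"FOR (n:{label}) REQUIRE n.{prop} IS UNIQUE"
                )
    return statements


ontology_spec = {
    "Person": {"name": {"type": "str", "unique": True}},
    "Company": {"name": {"type": "str", "unique": True}, "founded": {"type": "int"}},
}
for statement in unique_constraints(ontology_spec):
    print(statement)
```

Sorting labels and properties keeps the output deterministic, so repeated runs against the same ontology produce the same statements in the same order.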
Local SDK writes persist compact _ontology_* graph properties on nodes and
relationships. Queries and agent tools compare the active ontology context
hash with hashes in the graph and surface any mismatch as ontology_context_mismatch
in trace/tool metadata. This is a guardrail that signals when a graph may need
re-indexing under a new ontology profile.
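The mismatch check can be pictured with a small sketch. The hashing scheme (SHA-256 over canonical JSON, truncated to 16 hex characters), the helper names, and the result shape are all assumptions for illustration. Only the ontology_context_mismatch flag name comes from the paragraph above.

```python
import hashlib
import json
from typing import Dict, Iterable


def context_hash(ontology: Dict) -> str:
    """Hash a canonical JSON form so key order does not change the result."""
    canonical = json.dumps(ontology, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def check_graph_hashes(active: str, stored: Iterable[str]) -> Dict:
    """Compare the active hash with hashes found on graph nodes and relationships."""
    stale = sorted(set(stored) - {active})
    return {"ontology_context_mismatch": bool(stale), "stale_hashes": stale}


v1 = {"name": "my_domain", "nodes": ["Person", "Company"]}
v2 = {"name": "my_domain", "nodes": ["Person", "Company", "Product"]}

# Pretend the graph was indexed under v1 and the active ontology is now v2.
stored_on_graph = [context_hash(v1), context_hash(v1)]
print(check_graph_hashes(context_hash(v2), stored_on_graph))
```

Because the hash is computed from a canonical serialization, reordering keys in the ontology file does not trigger a false mismatch, while adding a label does.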
Key Features
# Index a directory (supports .txt, .md, .csv, .json, .jsonl, .pdf)
s.index_directory("./my_data/")
# Category-aware extraction (8 filing-domain presets)
s.add(text, category="Financials")
# Query with reasoning mode
s.ask("question", reasoning_mode=True, repair_budget=2)
# Swappable LLM providers (OpenAI, DeepSeek, Kimi, Grok, Qwen)
from seocho.store import OpenAIBackend, DeepSeekBackend
llm = OpenAIBackend(model="gpt-4o-mini")
# Agent session: context persists across add/ask within one session
with s.session("my_analysis") as sess:
sess.add("ACME acquired Beta in 2024.")
sess.add("Beta provides risk analytics to ACME.")
answer = sess.ask("What does ACME own or use?")
# Schema as code (JSON-LD canonical storage + SHACL export)
ontology.to_jsonld("schema.jsonld")
ontology = Ontology.from_jsonld("schema.jsonld")
# Ontology merge + diff (for migration)
combined = finance_onto.merge(legal_onto)
For the rest (experiment workbench, tracing backends, supervisor + hand-off config, offline governance CLI, multi-ontology per database), see seocho.blog/sdk.
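For intuition about what an ontology merge and diff might involve during migration, here is a sketch over plain dictionaries that map node labels to property names. It does not use the SDK's Ontology.merge, whose conflict rules are not described here. The helpers merge_labels and diff_labels and their semantics (set union, added/removed report) are assumptions.

```python
from typing import Dict, Set

LabelMap = Dict[str, Set[str]]


def merge_labels(a: LabelMap, b: LabelMap) -> LabelMap:
    """Union node labels; for shared labels, union their property names."""
    merged = {label: set(props) for label, props in a.items()}
    for label, props in b.items():
        merged.setdefault(label, set()).update(props)
    return merged


def diff_labels(old: LabelMap, new: LabelMap) -> dict:
    """Report labels added or removed, and properties added to surviving labels."""
    report = {
        "added_labels": sorted(set(new) - set(old)),
        "removed_labels": sorted(set(old) - set(new)),
        "added_properties": {},
    }
    for label in sorted(set(old) & set(new)):
        extra = sorted(new[label] - old[label])
        if extra:
            report["added_properties"][label] = extra
    return report


finance = {"Company": {"name", "ticker"}}
legal = {"Company": {"name", "jurisdiction"}, "Contract": {"title"}}

combined_labels = merge_labels(finance, legal)
print(diff_labels(finance, combined_labels))
```

A diff like this gives a quick view of what a migration from the finance-only schema to the combined one would have to add to an existing graph.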
SDK Package Structure
seocho/
├── index/                 ← Data Plane: putting data IN
│   ├── pipeline.py        ← chunk → extract → validate → rule inference → write
│   ├── linker.py          ← embedding-based entity relatedness
│   └── file_reader.py     ← .txt/.md/.csv/.json/.jsonl/.pdf
├── query/                 ← Control Plane: getting data OUT
│   ├── strategy.py        ← ontology → LLM prompt generation (cached)
│   └── cypher_builder.py  ← deterministic Cypher from intent
├── store/                 ← Storage backends
│   ├── graph.py           ← Neo4j/DozerDB + LadybugDB
│   ├── vector.py          ← FAISS / LanceDB
│   └── llm.py             ← OpenAI, DeepSeek, Kimi, Grok, Qwen
├── rules.py               ← SHACL-like rule inference + validation
├── ontology.py            ← Schema: JSON-LD + SHACL + merge + migration
├── session.py             ← Agent session: context cache + hand-off
├── agents.py              ← IndexingAgent / QueryAgent / Supervisor
├── local_engine.py        ← Local-mode orchestration behind the SDK facade
├── client_remote.py       ← HTTP transport behind the facade
├── client_bundle.py       ← Runtime-bundle glue behind the facade
└── client.py              ← Public SDK facade
Three Ways to Use
Python SDK
from seocho import Seocho, Ontology, NodeDef, P
CLI
seocho init # create ontology interactively
seocho index ./data/ # index files
seocho ask "your question" # query
seocho status # graph stats
Jupyter Notebook
examples/quickstart.ipynb
examples/bring_your_data.ipynb
examples/finance-compliance/quickstart.py
LPG and RDF Support
# LPG (default) → Cypher queries
onto = Ontology(name="finance", graph_model="lpg", ...)
# RDF → n10s Cypher (DozerDB + neosemantics)
onto = Ontology(name="fibo", graph_model="rdf",
namespace="https://spec.edmcouncil.org/fibo/", ...)
Documentation
| Doc | Description |
|---|---|
| seocho.blog | Full documentation site |
| SDK Overview | SDK features and quick start |
| Ontology Guide | Schema design, JSON-LD, SHACL |
| API Reference | Complete method reference |
| docs/USECASES.md | Runnable usecase demos |
| docs/BEGINNER_GUIDE.md | Guided first-run path with architecture snippets |
| docs/ARCHITECTURE.md | System architecture |
| docs/presentations/SEOCHO_OVERVIEW_DEEP_DIVE.md | Beginner-friendly architecture deck |
| docs/FILES_AND_ARTIFACTS.md | Where ontology, rule, trace, and runtime files live |
| docs/BENCHMARKS.md | Private finance corpus and GraphRAG-Bench evaluation tracks |
| docs/WORKFLOW.md | Operational workflow |
| docs/ISSUE_TASK_SYSTEM.md | Sprint/task governance |
| CONTRIBUTING.md | How to contribute |
Observability
Pluggable tracing backends selectable at runtime or via SEOCHO_TRACE_BACKEND:
- none: no tracing; smallest surface
- console: ephemeral stdout for local dev
- jsonl: canonical neutral trace artifact; file-based retention
- opik: optional exporter (hosted or self-hosted); SEOCHO_TRACE_OPIK_MODE=self_host for private infra
Sensitive workloads: prefer none or jsonl. Prompts, retrieval evidence,
and metadata may appear in traces, so route remote exporters through your
governance review. More detail at
docs/FILES_AND_ARTIFACTS.md.
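Backend selection from SEOCHO_TRACE_BACKEND can be sketched as below. This is not the SDK's resolver: the function names, the fallback to none when the variable is unset, and the event fields are assumptions. Only the variable name and the four backend names come from this section.

```python
import json
import os
from typing import Mapping, Optional

ALLOWED_BACKENDS = ("none", "console", "jsonl", "opik")


def resolve_trace_backend(env: Optional[Mapping[str, str]] = None, default: str = "none") -> str:
    """Read SEOCHO_TRACE_BACKEND and reject names outside the documented set."""
    env = os.environ if env is None else env
    name = env.get("SEOCHO_TRACE_BACKEND", default).strip().lower()
    if name not in ALLOWED_BACKENDS:
        raise ValueError(f"unknown trace backend {name!r}; expected one of {ALLOWED_BACKENDS}")
    return name


def to_jsonl_line(event: dict) -> str:
    """Serialize one trace event as a single JSON line (hypothetical event shape)."""
    return json.dumps(event, sort_keys=True, ensure_ascii=False)


backend = resolve_trace_backend({"SEOCHO_TRACE_BACKEND": "jsonl"})
if backend == "jsonl":
    # In a real setup this line would be appended to a trace file.
    print(to_jsonl_line({"step": "ask", "question": "Who did ACME acquire?"}))
```

Failing loudly on an unknown backend name avoids silently sending traces somewhere unintended, which matters for the sensitive workloads mentioned above.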
Server Mode (Platform Operators)
For the full platform with multi-agent debate, web UI, and Docker services:
make setup-env && make up
# UI: http://localhost:8501
# API: http://localhost:8001/docs
# DozerDB: http://localhost:7474
Default make up starts the core local stack: neo4j, extraction-service,
evaluation-interface. The legacy semantic-service is opt-in:
docker compose --profile legacy-semantic up -d semantic-service
Scheduled Codex workflows skip cleanly when OPENAI_API_KEY /
SEOCHO_GITHUB_APP_ID / SEOCHO_GITHUB_APP_PRIVATE_KEY are unset.
Basic CI remains the required repository check surface.
See docs/QUICKSTART.md for the full server setup guide.
Contributing
git clone git@github.com:tteon/seocho.git && cd seocho
pip install -e ".[dev]"
scripts/pm/install-git-hooks.sh
python -m pytest seocho/tests/ -q
Pick a usecase to build around: docs/USECASES.md. Full guide in CONTRIBUTING.md.
License
MIT. See LICENSE.