Ontology-aligned middleware between agents and graph databases

SEOCHO

Ontology-aligned middleware between your agents and your graph database.

You declare the ontology. You call add() and ask(). SEOCHO keeps graph writes, semantic artifacts, and agent behavior aligned to that one schema contract across local SDK and runtime paths.

flowchart LR
    D["📄 Your docs"] --> E["Extraction"]
    O{{"🧬 Ontology<br/>your schema"}} -.governs.-> E
    O -.governs.-> V
    E --> V["Validate<br/>+ readiness gate"]
    V --> G[("Graph<br/>LadybugDB / DozerDB")]
    G --> A["Ontology-grounded<br/>answers"]
    style O fill:#fef3c7,stroke:#f59e0b,stroke-width:2px

SEOCHO is a fit when:

  • you need extraction, Cypher generation, and answers to stay in-schema
  • you want one ontology to drive SDK, runtime, and graph contracts together
  • you need files, artifacts, and traces to stay visible instead of disappearing behind a managed memory black box

Start here:

| If you want to... | Go here |
| --- | --- |
| get a first local success path | Quickstart |
| see a runnable usecase demo | Usecases |
| bring your own ontology and files | Apply Your Data |
| use the Python SDK directly | Python SDK Quickstart |
| declare graph-model-aware indexing in YAML | Indexing Design Specs |
| inspect files, artifacts, and traces | Files and Artifacts |
| understand the system design | Architecture Deep Dive |

Quick Start

uv pip install "seocho[local]"       # zero-config local SDK, embedded LadybugDB by default
# or: uv pip install "seocho[embedded]" # minimal embedded graph path

from seocho import Seocho, Ontology, NodeDef, RelDef, Property

# 1. Define your schema
ontology = Ontology(
    name="my_domain",
    nodes={
        "Person":  NodeDef(properties={"name": Property(str, unique=True)}),
        "Company": NodeDef(properties={"name": Property(str, unique=True)}),
    },
    relationships={
        "WORKS_AT": RelDef(source="Person", target="Company"),
    },
)

# 2. Zero-config local client: uses embedded LadybugDB, no server needed
s = Seocho.local(ontology)

# 3. Index
s.add("Marie Curie worked at the University of Paris.")

# 4. Query
print(s.ask("Where did Marie Curie work?"))

Remote runtime client:

from seocho import Seocho

client = Seocho.remote("http://localhost:8001")
print(client.ask("What do we know about ACME?"))

Run the local platform stack:

make setup-env
make up

Install Paths

| Path | Install | What else you need |
| --- | --- | --- |
| HTTP client mode | pip install seocho | a running SEOCHO runtime (base_url=...) |
| Local SDK engine | pip install "seocho[local]" | provider credentials; Neo4j/DozerDB only if you pass a Bolt URI |
| Repository development | pip install -e ".[dev]" | local clone + test/tooling deps |
| Offline ontology governance | pip install "seocho[ontology]" | local ontology files only |

  • pip install seocho is intentionally thin: enough for HTTP client mode.
  • Seocho.local(ontology) defaults to embedded LadybugDB at .seocho/local.lbug.
  • DozerDB/Neo4j is the production graph path: pass graph="bolt://..." or construct Neo4jGraphStore(...) explicitly.
  • The fastest full local stack is make setup-env && make up.

Why SEOCHO

Built for graph-native teams that need a stronger contract between ontology, runtime, and agent behavior.

  • ontology-first, not prompt-first
  • edit schema.jsonld, not hidden prompts
  • graph-native, not vector-only
  • schemaless property graph plus agent-visible semantic overlay
  • governed artifacts, not ad hoc schema drift
  • user-owned semantic control plane across indexing, query, and runtime

Architecture Overview

One semantic control plane governs two execution planes:

  • Semantic Control Plane (seocho/ontology*, design specs, runtime artifacts): compile user ontology into a reusable semantic package
  • Data Plane (seocho/index/): files → extraction → validation → graph write
  • Query/Agent Plane (seocho/query/, runtime/*): intent → retrieval → tool use → answer synthesis
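
As a rough illustration of the data-plane flow (extraction → validation → graph write), the sketch below rejects out-of-schema facts before anything is written. It does not use SEOCHO's API: the dict-based ontology, the `Triple` shape, and the `validate_triples` helper are all hypothetical names chosen for this example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    source_label: str
    rel_type: str
    target_label: str


# Hypothetical, simplified ontology: relationship type -> (source label, target label)
ONTOLOGY_RELS = {"WORKS_AT": ("Person", "Company")}


def validate_triples(triples, rels=ONTOLOGY_RELS):
    """Split extracted triples into in-schema and out-of-schema lists."""
    accepted, rejected = [], []
    for t in triples:
        expected = rels.get(t.rel_type)
        if expected == (t.source_label, t.target_label):
            accepted.append(t)
        else:
            rejected.append(t)
    return accepted, rejected


extracted = [
    Triple("Person", "WORKS_AT", "Company"),  # matches the ontology
    Triple("Company", "WORKS_AT", "Person"),  # wrong direction
    Triple("Person", "FOUNDED", "Company"),   # unknown relationship type
]
accepted, rejected = validate_triples(extracted)
print(len(accepted), len(rejected))  # prints: 1 2
```

Only the accepted list would go on to a graph write in this picture; the rejected list is the kind of evidence a readiness gate could report.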

The Seocho class is a thin public facade. Canonical engine logic lives under seocho/local_engine.py, seocho/client_remote.py, and seocho/client_bundle.py so the facade stays small. Runtime transport is runtime/agent_server.py; shared runtime composition lives in runtime/server_runtime.py.

For the full story, including the semantic control plane, internal orchestration seams (DomainEvent, IngestionFacade, QueryProxy, AgentFactory, AgentStateMachine), and the staged extraction/ → runtime/ migration, see docs/ARCHITECTURE.md, docs/SEMANTIC_CONTROL_PLANE.md, and docs/RUNTIME_PACKAGE_MIGRATION.md.

Choose Your Runtime Shape

| Mode | Constructor | Best for |
| --- | --- | --- |
| HTTP client | Seocho(base_url="http://localhost:8001", workspace_id="default") | consume an existing runtime over HTTP |
| Embedded local | Seocho.local(ontology) | serverless hello world, SDK authoring, experiments |
| Explicit local engine | Seocho(ontology=..., graph_store=..., llm=...) | direct graph-store control |
| Local platform runtime | make up or seocho serve | UI + API + DozerDB on one machine |

Core parameters you will hit early:

  • base_url: remote SEOCHO runtime root for HTTP client mode
  • workspace_id: logical scope passed through runtime-facing requests
  • graph_store: explicit graph store for local engine mode
  • reasoning_mode + repair_budget: bounded semantic repair loop for hard questions
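
To make the idea of a bounded repair loop concrete, here is a minimal, library-independent sketch. It assumes "repair" means rewriting a query and retrying while results are empty; `ask_with_repair`, `fake_run_query`, and `fake_repair` are hypothetical stand-ins, not SEOCHO functions, and the real loop may use different stopping criteria.

```python
def ask_with_repair(question, run_query, repair_query, repair_budget=2):
    """Run a query; while the result is empty, retry with a repaired query.

    At most `repair_budget` repairs are attempted, so the loop always terminates.
    Returns the final result and the number of repairs used.
    """
    query = question
    attempts = 0
    result = run_query(query)
    while not result and attempts < repair_budget:
        attempts += 1
        query = repair_query(query, attempts)
        result = run_query(query)
    return result, attempts


# Toy stand-ins (hypothetical): the query only succeeds after the second repair.
def fake_run_query(q):
    return ["Beta"] if q.endswith("[repaired x2]") else []


def fake_repair(q, attempt):
    base = q.split(" [repaired")[0]
    return f"{base} [repaired x{attempt}]"


answer, used = ask_with_repair(
    "Who did ACME acquire?", fake_run_query, fake_repair, repair_budget=2
)
print(answer, used)  # prints: ['Beta'] 2
```

With repair_budget=1 the same call gives up after one repair and returns an empty result, which is the point of the budget: cost stays bounded even when a question cannot be answered.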

For production local engine, Neo4jGraphStore works against both Neo4j and DozerDB over Bolt:

from seocho.store import Neo4jGraphStore

store = Neo4jGraphStore("bolt://localhost:7687", "neo4j", "password")

Common Use Cases

1. Consume an existing SEOCHO runtime over HTTP

from seocho import Seocho

client = Seocho(base_url="http://localhost:8001", workspace_id="default")
print(client.ask("What do we know about ACME?"))

2. Build locally against your own ontology with no graph server

from seocho import Seocho, Ontology

client = Seocho.local(Ontology.from_jsonld("schema.jsonld"))
client.add("ACME acquired Beta in 2024.")
print(client.ask("Who did ACME acquire?", reasoning_mode=True, repair_budget=2))

3. Build locally against a production graph server

from seocho import Seocho, Ontology
from seocho.store import Neo4jGraphStore, OpenAIBackend

client = Seocho(
    ontology=Ontology.from_jsonld("schema.jsonld"),
    graph_store=Neo4jGraphStore("bolt://localhost:7687", "neo4j", "password"),
    llm=OpenAIBackend(model="gpt-4o-mini"),
    workspace_id="default",
)
client.add("ACME acquired Beta in 2024.")
print(client.ask("Who did ACME acquire?", reasoning_mode=True, repair_budget=2))

4. Promote the same ontology into runtime artifacts

artifacts = client.approved_artifacts_from_ontology()
prompt_context = client.prompt_context_from_ontology(
    instructions=["Prefer finance ontology labels and relationships."]
)
draft = client.artifact_draft_from_ontology(name="finance_core_v1")

5. Run the local platform stack with UI + API + graph DB

make setup-env
make up

  • UI: http://localhost:8501
  • API docs: http://localhost:8001/docs
  • DozerDB browser: http://localhost:7474

See docs/FILES_AND_ARTIFACTS.md for where schema.jsonld, graph data, rule profiles, semantic artifacts, and traces live.

What the Ontology Controls

| Stage | What happens |
| --- | --- |
| Extraction | Entity types + relationships in LLM prompt |
| Querying | Schema-aware Cypher generation and repair prompts |
| Validation | SHACL shapes derived → catches type/cardinality errors |
| Constraints | UNIQUE/INDEX generated from ontology, applied to Neo4j |
| Denormalization | Cardinality rules determine safe flattening |
| Glossary | SKOS-style vocabulary terms, aliases, and hidden labels compiled into the ontology context identity |
| Reasoning | Optional low-quality retry re-extracts with ontology guidance |
| Runtime parity | Same ontology can be converted into approved semantic artifacts and typed prompt context |
| Agent context | Stable ontology context hash follows indexing, graph writes, query traces, and agent hand-off metadata |
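
The Constraints stage can be pictured with a small sketch that derives uniqueness constraints from a schema. The dict layout and the `unique_constraint_statements` helper are hypothetical and only mirror the Quick Start ontology; how SEOCHO generates constraints internally is not shown here. The emitted statements use the standard Neo4j 5 `CREATE CONSTRAINT ... IF NOT EXISTS FOR ... REQUIRE ... IS UNIQUE` form.

```python
# Hypothetical schema mirror: label -> {property name: is_unique}
SCHEMA = {
    "Person": {"name": True},
    "Company": {"name": True, "founded": False},
}


def unique_constraint_statements(schema):
    """Return one Cypher uniqueness constraint per property flagged unique."""
    statements = []
    for label, props in sorted(schema.items()):
        for prop, is_unique in sorted(props.items()):
            if is_unique:
                name = f"{label.lower()}_{prop}_unique"
                statements.append(
                    f"CREATE CONSTRAINT {name} IF NOT EXISTS "
                    f"FOR (n:{label}) REQUIRE n.{prop} IS UNIQUE"
                )
    return statements


statements = unique_constraint_statements(SCHEMA)
for stmt in statements:
    print(stmt)
```

Because the statements are derived from the schema and use IF NOT EXISTS, re-running them after an ontology edit is idempotent for constraints that already exist.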

Local SDK writes persist compact _ontology_* graph properties on nodes and relationships. Queries and agent tools compare the active ontology context hash with hashes in the graph and surface any mismatch as ontology_context_mismatch in trace/tool metadata, a guardrail that signals when a graph may need re-indexing under a new ontology profile.
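
One way such a hash comparison could work is sketched below. This is an assumption-laden illustration, not SEOCHO's implementation: the canonical-JSON hashing scheme, the 16-character truncation, and the `ontology_context_hash` / `check_context` helpers are hypothetical. Only the `ontology_context_mismatch` key name comes from the description above.

```python
import hashlib
import json


def ontology_context_hash(ontology: dict) -> str:
    """Hash a canonical JSON form so that key order does not change the digest."""
    canonical = json.dumps(ontology, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def check_context(active: dict, stored_hash: str) -> dict:
    """Compare the active ontology's hash with a hash stored alongside graph data."""
    active_hash = ontology_context_hash(active)
    return {
        "active_hash": active_hash,
        "stored_hash": stored_hash,
        "ontology_context_mismatch": active_hash != stored_hash,
    }


v1 = {"nodes": {"Person": ["name"]}, "relationships": {"WORKS_AT": ["Person", "Company"]}}
v1_reordered = {"relationships": {"WORKS_AT": ["Person", "Company"]}, "nodes": {"Person": ["name"]}}
v2 = {"nodes": {"Person": ["name", "title"]}, "relationships": {"WORKS_AT": ["Person", "Company"]}}

stored = ontology_context_hash(v1)            # pretend this was written with the graph data
same = check_context(v1_reordered, stored)    # same schema, different key order
changed = check_context(v2, stored)           # schema gained a property

print(same["ontology_context_mismatch"], changed["ontology_context_mismatch"])  # prints: False True
```

Sorting keys before hashing is the design choice that keeps the hash stable: cosmetic reordering of the schema does not trigger a mismatch, while a real schema change does.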

Key Features

# Index a directory (supports .txt, .md, .csv, .json, .jsonl, .pdf)
s.index_directory("./my_data/")

# Category-aware extraction (8 filing-domain presets)
s.add(text, category="Financials")

# Query with reasoning mode
s.ask("question", reasoning_mode=True, repair_budget=2)

# Swappable LLM providers (OpenAI, DeepSeek, Kimi, Grok, Qwen)
from seocho.store import OpenAIBackend, DeepSeekBackend
llm = OpenAIBackend(model="gpt-4o-mini")

# Agent session: context persists across add/ask within one session
with s.session("my_analysis") as sess:
    sess.add("ACME acquired Beta in 2024.")
    sess.add("Beta provides risk analytics to ACME.")
    answer = sess.ask("What does ACME own or use?")

# Schema as code (JSON-LD canonical storage + SHACL export)
ontology.to_jsonld("schema.jsonld")
ontology = Ontology.from_jsonld("schema.jsonld")

# Ontology merge + diff (for migration)
combined = finance_onto.merge(legal_onto)

For the rest (experiment workbench, tracing backends, supervisor + hand-off config, offline governance CLI, multi-ontology per database), see seocho.blog/sdk.

SDK Package Structure

seocho/
├── index/              ← Data Plane: putting data IN
│   ├── pipeline.py     ← chunk → extract → validate → rule inference → write
│   ├── linker.py       ← embedding-based entity relatedness
│   └── file_reader.py  ← .txt/.md/.csv/.json/.jsonl/.pdf
├── query/              ← Control Plane: getting data OUT
│   ├── strategy.py     ← ontology → LLM prompt generation (cached)
│   └── cypher_builder.py ← deterministic Cypher from intent
├── store/              ← Storage backends
│   ├── graph.py        ← Neo4j/DozerDB + LadybugDB
│   ├── vector.py       ← FAISS / LanceDB
│   └── llm.py          ← OpenAI, DeepSeek, Kimi, Grok, Qwen
├── rules.py            ← SHACL-like rule inference + validation
├── ontology.py         ← Schema: JSON-LD + SHACL + merge + migration
├── session.py          ← Agent session: context cache + hand-off
├── agents.py           ← IndexingAgent / QueryAgent / Supervisor
├── local_engine.py     ← Local-mode orchestration behind the SDK facade
├── client_remote.py    ← HTTP transport behind the facade
├── client_bundle.py    ← Runtime-bundle glue behind the facade
└── client.py           ← Public SDK facade

Three Ways to Use

Python SDK

from seocho import Seocho, Ontology, NodeDef, P

CLI

seocho init                    # create ontology interactively
seocho index ./data/           # index files
seocho ask "your question"     # query
seocho status                  # graph stats

Jupyter Notebook

examples/quickstart.ipynb
examples/bring_your_data.ipynb
examples/finance-compliance/quickstart.py

LPG and RDF Support

# LPG (default): Cypher queries
onto = Ontology(name="finance", graph_model="lpg", ...)

# RDF: n10s Cypher (DozerDB + neosemantics)
onto = Ontology(name="fibo", graph_model="rdf",
                namespace="https://spec.edmcouncil.org/fibo/", ...)

Documentation

| Doc | Description |
| --- | --- |
| seocho.blog | Full documentation site |
| SDK Overview | SDK features and quick start |
| Ontology Guide | Schema design, JSON-LD, SHACL |
| API Reference | Complete method reference |
| docs/USECASES.md | Runnable usecase demos |
| docs/ARCHITECTURE.md | System architecture |
| docs/FILES_AND_ARTIFACTS.md | Where ontology, rule, trace, and runtime files live |
| docs/BENCHMARKS.md | Private finance corpus and GraphRAG-Bench evaluation tracks |
| docs/WORKFLOW.md | Operational workflow |
| docs/ISSUE_TASK_SYSTEM.md | Sprint/task governance |
| CONTRIBUTING.md | How to contribute |

Observability

Pluggable tracing backends selectable at runtime or via SEOCHO_TRACE_BACKEND:

  • none: no tracing; smallest surface
  • console: ephemeral stdout for local dev
  • jsonl: canonical neutral trace artifact; file-based retention
  • opik: optional exporter (hosted or self-hosted); SEOCHO_TRACE_OPIK_MODE=self_host for private infra

Sensitive workloads: prefer none or jsonl. Prompts, retrieval evidence, and metadata may appear in traces, so route remote exporters through your governance review. More detail at docs/FILES_AND_ARTIFACTS.md.
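
A sketch of how a backend might be resolved from an explicit setting or the SEOCHO_TRACE_BACKEND variable follows. The `resolve_trace_backend` helper is hypothetical, and both the precedence (explicit argument first) and the fallback to none are assumptions made for this example rather than documented SEOCHO behavior; only the four backend names and the variable name come from the list above.

```python
import os

VALID_BACKENDS = ("none", "console", "jsonl", "opik")


def resolve_trace_backend(explicit=None, env=None):
    """Pick a backend: explicit argument, then SEOCHO_TRACE_BACKEND, else 'none'.

    The precedence and the 'none' default are assumptions for illustration.
    """
    env = os.environ if env is None else env
    choice = (explicit or env.get("SEOCHO_TRACE_BACKEND") or "none").strip().lower()
    if choice not in VALID_BACKENDS:
        raise ValueError(
            f"unknown trace backend {choice!r}; expected one of {VALID_BACKENDS}"
        )
    return choice


# Passing a plain dict as `env` keeps the example independent of the real environment.
print(resolve_trace_backend(env={}))                                  # prints: none
print(resolve_trace_backend(env={"SEOCHO_TRACE_BACKEND": "JSONL"}))   # prints: jsonl
print(resolve_trace_backend("console", env={"SEOCHO_TRACE_BACKEND": "opik"}))  # prints: console
```

Failing loudly on an unknown name is deliberate in this sketch: for sensitive workloads, a typo should not silently fall through to a remote exporter.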

Server Mode (Platform Operators)

For the full platform with multi-agent debate, web UI, and Docker services:

make setup-env && make up
# UI: http://localhost:8501
# API: http://localhost:8001/docs
# DozerDB: http://localhost:7474

Default make up starts the core local stack: neo4j, extraction-service, evaluation-interface. The legacy semantic-service is opt-in:

docker compose --profile legacy-semantic up -d semantic-service

Scheduled Codex workflows skip cleanly when OPENAI_API_KEY / SEOCHO_GITHUB_APP_ID / SEOCHO_GITHUB_APP_PRIVATE_KEY are unset. Basic CI remains the required repository check surface.

See docs/QUICKSTART.md for the full server setup guide.

Contributing

git clone git@github.com:tteon/seocho.git && cd seocho
pip install -e ".[dev]"
scripts/pm/install-git-hooks.sh
python -m pytest seocho/tests/ -q

Pick a usecase to build around: docs/USECASES.md. Full guide in CONTRIBUTING.md.

License

MIT. See LICENSE.
