Developer-friendly Python SDK for the SEOCHO graph-memory runtime
# SEOCHO

Ontology-driven knowledge graph library for Python.

Define your schema once — it drives extraction, querying, validation, and graph-governance artifacts from one contract.
## Install

```bash
pip install seocho
```

Optional offline ontology governance tooling:

```bash
pip install "seocho[ontology]"
```
## Quick Start

```python
from seocho import Seocho, Ontology, NodeDef, RelDef, P
from seocho.store import Neo4jGraphStore, OpenAIBackend

# 1. Define your schema
ontology = Ontology(
    name="my_domain",
    package_id="org.example.my_domain",
    nodes={
        "Person": NodeDef(properties={"name": P(str, unique=True)}),
        "Company": NodeDef(properties={"name": P(str, unique=True)}),
    },
    relationships={
        "WORKS_AT": RelDef(source="Person", target="Company"),
    },
)

# 2. Connect
s = Seocho(
    ontology=ontology,
    graph_store=Neo4jGraphStore("bolt://localhost:7687", "neo4j", "password"),
    llm=OpenAIBackend(model="gpt-4o"),
)

# 3. Index
s.add("Marie Curie worked at the University of Paris.")

# 4. Query
print(s.ask("Where did Marie Curie work?"))
```
## What the Ontology Controls

| Stage | What happens |
|---|---|
| Extraction | Entity types and relationships are injected into the LLM prompt |
| Querying | Schema-aware Cypher generation and repair prompts |
| Validation | SHACL shapes derived from the ontology catch type and cardinality errors |
| Constraints | UNIQUE/INDEX statements generated from the ontology, applicable to Neo4j |
| Denormalization | Cardinality rules determine safe flattening |
| Reasoning | Optional retry on low-quality extractions, re-extracting with ontology guidance |
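As an illustration of the Constraints stage, a `P(str, unique=True)` property implies a standard Neo4j uniqueness constraint. The sketch below is hypothetical (the constraint names seocho actually generates are an assumption here), but the Cypher shape is standard Neo4j syntax:

```python
# Hypothetical illustration (not the seocho API): the Neo4j constraint
# that a `P(str, unique=True)` property on a node type implies.
def unique_constraint(label: str, prop: str) -> str:
    return (
        f"CREATE CONSTRAINT {label.lower()}_{prop}_unique IF NOT EXISTS "
        f"FOR (n:{label}) REQUIRE n.{prop} IS UNIQUE"
    )

print(unique_constraint("Person", "name"))
# → CREATE CONSTRAINT person_name_unique IF NOT EXISTS FOR (n:Person) REQUIRE n.name IS UNIQUE
```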
## Key Features

```python
# Index files from a directory
s.index_directory("./my_data/")  # .txt, .md, .csv, .json, .jsonl, .pdf

# Category-specific extraction (auto-selects prompt)
s.add(text, category="Financials")  # 8 FinDER domain presets

# Query with reasoning mode
s.ask("question", reasoning_mode=True, repair_budget=2)

# Multiple LLM providers
from seocho.store import OpenAIBackend
llm = OpenAIBackend(model="gpt-4o-mini")  # OpenAI
llm = OpenAIBackend(model="deepseek-chat", base_url="https://api.deepseek.com/v1")  # DeepSeek

# Multi-ontology per database
s.register_ontology("finance_db", finance_ontology)

# Schema as code (JSON-LD canonical storage)
ontology.to_jsonld("schema.jsonld")
ontology = Ontology.from_jsonld("schema.jsonld")

# Apply generated Neo4j constraints explicitly in local mode
s.ensure_constraints(database="neo4j")

# Offline ontology governance helpers:
#   seocho ontology check --schema schema.jsonld
#   seocho ontology export --schema schema.jsonld --format shacl --output shacl.json
#   seocho ontology diff --left schema_v1.jsonld --right schema_v2.jsonld
# diff output now includes package_id, recommended version bump, and migration warnings

# Experiment workbench
from seocho.experiment import Workbench
wb = Workbench(input_texts=["text..."])
wb.vary("ontology", ["v1.jsonld", "v2.jsonld"])
wb.vary("model", ["gpt-4o", "gpt-4o-mini"])
results = wb.run_all()
print(results.leaderboard())

# Pluggable tracing
from seocho import enable_tracing, configure_tracing_from_env
enable_tracing(backend="none")     # disable tracing explicitly
enable_tracing(backend="console")  # stdout only
enable_tracing(backend="jsonl")    # canonical neutral trace artifact
enable_tracing(backend="opik")     # optional exporter (hosted or self-hosted)
configure_tracing_from_env()       # SEOCHO_TRACE_BACKEND=none|console|jsonl|opik

# Agent design configuration
from seocho import AgentConfig, AGENT_PRESETS
s = Seocho(ontology=onto, ..., agent_config=AGENT_PRESETS["strict"])

# Agent-level session (context persists across operations)
with s.session("my_analysis") as sess:
    sess.add("Samsung CEO Jay Y. Lee reported $234B revenue.")
    sess.add("Apple CEO Tim Cook reported $383B revenue.")
    answer = sess.ask("Compare Samsung and Apple revenue")
    # → structured entity context passed to QueryAgent

# Supervisor with sub-agent hand-off (explicit opt-in)
from seocho import RoutingPolicy
s = Seocho(ontology=onto, ..., agent_config=AgentConfig(
    execution_mode="supervisor", handoff=True,
    routing_policy=RoutingPolicy(latency=0.1, token_efficiency=0.3, information_quality=0.6),
))
with s.session("auto") as sess:
    sess.run("Samsung CEO is Jay Y. Lee")  # → IndexingAgent
    sess.run("Who is Samsung's CEO?")      # → QueryAgent

# Ontology merge (combine two schemas)
finance = Ontology.from_jsonld("finance.jsonld")
legal = Ontology.from_jsonld("legal.jsonld")
combined = finance.merge(legal)  # union of nodes + relationships
combined.to_jsonld("combined.jsonld")
```
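One plausible way to read the `RoutingPolicy(latency=..., token_efficiency=..., information_quality=...)` weights above is as a weighted score over candidate agents. This sketch is an assumption about the semantics, not seocho's actual routing implementation:

```python
# Hypothetical sketch: how supervisor routing weights *might* combine
# normalized per-agent metrics (not seocho's internal scoring code).
weights = {"latency": 0.1, "token_efficiency": 0.3, "information_quality": 0.6}

def route_score(metrics: dict) -> float:
    """Weighted sum of per-agent metrics, each assumed normalized to [0, 1]."""
    return sum(weights[k] * metrics[k] for k in weights)

# Illustrative candidate scores for a question-answering request
query_agent = {"latency": 0.7, "token_efficiency": 0.5, "information_quality": 0.9}
index_agent = {"latency": 0.9, "token_efficiency": 0.8, "information_quality": 0.2}

best = max([("QueryAgent", query_agent), ("IndexingAgent", index_agent)],
           key=lambda pair: route_score(pair[1]))
# With information_quality weighted highest, QueryAgent wins this request.
```

With these weights, a high `information_quality` score dominates, which matches the intent of routing analytical questions to the QueryAgent.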
## SDK Package Structure

```
seocho/
├── index/              ← Data Plane: putting data IN
│   ├── pipeline.py         ← chunk → extract → validate → write
│   └── file_reader.py      ← .txt/.md/.csv/.json/.jsonl/.pdf
├── query/              ← Control Plane: getting data OUT
│   ├── strategy.py         ← ontology → LLM prompt generation
│   └── cypher_builder.py   ← deterministic Cypher from intent
├── store/              ← Storage backends
│   ├── graph.py            ← Neo4j/DozerDB
│   ├── vector.py           ← FAISS / LanceDB
│   └── llm.py              ← OpenAI, DeepSeek, Kimi, Grok
├── ontology.py         ← Schema: JSON-LD + SHACL + denormalization + merge
├── session.py          ← Agent session: context cache + hand-off
├── agents.py           ← IndexingAgent / QueryAgent / Supervisor
├── tools.py            ← @function_tool definitions for agents
├── agent_config.py     ← AgentConfig, RoutingPolicy, presets
├── experiment.py       ← Workbench for parameter exploration
├── tracing.py          ← Pluggable observability
└── client.py           ← Seocho unified interface
```
## Three Ways to Use

**Python SDK** (developers):

```python
from seocho import Seocho, Ontology, NodeDef, P
```

**CLI** (no code needed):

```bash
seocho init                    # create ontology interactively
seocho index ./data/           # index files
seocho ask "your question"     # query
seocho status                  # graph stats
seocho experiment --input ...  # parameter exploration
```

**Jupyter Notebook** (data analysts):

- examples/quickstart.ipynb
- examples/bring_your_data.ipynb
## LPG and RDF Support

```python
# LPG mode (default) — Cypher queries
onto = Ontology(name="finance", graph_model="lpg", ...)

# RDF mode — n10s Cypher (DozerDB + neosemantics)
onto = Ontology(name="fibo", graph_model="rdf",
                namespace="https://spec.edmcouncil.org/fibo/", ...)
```
## Documentation
| Doc | Description |
|---|---|
| seocho.blog | Full documentation site |
| SDK Overview | SDK features and quick start |
| Ontology Guide | Schema design, JSON-LD, SHACL |
| API Reference | Complete method reference |
| Examples | Real-world patterns |
| CONTRIBUTING.md | How to contribute |
| docs/ARCHITECTURE.md | System architecture |
| docs/WORKFLOW.md | Operational workflow |
| docs/ISSUE_TASK_SYSTEM.md | Sprint/task governance |
## Observability Modes

- `none`: no tracing; smallest surface and lowest data-retention risk.
- `console`: ephemeral stdout debugging for local development.
- `jsonl`: canonical neutral trace artifact for local files, replay, and vendor-neutral retention.
- `opik`: optional exporter/backend for hosted or self-hosted team observability.

Recommended defaults:

- Sensitive data or simple local usage: `none` or `jsonl`
- Team debugging and evaluation: `jsonl` + `opik`
- Private infra: self-hosted Opik with `SEOCHO_TRACE_OPIK_MODE=self_host`

Retention and privacy guidance:

- JSONL retention follows your filesystem policy; rotate or delete trace files explicitly.
- Opik retention follows the target Opik deployment policy, whether hosted or self-hosted.
- Prompts, retrieval evidence, and metadata may appear in traces; avoid remote exporters for sensitive workloads unless governance is approved.
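The `SEOCHO_TRACE_BACKEND` contract used by `configure_tracing_from_env()` can be pictured like this. The helper below is an illustrative sketch of env-driven selection, not seocho's internal code:

```python
import os

# Hypothetical sketch mirroring the SEOCHO_TRACE_BACKEND=none|console|jsonl|opik
# contract (not seocho's actual implementation).
VALID_BACKENDS = {"none", "console", "jsonl", "opik"}

def pick_backend(default: str = "none") -> str:
    """Read the trace backend from the environment, rejecting unknown values."""
    backend = os.environ.get("SEOCHO_TRACE_BACKEND", default)
    if backend not in VALID_BACKENDS:
        raise ValueError(f"unknown trace backend: {backend!r}")
    return backend
```

Defaulting to `none` keeps the lowest data-retention risk unless tracing is opted into explicitly.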
## Server Mode (Platform Operators)

For the full platform with multi-agent debate, web UI, and Docker services:

```bash
make setup-env && make up
# UI:      http://localhost:8501
# API:     http://localhost:8001/docs
# DozerDB: http://localhost:7474
```

See docs/QUICKSTART.md for the full server setup guide.
## Contributing

```bash
git clone git@github.com:tteon/seocho.git && cd seocho
pip install -e ".[dev]"
python -m pytest seocho/tests/ -q
```

See CONTRIBUTING.md for the full guide.
## License

MIT — see LICENSE.
## File Details

### seocho-0.3.0.tar.gz

- Download URL: seocho-0.3.0.tar.gz
- Upload date:
- Size: 160.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.12

| Algorithm | Hash digest |
|---|---|
| SHA256 | a86156b520d8cc1b4624f1a64df7e56f383fd3bdab1cb929d1b3d60c32afefd5 |
| MD5 | 160a7a4136658d5b7c7cb07ad122e01d |
| BLAKE2b-256 | f7c24c96e4c9e738703b922f9e4e5d8a5b8f1e0fac505e8a777ec57bd237159c |

### seocho-0.3.0-py3-none-any.whl

- Download URL: seocho-0.3.0-py3-none-any.whl
- Upload date:
- Size: 175.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.12

| Algorithm | Hash digest |
|---|---|
| SHA256 | 89c9d44ba78b8919997a702571d8addf450e20310d9e1257455bbb2e7b7d3ce5 |
| MD5 | b4be31fb6c9a184d7ef23edf3c87eeff |
| BLAKE2b-256 | 46f9640f14038901b498321a6115841c04e9c4da5b92baf39d29eb94a50be748 |