SozoGraph v1: transcript/db object -> portable cognitive passport JSON
SozoGraph (v1) — The Cognitive Passport
SozoGraph turns interaction history (transcripts + DB objects) into a portable cognitive snapshot you can pass into any AI agent context on the fly.
It answers one question cleanly:
"Given everything that has happened so far, what should an agent currently believe about this user?"
Not:
- what was said
- what is similar
- what might be relevant
But:
- what is true now
- what is stable
- what is unresolved
- what is contradictory (resolved by time)
Why this exists (the problem)
Most "memory" systems are either:
- prompt stuffing (expensive, degrades reasoning, no forgetting)
- vector RAG (good recall, weak truth/temporal consistency)
- app-specific notes (non-portable, brittle schemas)
So agents keep acting like "goldfish" even when data exists.
SozoGraph v1 is a truth-layer memory object:
- typed (facts vs preferences vs entities vs open loops)
- temporal (new updates override old; contradictions are explicit)
- portable (a lightweight JSON passport + a compact context string)
Install
pip install sozograph
Configure
Create a .env file (see .env.example):
GEMINI_API_KEY=your_key_here
SOZOGRAPH_EXTRACTOR_MODEL=gemini-3-flash
SOZOGRAPH_ENABLE_FALLBACK_SUMMARIZER=true
SOZOGRAPH_MAX_INTERACTION_CHARS=4000
SOZOGRAPH_DEFAULT_CONTEXT_BUDGET=3000
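For readers who want to see how these variables might be consumed, here is a minimal sketch using only the standard library. The `Settings` class, the `load_settings` helper, and the default values (copied from the example above) are illustrative assumptions, not SozoGraph's actual configuration loader.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    # Hypothetical container; field names mirror the variables above.
    api_key: str
    extractor_model: str
    enable_fallback_summarizer: bool
    max_interaction_chars: int
    default_context_budget: int


def load_settings(env=None) -> Settings:
    """Read the settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    fallback = env.get("SOZOGRAPH_ENABLE_FALLBACK_SUMMARIZER", "true")
    return Settings(
        api_key=env.get("GEMINI_API_KEY", ""),
        extractor_model=env.get("SOZOGRAPH_EXTRACTOR_MODEL", "gemini-3-flash"),
        # Assumption: "1", "true" and "yes" (any case) all mean enabled.
        enable_fallback_summarizer=fallback.strip().lower() in {"1", "true", "yes"},
        max_interaction_chars=int(env.get("SOZOGRAPH_MAX_INTERACTION_CHARS", "4000")),
        default_context_budget=int(env.get("SOZOGRAPH_DEFAULT_CONTEXT_BUDGET", "3000")),
    )
```

Passing a plain dict instead of `os.environ` keeps the helper easy to test without touching the real environment.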
Quickstart
1) Single transcript → Passport
from sozograph import SozoGraph

sg = SozoGraph()
passport, stats = sg.ingest(
    "I'm Quantilytix. I build software and want direct answers. I'm working on SozoGraph v1.",
    meta={"user_key": "u_123", "source": "transcript:demo-1"}
)
print(passport.to_compact_dict())
print(stats)  # per-interaction merge stats
2) List of transcripts / message history (supported ✅)
history = [
    {"createdAt": "2026-02-01T10:00:00Z", "project_title": "SozoFix", "transcript": "I'm renovating my kitchen."},
    {"createdAt": "2026-02-02T09:30:00Z", "project_title": "SozoFix", "transcript": "I prefer rustic style and hate glossy paint."},
    {"createdAt": "2026-02-03T12:10:00Z", "project_title": "SozoGraph", "transcript": "We need portable memory JSON. No infra. Truth-layer."},
]

# You can ingest a list directly. SozoGraph will coerce items internally.
passport, _ = sg.ingest(history, hint="firestore")  # hint optional; see below
Tip: List items do not have to be database “docs”. You can pass plain dicts and let the optional fallback summarizer help when the derived text is weak. If a dict contains a transcript field, extraction still succeeds because the item is stringified deterministically.
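To make "stringify deterministically" concrete, the sketch below shows one way a history item could be turned into stable text and a list ordered by time. `stringify_item` and `order_history` are hypothetical helpers written for illustration; they are not part of the sozograph API, and the real coercion rules may differ.

```python
import json


def stringify_item(item: dict) -> str:
    """Turn one history item into text, deterministically.

    Assumption: a non-empty "transcript" field is used as the text;
    anything else is serialized with sorted keys so that equal dicts
    always produce the same string.
    """
    transcript = item.get("transcript")
    if isinstance(transcript, str) and transcript.strip():
        return transcript.strip()
    return json.dumps(item, sort_keys=True, ensure_ascii=False, default=str)


def order_history(history: list[dict]) -> list[dict]:
    # ISO-8601 UTC strings in one consistent format sort correctly as text.
    return sorted(history, key=lambda item: item.get("createdAt", ""))
```

Sorting keys before serializing is what makes the output independent of dict insertion order, so the same object always yields the same extractor input.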
3) Firestore object ingestion (objects-only)
You fetch your Firestore data in your app, then pass the dict here:
firestore_doc = {
    "id": "abc123",
    "createdAt": "2026-02-03T10:00:00Z",
    "title": "User Profile Update",
    "notes": "User says they prefer direct answers.",
    "companyCode": "QX",
}

passport, _ = sg.ingest(
    firestore_doc,
    hint="firestore",
    meta={"source": "firestore:/users/abc123", "user_key": "u_abc123"}
)
4) Firebase Realtime DB ingestion (path + value)
RTDB is tree-based, so pass an envelope:
rtdb_snapshot = {
    "path": "/users/u1/profile",
    "value": {
        "updatedAt": 1738560000000,
        "displayName": "Quantilytix",
        "preferences": {"tone": "direct"}
    }
}

passport, _ = sg.ingest(rtdb_snapshot, hint="rtdb", meta={"user_key": "u1"})
5) Supabase ingestion (table + row)
supabase_row = {
    "table": "events",
    "row": {
        "id": 77,
        "created_at": "2026-02-03T11:22:00Z",
        "event": "user_preference_update",
        "notes": "User wants strategy alignment before code."
    }
}

passport, _ = sg.ingest(supabase_row, hint="supabase", meta={"user_key": "u1"})
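The examples above carry timestamps in two shapes: ISO-8601 strings (createdAt, created_at) and epoch milliseconds (updatedAt). Temporal ordering only works once these are comparable, so here is a small sketch of one way to normalize both. `to_utc` is a hypothetical helper, not a sozograph function, and treating naive timestamps as UTC is an assumption.

```python
from datetime import datetime, timezone


def to_utc(value) -> datetime:
    """Normalize an ISO-8601 string or an epoch-milliseconds number to aware UTC."""
    if isinstance(value, (int, float)):
        return datetime.fromtimestamp(value / 1000, tz=timezone.utc)
    # Older Python versions do not parse a trailing "Z", so rewrite it first.
    parsed = datetime.fromisoformat(str(value).replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return parsed.astimezone(timezone.utc)


print(to_utc("2026-02-03T11:22:00Z").isoformat())
print(to_utc(1738560000000).isoformat())
```

Once every interaction has an aware UTC datetime, "latest wins" becomes a plain comparison regardless of which backend the object came from.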
SozoGraph Test Fixtures
These fixtures are intentionally small and human-readable.
They are designed to test:
- transcript ingestion
- Firestore document ingestion
- Firebase Realtime Database snapshots
- Supabase row ingestion
They are NOT meant to simulate production-scale data. If a fixture grows beyond what a human would comfortably read, it probably violates the SozoGraph v1 philosophy.
Export a compact agent “briefing” (context injection)
You can inject this into any agent prompt:
briefing = sg.export_context(passport, budget_chars=2500)
print(briefing)
Example output format:
SOZOGRAPH PASSPORT v1
User: u1
Updated: 2026-02-03T12:34:56+00:00
Facts (current beliefs):
- role: software development
- current_project: sozograph v1
...
Preferences:
- tone: direct
...
Open loops:
- finalize v1 repo + publish pip package
...
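To illustrate how a character budget can shape a briefing like the one above, here is a self-contained sketch that renders a plain dict and stops adding lines once the budget is reached. The `render_briefing` function and the dict keys (`user_key`, `updated_at`, `facts`, `preferences`, `open_loops`) are assumptions for illustration; the real `export_context` may prioritize or truncate differently.

```python
def render_briefing(passport: dict, budget_chars: int = 3000) -> str:
    """Render a plain-text briefing, dropping trailing lines that exceed the budget."""
    lines = [
        "SOZOGRAPH PASSPORT v1",
        f"User: {passport.get('user_key', 'unknown')}",
        f"Updated: {passport.get('updated_at', 'unknown')}",
    ]
    sections = [
        ("Facts (current beliefs):", passport.get("facts", {})),
        ("Preferences:", passport.get("preferences", {})),
    ]
    for title, mapping in sections:
        if mapping:
            lines.append(title)
            lines.extend(f"- {key}: {value}" for key, value in mapping.items())
    if passport.get("open_loops"):
        lines.append("Open loops:")
        lines.extend(f"- {loop}" for loop in passport["open_loops"])

    kept, used = [], 0
    for line in lines:
        cost = len(line) + 1  # +1 accounts for the newline
        if used + cost > budget_chars:
            break
        kept.append(line)
        used += cost
    return "\n".join(kept)
```

Because sections are emitted in a fixed order, a tight budget trims open loops first and facts last, which matches the idea that stable beliefs matter most.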
How SozoGraph v1 works
Ingestion pipeline (v1)
- Coerce input into canonical Interaction objects (deterministic)
- If the derived text is weak/noisy, call Gemini fallback summarizer (optional)
- Use Gemini extractor (strict JSON) to propose memory updates
- Use deterministic resolver to merge:
  - temporal priority (latest wins)
  - explicit contradictions record changes
  - de-dupe entities + aliases
  - keep open loops short and recent
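The resolver rules above (latest wins, contradictions recorded explicitly) can be sketched in a few lines. This is a simplified illustration under stated assumptions: `Belief`, `Memory`, and `merge_fact` are hypothetical names, timestamps are ISO-8601 UTC strings in one format, and stale updates are silently ignored. It is not the library's actual resolver.

```python
from dataclasses import dataclass, field


@dataclass
class Belief:
    value: str
    observed_at: str  # ISO-8601 UTC string, comparable as text
    source: str


@dataclass
class Memory:
    facts: dict = field(default_factory=dict)           # key -> Belief
    contradictions: list = field(default_factory=list)  # explicit change log


def merge_fact(memory: Memory, key: str, update: Belief) -> None:
    """Latest-wins merge that records a contradiction when the value changes."""
    current = memory.facts.get(key)
    if current is None:
        memory.facts[key] = update
        return
    if update.observed_at < current.observed_at:
        return  # stale update: keep the newer belief
    if update.value != current.value:
        memory.contradictions.append(
            {"key": key, "old": current.value, "new": update.value,
             "changed_at": update.observed_at}
        )
    memory.facts[key] = update


memory = Memory()
merge_fact(memory, "tone", Belief("formal", "2026-02-01T10:00:00Z", "transcript:demo-1"))
merge_fact(memory, "tone", Belief("direct", "2026-02-03T10:00:00Z", "firestore:/users/abc123"))
merge_fact(memory, "tone", Belief("verbose", "2026-01-15T08:00:00Z", "transcript:old"))
print(memory.facts["tone"].value)
print(memory.contradictions)
```

Because the merge is a pure function of the inputs and their timestamps, replaying the same interactions in any order produces the same current belief, which is the property a deterministic resolver is after.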
What SozoGraph v1 is NOT
- Not a graph database
- Not RAG
- Not embeddings
- Not a long transcript store
- Not a tool that fetches from DB (objects-only by design)
Roadmap (upcoming features)
v1.x (near-term)
- Better input detection for common “transcript list” shapes (e.g. {transcript, createdAt})
- CLI:
  sozograph ingest transcript.txt --out passport.json
  sozograph render passport.json --budget 3000
- Stronger JSON recovery if a model response is slightly malformed
- More deterministic evidence linking (source-id mapping improvements)
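As a rough idea of what JSON recovery for a slightly malformed model response can involve, the sketch below strips Markdown fences, slices out the outermost object, and removes trailing commas before parsing. `recover_json` is a hypothetical helper showing one common approach, not the planned implementation. The trailing-comma regex is naive and could also alter a string value that happens to contain a comma followed by a closing bracket.

```python
import json
import re


def recover_json(text: str) -> dict:
    """Best-effort parse of a model reply that should contain one JSON object."""
    fence = "`" * 3  # Markdown code fence, built here to avoid a literal one
    cleaned = text.replace(fence + "json", "").replace(fence, "")
    start, end = cleaned.find("{"), cleaned.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found")
    candidate = cleaned[start:end + 1]
    # Drop trailing commas that appear right before a closing brace or bracket.
    candidate = re.sub(r",\s*([}\]])", r"\1", candidate)
    return json.loads(candidate)
```

Anything this helper cannot repair still raises (`ValueError` or `json.JSONDecodeError`), so a caller can decide whether to retry the extraction or skip the interaction.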
v1.5 (planned, optional)
- Graph engine support (Neo4j Aura / Memgraph) via Bolt
- Cypher-style relational queries over memory
- Temporal deprecation on edges
- Export “active truth subgraph” to context
v2 (optional)
- Foundational model adapters (non-Gemini backends)
- MCP tool server integration
- Hybrid patterns (graph + vector) only where needed
Contributing
We want contributions, but keep v1 disciplined.
Good contributions
- Adapters for additional object shapes (still objects-only)
- Resolver improvements (deterministic)
- Tests for merge/contradiction edge-cases
- Prompt tuning for more stable key extraction
What won’t be accepted in v1
- Adding DB client dependencies (firebase-admin, supabase clients, etc.)
- Building RAG/embeddings into core
- Turning v1 into a graph project
How to contribute
- Fork the repo
- Create a branch: feat/<short-name>
- Add tests where relevant
- Open a PR with a short explanation and sample input/output
License
MIT — Sozo Analytics Lab