Heterogeneous Knowledge Graph engine for structured LLM reasoning
CAS — Cognitive Agent Substrate
A structured knowledge representation and retrieval system built around a Heterogeneous Knowledge Graph (HKG). CAS organizes information across layered node types (L1–L5), enables graph-traversal-based reasoning, and compresses context into topology chains for efficient LLM consumption.
Architecture
```
cas/
├── core.py               # HKG storage engine (SQLite + in-memory cache)
├── embeddings.py         # Embedding engine (MiniLM local / OpenAI API)
├── l4_knowledge.py       # L4: graph traversal, path convergence, topology chains
├── l5_personalization.py # L5: user profile modeling and personalized routing
└── synthetic_data.py     # Synthetic graph and data generators for experiments
```
Node Types
| Layer | Type | Description |
|---|---|---|
| L1 | SEMANTIC | Raw semantic chunks, embedded and clustered |
| L2 | FACT | Extracted facts and structured propositions |
| L3 | TRACE | Causal event traces and audit logs |
| L4 | MACRO | Generalized macro-nodes via cluster inheritance |
Edge Types
| Type | Discount | Description |
|---|---|---|
| SEMANTIC | 0.7 | Similarity-based links between chunks |
| CAUSAL | 1.0 | Directed causal relationships |
| MACRO | 0.9 | Cluster-level generalization edges |
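The per-edge discounts can be read as multiplicative attenuation along a traversal path. The following is a minimal, self-contained sketch of that idea, not the library's actual scoring code; the function name and the flat score model are illustrative assumptions:

```python
from functools import reduce

# Edge-type discounts from the table above.
EDGE_DISCOUNT = {"SEMANTIC": 0.7, "CAUSAL": 1.0, "MACRO": 0.9}

def path_score(edge_types: list[str], base: float = 1.0) -> float:
    """Attenuate a base relevance score by the discount of each edge crossed."""
    return reduce(lambda score, et: score * EDGE_DISCOUNT[et], edge_types, base)

# A path crossing one causal edge and two semantic edges:
print(round(path_score(["CAUSAL", "SEMANTIC", "SEMANTIC"]), 2))  # 0.49
```

Under this reading, causal hops preserve relevance fully (discount 1.0) while each semantic hop weakens it, so short causal chains outrank long similarity chains.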
Experiments
Eight experiments validate the core claims of the CAS framework. Results and figures are included.
| # | Name | Key Result |
|---|---|---|
| E1 | Epistemic Accuracy | Pearson r = 0.71 (path convergence ↔ answer quality); Top-1 accuracy 80% |
| E2 | Generalization Quality | Macro-node inheritance: precision 0.84 / recall 0.68 at threshold 0.7 |
| E3 | Compression Efficiency | Topology chains reduce token count by 72.2% vs raw context |
| E4 | Traversal Scalability | Median latency 1.26 ms at 25K nodes / 250K edges |
| E5 | Personalization Cost | Routing savings via L5 user profiles |
| E6 | Causal Reasoning | Causal blast-radius propagation and containment |
| E7 | End-to-End Quality | L4 topology chains: judge score 4.52 vs baseline-RAG 4.36, with 22% fewer tokens |
| E8 | Personalization Quality | Cross-domain user adaptation metrics |
Reproduce figures

```shell
python experiments/run_all.py
```

Each experiment can also be run individually:

```shell
python -m experiments.e1_epistemic.run
python -m experiments.e3_compression.run
# etc.
```

Figures are written to `experiments/<name>/figures/` and results to `experiments/<name>/results.json`.
Setup

Requirements: Python 3.10+

```shell
pip install -r requirements.txt
```

API key (optional; only needed for the OpenAI embedding backend):

```shell
cp .env.example .env
# edit .env and add your key
```

The default embedding backend is local MiniLM (`all-MiniLM-L6-v2`) and requires no API key.
Quick Start

```python
from cas.core import HKGStore, Node, Edge, NodeType, EdgeType
from cas.embeddings import EmbeddingEngine
from cas.l4_knowledge import L4KnowledgeConsolidation

# Build a graph
store = HKGStore()
engine = EmbeddingEngine()
node = Node(node_id="n1", node_type=NodeType.L1_SEMANTIC, content="The cat is a mammal")
node.embedding = engine.encode("The cat is a mammal")[0]
store.add_node(node)

# Traverse and produce a topology chain
l4 = L4KnowledgeConsolidation(store, engine)
result = l4.query("What animals are mammals?")
print(result.chain_text)
```
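To make `chain_text` concrete without depending on the package, here is a toy, self-contained illustration of the topology-chain idea: a traversal path rendered as a compact arrow-joined string. The rendering format is an assumption; the actual chain format produced by `cas.l4_knowledge` may differ:

```python
# Toy illustration only: render a traversal path as a compact "topology chain".
def chain_text(path: list[tuple[str, str]]) -> str:
    """path is a list of (node_id, content) pairs along a graph traversal."""
    return " -> ".join(f"{nid}:{content}" for nid, content in path)

print(chain_text([("n1", "cat"), ("n2", "mammal"), ("n3", "animal")]))
# n1:cat -> n2:mammal -> n3:animal
```

The point of this form is that an LLM receives the reasoning structure (which nodes connect, and in what order) at a fraction of the token cost of the raw node contents.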
Experiment Results
E1 — Epistemic Accuracy
Path convergence score reliably predicts answer quality (Pearson r = 0.71).
| Metric | Value |
|---|---|
| Mean answer similarity | 85.2% |
| Top-1 accuracy | 80% |
| Hit-10 accuracy | 90% |
| Mean traversal latency | 11 ms |
E3 — Compression Efficiency
Topology chains compress graph context to ~28% of raw token count while preserving reasoning structure.
| Metric | Value |
|---|---|
| Mean raw tokens | 69.8 |
| Mean chain tokens | 19.0 |
| Mean reduction | 72.2% |
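Per-sample reduction is 1 − chain_tokens/raw_tokens, averaged over samples. A minimal sketch with hypothetical counts (the function name and sample values are illustrative):

```python
def reduction(raw_tokens: int, chain_tokens: int) -> float:
    """Fraction of tokens removed by chain compression for one sample."""
    return 1.0 - chain_tokens / raw_tokens

# Hypothetical per-sample (raw, chain) token counts.
samples = [(70, 19), (65, 18), (75, 20)]
mean_reduction = sum(reduction(r, c) for r, c in samples) / len(samples)
print(f"{mean_reduction:.1%}")
```

Note that the mean of per-sample reductions is not in general equal to 1 − (mean chain tokens)/(mean raw tokens), so the table's mean-token and mean-reduction rows need not agree exactly.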
E4 — Traversal Scalability
Traversal latency stays sub-linear as graph size grows from 1K to 25K nodes.
| Nodes | Median latency | p99 latency | Memory |
|---|---|---|---|
| 1,000 | 0.76 ms | 2.01 ms | 1 MB |
| 5,000 | 1.17 ms | 4.07 ms | 5 MB |
| 10,000 | 1.12 ms | 2.73 ms | 10 MB |
| 25,000 | 1.26 ms | 3.16 ms | 24 MB |
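Median and p99 latencies like those above can be collected with `time.perf_counter` and `statistics.quantiles`. A generic measurement sketch, where the timed callable is a stand-in for a graph-traversal query:

```python
import time
from statistics import median, quantiles

def measure(fn, runs: int = 1000) -> dict[str, float]:
    """Time fn() repeatedly; report median and p99 latency in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    p99 = quantiles(samples, n=100)[98]  # 99th of 99 cut points = p99
    return {"median_ms": median(samples), "p99_ms": p99}

stats = measure(lambda: sum(range(1000)))  # stand-in for store traversal
print(stats)
```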
E7 — End-to-End Quality
L4 topology chains match or exceed baseline RAG quality with fewer tokens.
| Condition | Judge score (/5) | Input tokens |
|---|---|---|
| Baseline RAG | 4.36 | 493 |
| RAG filtered | 4.32 | 387 |
| L4 Topology | 4.52 | 386 |
Author
Ahmet Yigit Sertel — April 2026
File details
Details for the file cognitive_agent_substrate-0.1.0.tar.gz.
File metadata
- Download URL: cognitive_agent_substrate-0.1.0.tar.gz
- Upload date:
- Size: 21.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `5d78adc4f2279ccc4caf034099a2636d2db16a37457ff2cdba982430f005d1c1` |
| MD5 | `2472992ecc453483a01771a1acc9bac9` |
| BLAKE2b-256 | `919af8b6dd1e460b2b76dda1861b0d21c02790cb46e18c87fc7b73a21cc27c42` |
File details
Details for the file cognitive_agent_substrate-0.1.0-py3-none-any.whl.
File metadata
- Download URL: cognitive_agent_substrate-0.1.0-py3-none-any.whl
- Upload date:
- Size: 20.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `270881d50ac6914ffd22b89662bccbca1219a7093a8ace65b502c4ceb0591a3a` |
| MD5 | `8e432dea5665ec2c5e2b8f65d5bb2471` |
| BLAKE2b-256 | `3fd217396a257f3379fe92190baec124a0569db01e7d42c77f63b276eff26ef0` |