kg-rag

Orchestration layer for the KGRAG(tm) components.

KGRAG — Knowledge Compiler and Federated Retrieval Layer for Ontologically Grounded Domains

Patent Pending — The Knowledge Compiler concept and its execution are the subject of a pending U.S. provisional patent application.

Author: Eric G. Suchanek, PhD · Flux-Frontiers, Liberty TWP, OH

Overview

KGRAG is a federation and orchestration layer for structural knowledge graphs derived from heterogeneous source domains. It integrates PyCodeKG (Python codebase analysis), DocKG (semantic document indexing), MetaboKG (metabolic pathways), DiaryKG (personal diary corpora), AgentKG (conversational memory), FTreeKG (file system trees), MemoryKG (episodic memory), and a growing family of domain-specific backends under a single five-method adapter protocol.

KGRAG treats derived structure as ground truth and uses semantic embeddings strictly as an acceleration layer for locating entry points into that structure. All graph traversal, ranking, and snippet extraction are deterministic. When KGRAG output is passed to a language model for synthesis, the model receives verified facts with full source provenance, not approximate embeddings.
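The split between semantic seeding and structural traversal can be illustrated with a small, self-contained sketch. It is not KGRAG's implementation: the toy graph, the two-dimensional vectors, and the names `seed` and `expand` are assumptions made for illustration, and in-memory dicts stand in for the real vector and graph stores.

```python
from collections import deque
from math import sqrt

# Hypothetical toy data: node id -> embedding, and node id -> neighbours.
EMBEDDINGS = {
    "auth.login": (1.0, 0.0),
    "auth.token": (0.8, 0.6),
    "db.connect": (0.0, 1.0),
}
EDGES = {
    "auth.login": ["auth.token", "db.connect"],
    "auth.token": [],
    "db.connect": ["db.pool"],
    "db.pool": [],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))


def seed(query_vec, k=1):
    """Semantic step: pick the k most similar nodes as entry points."""
    ranked = sorted(EMBEDDINGS, key=lambda n: (-cosine(query_vec, EMBEDDINGS[n]), n))
    return ranked[:k]


def expand(seeds, max_hops=1):
    """Structural step: breadth-first traversal from the seeds."""
    seen = {s: 0 for s in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not walk past the hop limit
        for nxt in sorted(EDGES.get(node, [])):  # sorted => stable order
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    # Return (node, hop distance) pairs in a fixed order.
    return sorted(seen.items(), key=lambda item: (item[1], item[0]))


result = expand(seed((1.0, 0.1)), max_hops=1)
print(result)
```

In this sketch the embedding only decides where the traversal starts; what is returned depends on the edges alone, so the same seeds always yield the same result.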

KG Types

Fully Implemented

Kind	Backend	Description
`code`	PyCodeKG	Python codebase — AST-extracted modules, classes, functions, call graphs
`doc`	DocKG	Document corpus — Markdown/RST/text indexed by topic, section, and entity
`meta`	MetaboKG	Metabolic pathways — biochemical reaction networks (KEGG, BioCyc)
`diary`	DiaryKG	Personal diary entries — timestamped chunk graphs with temporal edges
`agent`	AgentKG	Conversational memory — Turn/Topic/Task/Summary graph (live session)
`filetree`	FTreeKG	File system tree — directory/file/module/dependency structure
`memory`	MemoryKG	Episodic memory — hybrid semantic + structural graph for conversation/event corpora

Stub Adapters (protocol boundary, backends under development)

Kind	Backend	Description
`gutenberg`	GutenbergKG	Project Gutenberg book corpus — literature indexed by author, genre, and chapter
`ia`	IABookKG	Internet Archive book corpus — public-domain books indexed by genre and topic
`pdbfile`	—	PDB structure files — 3D atomic coordinates and protein metadata
`disulfide`	—	Disulfide bond data — cysteine connectivity in protein structures
`verse`	—	Scripture/verse — Book → Chapter → Verse hierarchy and cross-references
`person`	—	Personal knowledge — biographical and relational graphs
`legal`	—	Legal corpus — statutory codes and regulations (TBD)

Corpus Abstractions

Generic Corpus: a named collection of KG instances of any kind, grouped for scoped federated queries. Useful for project-level or thematic groupings (e.g., "KGRAG_repos" combining code + doc KGs).

Person Corpus: a corpus enriched with personal metadata that represents an individual. It aggregates all KGs relevant to that person (diaries, memories, documents, agent sessions, and more) alongside structured personal data (birth year, address, email, contact info).

Features

Multi-domain federation — Query code, docs, metabolic pathways, diary entries, and conversation history simultaneously
Five-method adapter protocol — is_available, query, pack, stats, analyze; add a new domain by implementing five methods
Unified registry — Persistent SQLite-backed storage of KG locations, metadata, corpora, and person records
Corpus abstraction — Group KGs into named corpora for scoped federated queries
Person corpus — Model individuals with personal metadata and their associated KG collections
Hybrid querying — Semantic seeding via LanceDB + structural BFS traversal
Context packing — Extract source-grounded snippets with line numbers for direct LLM ingestion
MCP server — 22 tools exposing registry, corpus, and person operations to any MCP-compatible agent
CLI tooling — Full CRUD for KGs, corpora, and person corpora; query, pack, analyze, synthesize
Streamlit dashboard — Interactive browser for exploring and querying registered knowledge graphs
Deterministic retrieval — Auditable, source-grounded results; zero hallucination at the knowledge layer
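To make the five-method protocol concrete, here is a minimal sketch of what an adapter could look like. Only the five method names come from the feature list above; the signatures, return types, and the `GlossaryAdapter` toy backend are assumptions, and the real `KGAdapter` base class may differ.

```python
from abc import ABC, abstractmethod


class KGAdapter(ABC):
    """Sketch of a five-method adapter; the signatures are assumptions."""

    @abstractmethod
    def is_available(self) -> bool: ...

    @abstractmethod
    def query(self, q: str, k: int = 5) -> list[dict]: ...

    @abstractmethod
    def pack(self, q: str, k: int = 5) -> str: ...

    @abstractmethod
    def stats(self) -> dict: ...

    @abstractmethod
    def analyze(self) -> str: ...


class GlossaryAdapter(KGAdapter):
    """Hypothetical toy backend over an in-memory glossary."""

    def __init__(self, terms: dict[str, str]):
        self.terms = terms

    def is_available(self) -> bool:
        return bool(self.terms)

    def query(self, q: str, k: int = 5) -> list[dict]:
        # Case-insensitive substring match, visited in sorted term order.
        hits = [
            {"id": f"glossary:{term}", "text": text}
            for term, text in sorted(self.terms.items())
            if q.lower() in term.lower() or q.lower() in text.lower()
        ]
        return hits[:k]

    def pack(self, q: str, k: int = 5) -> str:
        return "\n".join(f"- {h['id']}: {h['text']}" for h in self.query(q, k))

    def stats(self) -> dict:
        return {"kind": "glossary", "nodes": len(self.terms)}

    def analyze(self) -> str:
        return f"glossary with {len(self.terms)} terms"


adapter = GlossaryAdapter({"BFS": "breadth-first search", "KG": "knowledge graph"})
print(adapter.query("graph"))
print(adapter.stats())
```

Because the orchestrator would only ever call these five methods, a new domain can be added by writing one such class, without touching the code that federates queries.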

Quick Start

1. Install KGRAG

pip install 'kg-rag @ git+https://github.com/Flux-Frontiers/KGRAG.git'

# With Streamlit dashboard
pip install 'kg-rag[viz] @ git+https://github.com/Flux-Frontiers/KGRAG.git'

2. Register a Knowledge Graph

# Register a Python codebase (requires pycode-kg built in that repo)
kgrag register my-code code /path/to/my-repo

# Register a document corpus (requires doc-kg built in that repo)
kgrag register my-docs doc /path/to/docs-repo

# Register a diary corpus
kgrag register pepys-diary diary /path/to/diary-repo

3. Query Your Graphs

# Federated query across all registered KGs
kgrag query "authentication flow"

# Federated snippet pack for LLM ingestion
kgrag pack "database connection setup" --out context.md

# Scope to a specific corpus
kgrag query "disulfide bond patterns" --scope my-corpus
kgrag pack "journal entries about travel" --scope alice

4. Launch the Dashboard

kgrag viz

CLI Reference

Registry Management

Command	Description
`kgrag register <name> <kind> <path>`	Register a KG instance
`kgrag unregister <name>`	Remove a KG from the registry
`kgrag list [--kind <kind>]`	List all registered KGs
`kgrag info <name>`	Show detailed info for a KG
`kgrag status [--stats]`	Check health and live stats
`kgrag init`	Interactively register a new KG
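The register, unregister, and list commands map naturally onto a small SQLite table. The following sketch shows the idea with Python's standard `sqlite3` module; the table name, columns, and function names are hypothetical, and the actual KGRegistry schema is not documented here.

```python
import sqlite3

# Hypothetical schema; the real registry's tables and columns may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS kgs (
    name TEXT PRIMARY KEY,
    kind TEXT NOT NULL,
    path TEXT NOT NULL
)
"""


def open_registry(db_path=":memory:"):
    """Open (or create) the registry database and ensure the table exists."""
    conn = sqlite3.connect(db_path)
    conn.execute(SCHEMA)
    return conn


def register(conn, name, kind, path):
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO kgs (name, kind, path) VALUES (?, ?, ?)",
            (name, kind, path),
        )


def unregister(conn, name):
    with conn:
        conn.execute("DELETE FROM kgs WHERE name = ?", (name,))


def list_kgs(conn, kind=None):
    """List entries ordered by name, optionally filtered by kind."""
    sql = "SELECT name, kind, path FROM kgs"
    params = ()
    if kind is not None:
        sql += " WHERE kind = ?"
        params = (kind,)
    return conn.execute(sql + " ORDER BY name", params).fetchall()


conn = open_registry()
register(conn, "my-code", "code", "/path/to/my-repo")
register(conn, "my-docs", "doc", "/path/to/docs-repo")
print(list_kgs(conn, kind="code"))
```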

Query & Analysis

Command	Description
`kgrag query <q> [--kind <kind>] [--scope <name>]`	Federated semantic query
`kgrag pack <q> [--kind <kind>] [--scope <name>] [--out <file>]`	Snippet pack for LLM
`kgrag analyze <name>`	Full analysis report for one KG
`kgrag synthesize <q>`	KG-grounded synthesis via local LLM (Ollama)
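A snippet pack pairs each extracted span with its source and line numbers so that a language model can cite it. The sketch below shows one plausible way to render such a pack as Markdown; the heading layout and the `pack_snippet` and `pack` helpers are assumptions, not the format KGRAG actually emits.

```python
def pack_snippet(source_name, text, start, end):
    """Render lines start..end (1-indexed, inclusive) with line numbers."""
    lines = text.splitlines()
    start = max(1, start)          # clamp to the first line
    end = min(len(lines), end)     # clamp to the last line
    body = "\n".join(f"{n:>4}  {lines[n - 1]}" for n in range(start, end + 1))
    return f"### {source_name}:{start}-{end}\n{body}"


def pack(snippets):
    """Join snippets into one Markdown document, in a stable order."""
    ordered = sorted(snippets, key=lambda s: (s[0], s[2]))  # by source, then start
    return "\n\n".join(pack_snippet(*s) for s in ordered)


source = "import sqlite3\n\ndef connect(path):\n    return sqlite3.connect(path)\n"
context = pack([("db.py", source, 3, 4)])
print(context)
```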

Corpus Management

Command	Description
`kgrag corpus create <name>`	Create a named corpus
`kgrag corpus add <corpus> <kg>`	Add a KG to a corpus
`kgrag corpus remove <corpus> <kg>`	Remove a KG from a corpus
`kgrag corpus list`	List all corpora
`kgrag corpus query <name> <q>`	Query within a corpus
`kgrag corpus pack <name> <q>`	Snippet pack within a corpus

Person Corpus Management

Command	Description
`kgrag person create <name>`	Create a person corpus
`kgrag person add <person> <kg>`	Add a KG to a person corpus
`kgrag person update <name> [--email ...] [--notes ...]`	Update personal metadata
`kgrag person query <name> <q>`	Query across a person's KGs
`kgrag person pack <name> <q>`	Snippet pack for a person

Server & Integration

Command	Description
`kgrag mcp`	Launch MCP server (stdio transport)
`kgrag viz`	Launch Streamlit dashboard
`kgrag hooks install`	Install pre-commit snapshot hook

MCP Integration

Launch the MCP server:

kgrag mcp

The server exposes 22 tools to any MCP-compatible agent (Claude Code, Cursor, GitHub Copilot, Cline, Claude Desktop):

Registry tools:

Tool	Description
`kgrag_stats()`	Registry summary: KG count, kinds, built status
`kgrag_list([kind])`	List registered KG entries
`kgrag_info(name)`	Full detail for a single KG entry
`kgrag_query(q, [k, kinds])`	Federated semantic query, JSON result
`kgrag_pack(q, [k, kinds])`	Federated snippet pack, Markdown output

Corpus tools: kgrag_corpus_list, kgrag_corpus_info, kgrag_corpus_create, kgrag_corpus_delete, kgrag_corpus_add, kgrag_corpus_remove, kgrag_corpus_query, kgrag_corpus_pack

Person tools: kgrag_person_list, kgrag_person_info, kgrag_person_create, kgrag_person_delete, kgrag_person_add, kgrag_person_remove, kgrag_person_update, kgrag_person_query, kgrag_person_pack
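Many MCP clients register stdio servers through a JSON map of server names to launch commands. The sketch below generates such an entry for the `kgrag mcp` command; the `mcpServers` layout is an assumption about the client, the label `kgrag` is arbitrary, and the location of the config file depends on the client, so check its documentation before use.

```python
import json

# Assumed client config shape: an "mcpServers" map whose entries give the
# command and arguments used to spawn a stdio server.
config = {
    "mcpServers": {
        "kgrag": {            # arbitrary label chosen for this sketch
            "command": "kgrag",
            "args": ["mcp"],  # the launch command documented above
        }
    }
}

config_json = json.dumps(config, indent=2)
print(config_json)
```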

Architecture

Source Domains
     ↓
PyCodeKG  DocKG  MetaboKG  DiaryKG  AgentKG  FTreeKG  MemoryKG  GutenbergKG  IABookKG  … (stubs)
  SQLite + LanceDB per backend
     ↓
  ┌─────────────────────────────────────────────────────────┐
  │          KGAdapter (five-method protocol)               │
  ├─────────────────────────────────────────────────────────┤
  │   KGRAG Orchestrator · KGRegistry · CorpusRegistry      │
  │              PersonCorpusRegistry                       │
  └─────────────────────────────────────────────────────────┘
             ↓                         ↓
        CLI / Python API          MCP Server (stdio)
     (query, pack, analyze)   (AI agents, Claude Code)

Design Principles

Derived structure is authoritative — graphs are extracted from formal sources by deterministic programs; embeddings are derived and disposable
Semantics accelerate; structure decides — vector search locates entry points; BFS traversal determines what is returned
Every result is traceable — every node carries a stable identifier encoding its origin
Determinism over approximation — identical inputs produce identical outputs
Generality through protocol — five adapter methods; no orchestrator changes needed for new domains
Independence from language models — the full build and query pipeline runs locally without any LLM call
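Two of these principles, traceability and determinism, can be sketched together: if every hit carries a stable identifier, that identifier can also break ties when results from several KGs are merged. The `Hit` class, the identifier layout, and the `merge` function below are hypothetical illustrations rather than KGRAG's own types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    kg_name: str   # which registered KG produced the hit
    kind: str      # e.g. "code" or "doc"
    locator: str   # backend-specific position, e.g. "auth.py:12"
    score: float

    @property
    def node_id(self) -> str:
        # Hypothetical id layout: kind, KG name, and locator joined by ":".
        return f"{self.kind}:{self.kg_name}:{self.locator}"


def merge(*result_lists, k=3):
    """Merge per-KG results; ties on score fall back to the stable id."""
    best = {}
    for hits in result_lists:
        for hit in hits:
            prev = best.get(hit.node_id)
            if prev is None or hit.score > prev.score:
                best[hit.node_id] = hit  # keep the best score per node
    ranked = sorted(best.values(), key=lambda h: (-h.score, h.node_id))
    return [h.node_id for h in ranked[:k]]


code_hits = [Hit("my-code", "code", "auth.py:12", 0.9)]
doc_hits = [
    Hit("my-docs", "doc", "login.md:3", 0.9),
    Hit("my-docs", "doc", "intro.md:1", 0.4),
]
print(merge(code_hits, doc_hits))
print(merge(doc_hits, code_hits))  # same output regardless of arrival order
```

Sorting on the pair (score, identifier) rather than on score alone is what makes the output independent of the order in which backends respond.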

Project Structure

src/kg_rag/
├── orchestrator.py          # KGRAG — cross-KG orchestrator
├── registry.py              # KGRegistry — SQLite-backed KG registry
├── corpus_registry.py       # CorpusRegistry — named corpus groups
├── person_registry.py       # PersonCorpusRegistry — person-centric corpora
├── primitives.py            # KGKind, KGEntry, CrossHit, CrossSnippet, …
├── embed.py                 # Embedder abstraction (SentenceTransformer, LlamaCpp)
├── adapters/
│   ├── base.py              # KGAdapter ABC (five abstract methods)
│   ├── _stub_adapter.py     # StubKGAdapter base for unbuilt backends
│   ├── pycodekg_adaptor.py  # CodeKGAdapter  (code)
│   ├── dockg_adapter.py     # DocKGAdapter   (doc)
│   ├── metakg_adapter.py    # MetaKGAdapter  (meta / MetaboKG)
│   ├── diary_adapter.py     # DiaryKGAdapter (diary)
│   ├── agent_adapter.py     # AgentKGAdapter (agent)
│   ├── memory_adapter.py    # MemoryKGAdapter (memory)
│   ├── gutenberg_adapter.py # stub (gutenberg)
│   ├── ia_adapter.py        # stub (ia)
│   ├── disulfide_adapter.py # stub
│   ├── pdbfile_adapter.py   # stub
│   ├── verse_adapter.py     # stub
│   ├── legal_adapter.py     # stub
│   └── person_adapter.py    # stub
├── cli/
│   ├── main.py              # root Click group
│   ├── cmd_registry.py      # register, unregister, list, info, status, init
│   ├── cmd_query.py         # query, pack
│   ├── cmd_corpus.py        # corpus CRUD + query/pack
│   ├── cmd_analyze.py       # analyze
│   ├── cmd_synthesize.py    # synthesize (Ollama-grounded)
│   ├── cmd_mcp.py           # mcp
│   ├── cmd_viz.py           # viz (Streamlit)
│   ├── cmd_hooks.py         # hooks install
│   └── cmd_models.py        # models (embedder config)
├── mcp_server.py            # MCP server (22 tools, stdio transport)
└── app.py                   # Streamlit dashboard

Installation

Requirements: Python ≥ 3.12, < 3.14

# Core
pip install 'kg-rag @ git+https://github.com/Flux-Frontiers/KGRAG.git'

# With Streamlit dashboard
pip install 'kg-rag[viz] @ git+https://github.com/Flux-Frontiers/KGRAG.git'

# Poetry
poetry add 'kg-rag @ git+https://github.com/Flux-Frontiers/KGRAG.git'

Embedding Backend (ARM / Raspberry Pi)

KGRAG supports llama.cpp-based embedding for low-power deployment. Configure the backend in pyproject.toml:

[tool.kgrag]
embed_backend    = "llama"
llama_model_path = "~/.kgrag/bge-small-en-v1.5-Q8_0.gguf"

Related Projects

Project	Description
PyCodeKG	Deterministic knowledge graph for Python codebases
DocKG	Semantic knowledge graph for document corpora
MetaboKG	Metabolic pathway knowledge graph
DiaryKG	Diary and personal journal corpus knowledge graph
AgentKG	Conversational memory knowledge graph
FTreeKG	File system tree knowledge graph
MemoryKG	Episodic memory knowledge graph for conversation and event corpora
GutenbergKG	Project Gutenberg book corpus knowledge graph (under development)
IABookKG	Internet Archive book corpus knowledge graph (under development)

License

Elastic License 2.0 — see LICENSE.

Free to use, modify, and distribute. You may not offer the software as a hosted or managed service to third parties. Commercial internal use is permitted.

Project details

This version: 0.6.0 (released May 4, 2026)

Download files

Source Distribution

kg_rag-0.6.0.tar.gz (100.1 kB), uploaded May 4, 2026 (Source)

Built Distribution

kg_rag-0.6.0-py3-none-any.whl (125.8 kB), uploaded May 4, 2026 (Python 3)

File details

Details for the file kg_rag-0.6.0.tar.gz.

File metadata

Download URL: kg_rag-0.6.0.tar.gz
Upload date: May 4, 2026
Size: 100.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.3.2 CPython/3.12.13 Darwin/25.4.0

File hashes

Hashes for kg_rag-0.6.0.tar.gz
Algorithm	Hash digest
SHA256	`d6b67c38aaf77e15a01abddce785ef944fc3f35a9791330ea8e7ed4d7ad9ef86`
MD5	`a43f025d42df9dc6e1b8cf25c63f86c8`
BLAKE2b-256	`ab3857a392823ffd4a143daf93a1bd63e25c091619e0e5147baa319e0996028e`

File details

Details for the file kg_rag-0.6.0-py3-none-any.whl.

File metadata

Download URL: kg_rag-0.6.0-py3-none-any.whl
Upload date: May 4, 2026
Size: 125.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.3.2 CPython/3.12.13 Darwin/25.4.0

File hashes

Hashes for kg_rag-0.6.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2d58b6a2af5df61664f5ed8529fa12225d3d6ad9d24a701c0a1961b65a40fad3`
MD5	`2cf9b30d70cf670f87041cd5ccd0a5bc`
BLAKE2b-256	`f411a66d2886a5ef007b13318c501eb3d81e09da1173dbe72b95beea42b1791c`
