kgnode

Knowledge Graph Agnostic Node for Knowledge-Aware LLM Applications

Overview

kgnode is a Python library for question answering over large knowledge graphs. It extracts the subgraph relevant to a query using a path-aware Markov chain algorithm.

Implementation Summary:

kgnode (work in progress)
Initial Dataset: DBLP-QuAD
Knowledge graph embedding ❌
Simple text embedding with basic template ✅
Initial Vector DB: ChromaDB
Framework: LangGraph
Seed node identification strategy:
- SPARQL text search (1-hop nodes)
- High-frequency node (degree) semantic search (2-3 hop nodes)
- Compile VectorDB with top 1 million nodes
Node pruning algorithm: Path-aware Markov chain (relevant subgraph identification)
- P(v→w) ∝ base_weight(v,w) × f(history,v,w)
- Initially using P(v→w) ∝ softmax(cos(path_embedding, template_embedding))
- path_embedding == f(a, r, b, r, v, r, w)
- Query → template → template_embedding
- Stops when the transition probability p falls below its value at the previous step, or after 10 hops
Generate SPARQL to answer the query, using the subgraph as context
Generate the answer to the query by executing the SPARQL and using the subgraph
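
The pruning step above can be illustrated with a small, self-contained sketch. It is not kgnode's implementation: the function names, the greedy choice of the most probable neighbor, the toy graph, and the mean-of-vectors path embedding are all assumptions made for illustration, and relations are left out of the path for brevity. It only shows the softmax-over-cosine transition and the stopping rule described in the list.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def transition_probs(candidate_path_embeddings, template_embedding):
    """P(v -> w) proportional to softmax(cos(path_embedding, template_embedding))."""
    sims = [cosine(e, template_embedding) for e in candidate_path_embeddings]
    return softmax(sims)


def walk(start, neighbors, embed_path, template_embedding, max_hops=10):
    """Greedy walk (an assumption): follow the most probable neighbor.

    Stops when the best probability drops below the previous step's,
    when there are no neighbors left, or after max_hops hops.
    """
    path = [start]
    prev_p = 0.0
    for _ in range(max_hops):
        candidates = neighbors(path[-1])
        if not candidates:
            break
        probs = transition_probs(
            [embed_path(path + [w]) for w in candidates], template_embedding
        )
        best = max(range(len(candidates)), key=probs.__getitem__)
        if probs[best] < prev_p:
            break
        path.append(candidates[best])
        prev_p = probs[best]
    return path


# Toy data (hypothetical): 2-d vectors standing in for real embeddings.
GRAPH = {"a": ["b", "c"], "b": ["d"], "c": [], "d": []}
NODE_VEC = {"a": [1.0, 0.0], "b": [1.0, 0.1], "c": [0.0, 1.0], "d": [0.9, 0.2]}


def embed_path(path):
    """Stand-in for f(...): the mean of the node vectors on the path."""
    n = len(path)
    return [sum(NODE_VEC[v][i] for v in path) / n for i in range(2)]


path = walk("a", lambda v: GRAPH[v], embed_path, template_embedding=[1.0, 0.0])
print(path)
```

Note that softmax normalizes over the current candidates only, so a node with a single neighbor always yields a probability of 1.0. A real implementation would likely need to account for that when comparing probabilities across steps.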

Installation

pip install kgnode

Quick Start

from kgnode import KGConfig, get_seed_nodes, get_subgraphs, generate_answer

# Configure for your knowledge graph
config = KGConfig(
    sparql_endpoint="http://localhost:7878/query",
    embedding_model="all-MiniLM-L6-v2"
)

# Find seed nodes for a query
seed_nodes = get_seed_nodes(query="What papers did John Smith publish?", config=config)

# Extract relevant subgraph
subgraphs = get_subgraphs(seed_node=seed_nodes[0], query="...", config=config)

# Generate answer
answer = generate_answer(query="...", config=config)

Folder Structure

kgnode/
├── src/kgnode/
│   ├── __init__.py              # Public API exports
│   ├── seed_finder.py           # Seed node identification
│   ├── subgraph_extraction.py   # Path-aware Markov chain algorithm
│   ├── generator.py             # SPARQL generation and answer generation
│   ├── validator.py             # Subgraph validation
│   ├── keyword_search.py        # Keyword-based entity search
│   ├── chroma_db.py             # Vector database operations
│   └── core/
│       ├── kg_config.py         # Configuration class
│       ├── sparql_query.py      # SPARQL endpoint communication
│       ├── schema_extractor.py  # Schema extraction from ontology/SPARQL
│       ├── schema_chromadb.py   # Schema ChromaDB collections
│       └── schema_selector.py   # Query-aware schema selection
├── tests/                       # Unit tests
├── docs/                        # Documentation
└── _data/                       # Data files (not in repo)

Running Oxigraph SPARQL Server

kgnode requires a SPARQL endpoint. We recommend Oxigraph:

# Start server (read-write)
oxigraph_server serve -l ./oxigraph_db --cors

# Start server (read-only)
oxigraph_server serve-read-only -l ./oxigraph_db --cors

# Load dataset (one-time setup)
oxigraph_server load -l ./oxigraph_db -f _data/dblp.nt

# Custom bind address
oxigraph_server serve -l ~/oxigraph_db --bind 127.0.0.1:7878

Default endpoint: http://localhost:7878/query
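
To check the endpoint independently of kgnode, a query can be sent with the standard SPARQL 1.1 Protocol (a form-encoded POST with a `query` parameter). The sketch below only builds the request using the Python standard library. The helper name `build_sparql_request` is hypothetical and not part of kgnode.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_sparql_request(endpoint, query):
    """Build a SPARQL 1.1 Protocol POST request asking for JSON results."""
    body = urlencode({"query": query}).encode("utf-8")
    return Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


req = build_sparql_request("http://localhost:7878/query", "ASK { ?s ?p ?o }")
# To actually send it (requires a running server):
#   from urllib.request import urlopen
#   with urlopen(req, timeout=10) as resp:
#       print(resp.read().decode("utf-8"))
```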

Public API

Main Pipeline

from kgnode import (
    citable,                    # Check seed node quality
    get_seed_nodes,             # Find seed nodes (keyword + semantic search)
    get_subgraphs,              # Extract subgraph using path-aware Markov chain
    generate_sparql,            # Generate SPARQL from subgraph
    kg_retrieve,                # Full pipeline: query → subgraph → SPARQL → results
    generate_answer,            # End-to-end answer generation
    generate_answer_using_subgraph,  # Answer generation from subgraph
)

VectorDB Operations

from kgnode import (
    compile_chromadb,           # Build vector DB from knowledge graph
    compile_chromadb_from_csv,  # Build from existing CSV
    semantic_search_entities,   # Semantic search for entities
    load_chromadb,              # Load existing ChromaDB collection
    add_or_update_entities,     # Add/update entity embeddings
    delete_entities,            # Remove entities from vector DB
)

Search Operations

from kgnode import search_entities_by_keywords  # SPARQL keyword search

Validation

from kgnode import validate_subgraph  # Validate extracted subgraph

Core Configuration

from kgnode import KGConfig, execute_sparql_query

# Create configuration
config = KGConfig(
    sparql_endpoint="http://localhost:7878/query",
    embedding_model="all-MiniLM-L6-v2",
    openai_model="gpt-4o-mini"
)

# Execute SPARQL queries
results = execute_sparql_query(query="SELECT * WHERE { ?s ?p ?o } LIMIT 10", config=config)
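
The return type of execute_sparql_query is not described here. If you are working with the raw response of a SPARQL endpoint in the standard SPARQL 1.1 JSON results format, a sketch like the following flattens it into plain dictionaries. The helper name `bindings_to_rows` and the sample data are hypothetical.

```python
def bindings_to_rows(sparql_json):
    """Flatten SPARQL 1.1 JSON results into a list of {variable: value} dicts.

    Variables that are unbound in a given solution map to None.
    """
    variables = sparql_json["head"]["vars"]
    rows = []
    for binding in sparql_json["results"]["bindings"]:
        rows.append(
            {var: binding[var]["value"] if var in binding else None for var in variables}
        )
    return rows


# Hypothetical sample response with one solution.
sample = {
    "head": {"vars": ["s", "p", "o"]},
    "results": {
        "bindings": [
            {
                "s": {"type": "uri", "value": "http://example.org/paper1"},
                "p": {"type": "uri", "value": "http://example.org/title"},
                "o": {"type": "literal", "value": "An Example Paper"},
            }
        ]
    },
}

rows = bindings_to_rows(sample)
print(rows)
```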

TODOs

LangGraph Integration

Orchestrate workflow with LangGraph
Add visualization support

Documentation

For detailed usage, API reference, and examples, see docs/USAGE.md or visit the online documentation.

Dataset

DBLP-QuAD - Academic publications knowledge graph

Source: https://dblp.org/rdf/
Download: https://zenodo.org/records/7638511
Paper: DBLP-QuAD (ECIR 2023)
Stats: 252M triples, 92M entities, 62 relations

Supported Technologies

Vector Databases

ChromaDB ✅ (implemented)
Pinecone (planned)
Qdrant (planned)

Embedding Models

all-MiniLM-L6-v2 ✅ (default, 384 dimensions)
google/embeddinggemma-300m (alternative)

License

MIT

Testing

Run All Tests

python tests/test_runner.py

Run Specific Tests

# Run single test file
python tests/test_runner.py chromadb

# Run multiple test files
python tests/test_runner.py chromadb seed_finder subgraph_extraction

# List available tests
python tests/test_runner.py --list

# Run standalone test file
python tests/test_chromadb.py

Prerequisites

Oxigraph SPARQL server running at http://localhost:7878/query
OPENAI_API_KEY environment variable set
ChromaDB collection created (this happens automatically on first run)
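
The first two prerequisites can be checked before running the tests with a short preflight script. This is a sketch, not part of the kgnode test runner: the function names are hypothetical, and the reachability check treats any HTTP response (including an error status) as proof that the server is up.

```python
import os
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def missing_prerequisites(env=None):
    """Return a list of human-readable problems found in the environment."""
    env = os.environ if env is None else env
    problems = []
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems


def endpoint_reachable(url="http://localhost:7878/query", timeout=5):
    """Return True if something answers HTTP requests at the given URL."""
    try:
        with urlopen(url, timeout=timeout):
            return True
    except HTTPError:
        # The server responded, even if it rejected a request without a query.
        return True
    except (URLError, OSError):
        return False


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("Problem:", problem)
    if not endpoint_reachable():
        print("Problem: SPARQL endpoint is not reachable")
```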

Release history

0.3.0 (May 19, 2026)
0.2.0 (Mar 17, 2026)
0.1.1 (Nov 28, 2025, this version)
0.1.0 (Nov 22, 2025)

