A minimal GraphRAG implementation in ~600 lines of Python code
GraphRAG-Lite
Minimal GraphRAG implementation in ~500 lines of Python code.
GraphRAG-Lite is a clean, educational implementation of GraphRAG (Graph-based Retrieval-Augmented Generation), intended for learning the core principles of knowledge-graph-enhanced RAG systems.
Why GraphRAG-Lite?
- Learn by Reading: Clean, well-documented code you can understand in an afternoon
- Production Patterns: Real-world optimizations like batch embeddings and LLM caching
- Flexible Retrieval: 4 query modes for different use cases
- Minimal Dependencies: Just `openai`, `numpy`, `tiktoken`, and `loguru`
Features
| Feature | Description |
|---|---|
| 4 Query Modes | local, global, mix, naive - choose the right strategy |
| Batch Embeddings | Reduce API calls by embedding texts in batches |
| LLM Caching | Avoid redundant LLM requests |
| Streaming Output | Real-time response streaming |
| NumPy Acceleration | Fast vector similarity search |
| Persistent Storage | JSON-based storage, no external database needed |
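The batching and caching rows can be illustrated with a small sketch. This is not GraphRAG-Lite's own code: `embed_all`, `LLMCache`, and the `embed_batch` / `complete` callables are hypothetical names standing in for whatever client functions the library actually uses, and the batch size of 32 is an arbitrary assumption.

```python
import hashlib
import json
from typing import Callable, Dict, Iterator, List


def batched(items: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_all(
    texts: List[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
    batch_size: int = 32,
) -> List[List[float]]:
    """Embed all texts with one call per batch instead of one call per text."""
    vectors: List[List[float]] = []
    for batch in batched(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors


class LLMCache:
    """Memoize completions keyed by a hash of (model, prompt)."""

    def __init__(self, complete: Callable[[str, str], str]) -> None:
        self._complete = complete          # hypothetical LLM call
        self._store: Dict[str, str] = {}
        self.misses = 0                    # number of real LLM calls made

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def __call__(self, model: str, prompt: str) -> str:
        key = self._key(model, prompt)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._complete(model, prompt)
        return self._store[key]
```

With 70 texts and a batch size of 32, `embed_all` issues three embedding calls rather than 70, and a repeated prompt to `LLMCache` triggers only one underlying completion.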
Installation
pip install graphrag-lite
Or install from source:
git clone https://github.com/shibing624/graphrag-lite.git
cd graphrag-lite
pip install -e .
Quick Start
import os
from graphrag_lite import GraphRAGLite
# Initialize
graph = GraphRAGLite(
    storage_path="./my_graph",
    api_key=os.getenv("OPENAI_API_KEY"),
    base_url=os.getenv("OPENAI_BASE_URL"),  # Optional: for compatible APIs
)
# Insert documents
graph.insert("""
Charles Dickens wrote "A Christmas Carol" in 1843.
The story features Ebenezer Scrooge, a miserly old man,
and the ghost of his former business partner Jacob Marley.
""")
# Query with knowledge graph context
answer = graph.query("What is the relationship between Scrooge and Marley?")
print(answer)
Query Modes
| Mode | Strategy | Best For |
|---|---|---|
| `local` | Entity → Related relations | "Who is X?" questions |
| `global` | Relation → Related entities | "How are X and Y related?" |
| `mix` | Entity + Relation + Chunks | General purpose (recommended) |
| `naive` | Text chunks only | Baseline comparison |
# Choose the right mode for your question
answer = graph.query("Who is Scrooge?", mode="local")
answer = graph.query("How are Scrooge and Marley connected?", mode="global")
answer = graph.query("Tell me about the story", mode="mix") # Recommended
answer = graph.query("What happened?", mode="naive")
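One way to picture the difference between the modes is as a choice of which retrieved sources go into the prompt context. The sketch below is an assumption based only on the strategy column of the table above, not the library's implementation; `MODE_SOURCES` and `build_context` are hypothetical names.

```python
from typing import Dict, List

# Assumed mapping from mode to the context sources it draws on,
# in the order suggested by the table's "Strategy" column.
MODE_SOURCES: Dict[str, List[str]] = {
    "local": ["entities", "relations"],
    "global": ["relations", "entities"],
    "mix": ["entities", "relations", "chunks"],
    "naive": ["chunks"],
}


def build_context(mode: str, retrieved: Dict[str, List[str]]) -> str:
    """Assemble a prompt context from the sources the given mode uses."""
    if mode not in MODE_SOURCES:
        raise ValueError(f"unknown mode: {mode!r}")
    sections = []
    for source in MODE_SOURCES[mode]:
        items = retrieved.get(source, [])
        if items:  # skip sources with nothing retrieved
            body = "\n".join(f"- {item}" for item in items)
            sections.append(f"## {source}\n{body}")
    return "\n\n".join(sections)
```

Under this reading, `naive` is a plain vector-RAG baseline, while `mix` sends the model both graph-derived facts and the raw text chunks.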
Streaming Output
for chunk in graph.query("Who is Scrooge?", stream=True):
    print(chunk, end="", flush=True)
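If the full answer is also needed after streaming, the chunks can be collected as they are printed. The helper below is a generic sketch with a hypothetical name, `tee_stream`; it assumes the streamed chunks are plain strings, which the `print` call above suggests but the source does not state explicitly.

```python
from typing import Iterable, Iterator, List


def tee_stream(chunks: Iterable[str], sink: List[str]) -> Iterator[str]:
    """Yield each chunk unchanged while also appending it to sink."""
    for chunk in chunks:
        sink.append(chunk)
        yield chunk


# Stand-in for a streaming response; in practice this could be the
# iterator returned by graph.query(..., stream=True).
fake_stream = iter(["Scrooge ", "is ", "a miser."])

parts: List[str] = []
for chunk in tee_stream(fake_stream, parts):
    print(chunk, end="", flush=True)
full_answer = "".join(parts)
```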
API Reference
GraphRAGLite
GraphRAGLite(
    storage_path: str = "./graphrag_data",            # Data storage directory
    api_key: str = None,                              # OpenAI API key
    base_url: str = None,                             # OpenAI-compatible API base URL
    model: str = "gpt-4o-mini",                       # LLM model
    embedding_model: str = "text-embedding-3-small",  # Embedding model
    enable_cache: bool = True,                        # Enable LLM response caching
)
Methods
| Method | Description |
|---|---|
| `insert(text, doc_id=None)` | Insert document and build knowledge graph |
| `query(question, mode="mix", top_k=10, stream=False)` | Query the knowledge graph |
| `has_data()` | Check if graph has data |
| `get_stats()` | Get graph statistics |
| `list_entities()` | List all entities |
| `list_relations()` | List all relations |
| `clear()` | Clear all data |
How It Works
Insert Pipeline:
Document → Chunking → LLM Entity Extraction → Batch Embedding → Storage
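The chunking step of the insert pipeline can be sketched as a sliding window with overlap. This is an illustrative assumption, not the library's code: it splits on whitespace to stay dependency-free, whereas GraphRAG-Lite lists `tiktoken` as a dependency and so presumably counts tokens. The name `chunk_words` and the default sizes are hypothetical.

```python
from typing import List


def chunk_words(text: str, chunk_size: int = 200, overlap: int = 20) -> List[str]:
    """Split text into word windows of chunk_size that share overlap words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the window already reached the end of the text
    return chunks
```

The overlap keeps an entity mention that straddles a boundary visible in both neighboring chunks, at the cost of some duplicated text sent to the extraction step.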
Query Pipeline:
Question → Vector Search → Context Building → LLM Generation → Answer
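The vector search step of the query pipeline amounts to ranking stored embeddings by similarity to the question embedding. The sketch below uses cosine similarity with NumPy; the choice of cosine similarity and the function name `top_k_cosine` are assumptions for illustration, since the source only says that NumPy is used for vector similarity search.

```python
from typing import List, Tuple

import numpy as np


def top_k_cosine(query, matrix, k: int = 10) -> List[Tuple[int, float]]:
    """Return (row_index, score) pairs for the k rows most similar to query."""
    query = np.asarray(query, dtype=np.float64)
    matrix = np.asarray(matrix, dtype=np.float64)
    if matrix.size == 0:
        return []
    # Cosine similarity = dot product divided by the product of the norms.
    denom = np.linalg.norm(matrix, axis=1) * np.linalg.norm(query)
    denom[denom == 0.0] = 1.0  # zero vectors get score 0 instead of NaN
    scores = (matrix @ query) / denom
    k = min(k, scores.shape[0])
    order = np.argsort(-scores)[:k]  # indices sorted by descending score
    return [(int(i), float(scores[i])) for i in order]
```

The returned indices would then be mapped back to entities, relations, or chunks to build the context passed to the LLM.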
Use Cases
- Learning GraphRAG: Understand how knowledge graphs enhance RAG
- Prototyping: Quickly validate GraphRAG for your domain
- Research: Baseline for comparing retrieval strategies
- Education: Teaching material for RAG concepts
Community & Support
- GitHub Issues: Submit an issue
- WeChat: Add `xuming624` with the note "llm" to join the LLM tech WeChat group
License
Apache License 2.0
Citation
@software{graphrag-lite,
  author = {Xu Ming},
  title = {GraphRAG-Lite: Minimal GraphRAG Implementation},
  year = {2025},
  url = {https://github.com/shibing624/graphrag-lite}
}