A minimal GraphRAG implementation in ~600 lines of Python code
Project description
Lightweight GraphRAG implementation with sync/async APIs and knowledge traceability
GraphRAG-Lite is a lightweight, educational implementation of GraphRAG (Graph-based Retrieval-Augmented Generation). Perfect for learning the core principles of knowledge graph enhanced RAG systems.
Why GraphRAG-Lite?
- Learn by Reading: Clean, well-documented code you can understand in an afternoon
- Production Patterns: Real-world optimizations like batch embeddings and LLM caching
- Sync/Async APIs: Both synchronous and asynchronous methods for different use cases
- Knowledge Traceability: Answers include citations to knowledge graph sources
- Minimal Dependencies: Just
openai,numpy,tiktoken,loguru, andtqdm
Features
| Feature | Description |
|---|---|
| 4 Query Modes | local, global, mix, naive - choose the right strategy |
| Sync/Async APIs | insert/ainsert, query/aquery dual-mode support |
| Knowledge Traceability | Answers with [Entities (X); Relationships (Y)] citations |
| Batch Embeddings | Reduce API calls with intelligent batching |
| Streaming Output | Real-time response streaming (sync and async) |
| Persistent Storage | JSON-based storage, no external database needed |
Installation
pip install graphrag-lite
Or install from source:
git clone https://github.com/shibing624/graphrag-lite.git
cd graphrag-lite
pip install -e .
Quick Start
Synchronous Mode
import os
from graphrag_lite import GraphRAGLite
# Initialize
graph = GraphRAGLite(
storage_path="./my_graph",
api_key=os.getenv("OPENAI_API_KEY"),
base_url=os.getenv("OPENAI_BASE_URL"), # Optional: for compatible APIs
)
# Insert documents
graph.insert("""
Charles Dickens wrote "A Christmas Carol" in 1843.
The story features Ebenezer Scrooge, a miserly old man,
and the ghost of his former business partner Jacob Marley.
""")
# Query with knowledge graph context
answer = graph.query("What is the relationship between Scrooge and Marley?")
print(answer)
Asynchronous Mode (Recommended for Large Documents)
import asyncio
from graphrag_lite import GraphRAGLite
async def main():
graph = GraphRAGLite(storage_path="./my_graph")
# Async insert (with progress bar)
await graph.ainsert(long_document, show_progress=True)
# Async query
answer = await graph.aquery("What is the question?")
print(answer)
# Async streaming
stream = await graph.aquery("What is the question?", stream=True)
async for chunk in stream:
print(chunk, end="", flush=True)
asyncio.run(main())
Query Modes
| Mode | Strategy | Best For |
|---|---|---|
local |
Entity → Related relations | "Who is X?" questions |
global |
Relation → Related entities | "How are X and Y related?" |
mix |
Entity + Relation + Chunks | General purpose (recommended) |
naive |
Text chunks only | Baseline comparison |
# Choose the right mode for your question
answer = graph.query("Who is Scrooge?", mode="local")
answer = graph.query("How are Scrooge and Marley connected?", mode="global")
answer = graph.query("Tell me about the story", mode="mix") # Recommended
answer = graph.query("What happened?", mode="naive")
Knowledge Traceability
Answers automatically include citations to knowledge graph sources for credibility:
Ebenezer Scrooge is the main character of "A Christmas Carol" [Entities (0)].
He was the business partner of Jacob Marley [Relationships (1, 2)].
Streaming Output
# Sync streaming
for chunk in graph.query("Who is Scrooge?", stream=True):
print(chunk, end="", flush=True)
# Async streaming
stream = await graph.aquery("Who is Scrooge?", stream=True)
async for chunk in stream:
print(chunk, end="", flush=True)
API Reference
GraphRAGLite
GraphRAGLite(
storage_path: str = "./graphrag_data", # Data storage directory
api_key: str = None, # OpenAI API key
base_url: str = None, # OpenAI-compatible API base URL
model: str = "gpt-4o-mini", # LLM model
embedding_model: str = "text-embedding-3-small", # Embedding model
enable_cache: bool = True, # Enable LLM response caching
)
Methods
| Method | Description |
|---|---|
insert(text, doc_id=None) |
Sync insert document |
ainsert(text, doc_id=None, show_progress=True) |
Async insert document (with progress bar) |
query(question, mode="mix", top_k=10, stream=False) |
Sync query |
aquery(question, mode="mix", top_k=10, stream=False) |
Async query |
local_search(query, top_k) |
Search from entities → related relations |
global_search(query, top_k) |
Search from relations → related entities |
mix_search(query, top_k) |
Search entities + relations + text chunks |
naive_search(query, top_k) |
Search text chunks only |
has_data() |
Check if graph has data |
get_stats() |
Get graph statistics |
list_entities() |
List all entities |
list_relations() |
List all relations |
clear() |
Clear all data |
How It Works
Insert Pipeline:
Document → Chunking → LLM Entity Extraction → Batch Embedding → Storage
Query Pipeline:
Question → Vector Search → Context Building → LLM Generation (with citations) → Answer
Use Cases
- Learning GraphRAG: Understand how knowledge graphs enhance RAG
- Prototyping: Quickly validate GraphRAG for your domain
- Research: Baseline for comparing retrieval strategies
- Education: Teaching material for RAG concepts
Community & Support
- GitHub Issues: Submit an issue
- WeChat: Add
xuming624with note "llm" to join the LLM tech wechat group
License
Apache License 2.0
Citation
@software{graphrag-lite,
author = {Xu Ming},
title = {GraphRAG-Lite: Lightweight GraphRAG Implementation},
year = {2026},
url = {https://github.com/shibing624/graphrag-lite}
}
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file graphrag_lite-0.1.3.tar.gz.
File metadata
- Download URL: graphrag_lite-0.1.3.tar.gz
- Upload date:
- Size: 23.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.5
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
f88a3acd1b678ae5081c64238c09cf56ab713a06085cb8259fe218553da5adde
|
|
| MD5 |
028813db27435847353c219b71a1b3a9
|
|
| BLAKE2b-256 |
a905847004417f3dbf8859c91ba614e009059857a6292cd4a2ada529cec22f5c
|
File details
Details for the file graphrag_lite-0.1.3-py3-none-any.whl.
File metadata
- Download URL: graphrag_lite-0.1.3-py3-none-any.whl
- Upload date:
- Size: 18.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.5
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
393b12d8659b42240736a5e17dd013d55781339f0c169c906ef02864120d5233
|
|
| MD5 |
eb9da6ecd244f860f4e9885dd5749681
|
|
| BLAKE2b-256 |
434389227a49d5e286cf71bbe6e0aea4633248ee38534b7abf0104d4ed39b7de
|