A minimal GraphRAG implementation in ~600 lines of Python code

GraphRAG-Lite

Minimal GraphRAG implementation in ~500 lines of Python code.

Python 3.10+ | License: Apache 2.0

Chinese documentation

GraphRAG-Lite is a clean, educational implementation of GraphRAG (Graph-based Retrieval-Augmented Generation). It is intended for learning the core principles of knowledge-graph-enhanced RAG systems.

Why GraphRAG-Lite?

  • Learn by Reading: Clean, well-documented code you can understand in an afternoon
  • Production Patterns: Real-world optimizations like batch embeddings and LLM caching
  • Flexible Retrieval: 4 query modes for different use cases
  • Minimal Dependencies: Just openai, numpy, tiktoken, and loguru

Features

| Feature | Description |
|---------|-------------|
| 4 Query Modes | local, global, mix, naive: choose the right strategy |
| Batch Embeddings | Reduce API calls with intelligent batching |
| LLM Caching | Avoid redundant LLM requests |
| Streaming Output | Real-time response streaming |
| NumPy Acceleration | Fast vector similarity search |
| Persistent Storage | JSON-based storage, no external database needed |
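
The batching and caching features can be illustrated with a small standalone sketch. This is not GraphRAG-Lite's internal code: `batched` and `PromptCache` are hypothetical helpers, and the cache key (a SHA-256 hash of model name plus prompt) is an assumption about one reasonable way to detect redundant requests.

```python
import hashlib
import json
from typing import Callable, Iterator, Sequence


def batched(items: Sequence[str], batch_size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `batch_size` items.

    Sending one embedding request per slice instead of one per text
    is what reduces the number of API calls.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


class PromptCache:
    """In-memory cache keyed by a hash of (model, prompt). Hypothetical helper."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}
        self.misses = 0  # number of times the underlying call was made

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_call(self, model: str, prompt: str, call: Callable[[str], str]) -> str:
        """Return the cached response, invoking `call(prompt)` only on a miss."""
        key = self._key(model, prompt)
        if key not in self._store:
            self.misses += 1
            self._store[key] = call(prompt)
        return self._store[key]
```

Because the key includes the model name, switching models does not return a response generated by a different model.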

Installation

pip install graphrag-lite

Or install from source:

git clone https://github.com/shibing624/graphrag-lite.git
cd graphrag-lite
pip install -e .

Quick Start

import os
from graphrag_lite import GraphRAGLite

# Initialize
graph = GraphRAGLite(
    storage_path="./my_graph",
    api_key=os.getenv("OPENAI_API_KEY"),
    base_url=os.getenv("OPENAI_BASE_URL"),  # Optional: for compatible APIs
)

# Insert documents
graph.insert("""
Charles Dickens wrote "A Christmas Carol" in 1843.
The story features Ebenezer Scrooge, a miserly old man,
and the ghost of his former business partner Jacob Marley.
""")

# Query with knowledge graph context
answer = graph.query("What is the relationship between Scrooge and Marley?")
print(answer)

Query Modes

| Mode | Strategy | Best For |
|------|----------|----------|
| local | Entity → Related relations | "Who is X?" questions |
| global | Relation → Related entities | "How are X and Y related?" |
| mix | Entity + Relation + Chunks | General purpose (recommended) |
| naive | Text chunks only | Baseline comparison |

# Choose the right mode for your question
answer = graph.query("Who is Scrooge?", mode="local")
answer = graph.query("How are Scrooge and Marley connected?", mode="global")
answer = graph.query("Tell me about the story", mode="mix")      # Recommended
answer = graph.query("What happened?", mode="naive")
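
When it is unclear which mode suits a question, it can help to run the same question through all four and compare the answers side by side. The helper below is a minimal sketch: `compare_modes` is a hypothetical name, and it relies only on the `query(question, mode=..., top_k=...)` signature listed in the API reference.

```python
from typing import Any

MODES = ("local", "global", "mix", "naive")


def compare_modes(graph: Any, question: str, top_k: int = 10) -> dict[str, str]:
    """Ask the same question once per mode and return {mode: answer}.

    `graph` is expected to expose query(question, mode=..., top_k=...),
    as GraphRAGLite does according to its API reference.
    """
    return {mode: graph.query(question, mode=mode, top_k=top_k) for mode in MODES}
```

Note that each mode issues its own LLM request, so this costs four calls per question.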

Streaming Output

for chunk in graph.query("Who is Scrooge?", stream=True):
    print(chunk, end="", flush=True)
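
A streamed answer is consumed as it is printed, so keeping the full text requires collecting the chunks along the way. The sketch below assumes each streamed chunk is a plain string, which is consistent with the `print(chunk, end="")` loop above; `print_and_collect` is a hypothetical helper, not part of the library.

```python
import sys
from typing import Iterable, TextIO


def print_and_collect(chunks: Iterable[str], out: TextIO = sys.stdout) -> str:
    """Write each chunk to `out` as it arrives and return the full text."""
    parts = []
    for chunk in chunks:
        out.write(chunk)
        out.flush()  # show partial output immediately
        parts.append(chunk)
    return "".join(parts)
```

Usage would look like `answer = print_and_collect(graph.query("Who is Scrooge?", stream=True))`.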

API Reference

GraphRAGLite

GraphRAGLite(
    storage_path: str = "./graphrag_data",  # Data storage directory
    api_key: str = None,                     # OpenAI API key
    base_url: str = None,                    # OpenAI-compatible API base URL
    model: str = "gpt-4o-mini",              # LLM model
    embedding_model: str = "text-embedding-3-small",  # Embedding model
    enable_cache: bool = True,               # Enable LLM response caching
)

Methods

| Method | Description |
|--------|-------------|
| insert(text, doc_id=None) | Insert document and build knowledge graph |
| query(question, mode="mix", top_k=10, stream=False) | Query the knowledge graph |
| has_data() | Check if graph has data |
| get_stats() | Get graph statistics |
| list_entities() | List all entities |
| list_relations() | List all relations |
| clear() | Clear all data |
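
Since storage is persistent, a script that is run repeatedly may want to skip insertion when the graph already holds data. The following sketch combines `has_data()` and `insert()` for that purpose. It assumes `has_data()` returns a truthy value when the graph is non-empty; `ensure_indexed` is a hypothetical helper.

```python
from typing import Any, Optional


def ensure_indexed(graph: Any, text: str, doc_id: Optional[str] = None) -> bool:
    """Insert `text` only if the graph is empty.

    Returns True if an insert was performed, False if it was skipped.
    Assumes graph.has_data() is truthy for a non-empty graph.
    """
    if graph.has_data():
        return False
    graph.insert(text, doc_id=doc_id)
    return True
```

This avoids paying for entity extraction and embedding again on every run.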

How It Works

Insert Pipeline:

Document → Chunking → LLM Entity Extraction → Batch Embedding → Storage
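
The stages of this pipeline can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: chunking here splits on whitespace rather than tokens, the `extract` and `embed_batch` callables stand in for the LLM and embedding API calls, and the JSON layout is invented for the example.

```python
import json
from pathlib import Path
from typing import Callable


def chunk_words(text: str, size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into windows of `size` words that overlap by `overlap` words."""
    if overlap < 0 or size <= overlap:
        raise ValueError("require 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def insert_document(
    text: str,
    extract: Callable[[str], list[tuple[str, str, str]]],    # stand-in for the LLM
    embed_batch: Callable[[list[str]], list[list[float]]],   # stand-in for the embedding API
    storage_file: Path,
    size: int = 200,
    overlap: int = 20,
) -> dict:
    """Chunk, extract (source, relation, target) triples, embed, and save as JSON."""
    chunks = chunk_words(text, size, overlap)
    triples = [t for chunk in chunks for t in extract(chunk)]
    vectors = embed_batch(chunks) if chunks else []  # one batched call for all chunks
    record = {
        "chunks": chunks,
        "relations": [list(t) for t in triples],
        "embeddings": vectors,
    }
    storage_file.write_text(json.dumps(record, ensure_ascii=False), encoding="utf-8")
    return record
```

Passing the extraction and embedding steps in as callables keeps the sketch runnable offline and makes each stage easy to swap or test in isolation.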

Query Pipeline:

Question → Vector Search → Context Building → LLM Generation → Answer
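
The vector search and context building steps can be sketched with NumPy, which is one of the project's listed dependencies. The function names, the cosine-similarity scoring, and the prompt wording below are assumptions made for illustration; the library's actual retrieval and prompts may differ.

```python
import numpy as np


def top_k_cosine(query_vec: np.ndarray, matrix: np.ndarray, k: int) -> list[int]:
    """Return indices of the rows of `matrix` most similar to `query_vec`, best first."""
    # Normalize both sides so the dot product equals cosine similarity.
    q = query_vec / (np.linalg.norm(query_vec) + 1e-12)
    m = matrix / (np.linalg.norm(matrix, axis=1, keepdims=True) + 1e-12)
    scores = m @ q
    k = min(k, len(scores))
    return np.argsort(-scores)[:k].tolist()


def build_context(chunks: list[str], indices: list[int]) -> str:
    """Join the selected chunks into a numbered context block."""
    return "\n\n".join(
        f"[{rank}] {chunks[i]}" for rank, i in enumerate(indices, start=1)
    )


def build_prompt(question: str, context: str) -> str:
    """Assemble the text that would be sent to the LLM (hypothetical wording)."""
    return (
        "Answer the question using only the context.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

Computing all similarities with a single matrix-vector product is what makes the search fast compared with a Python loop over stored vectors.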

Use Cases

  • Learning GraphRAG: Understand how knowledge graphs enhance RAG
  • Prototyping: Quickly validate GraphRAG for your domain
  • Research: Baseline for comparing retrieval strategies
  • Education: Teaching material for RAG concepts

Community & Support

  • GitHub Issues: Submit an issue
  • WeChat: Add xuming624 with the note "llm" to join the LLM tech WeChat group

License

Apache License 2.0

Citation

@software{graphrag-lite,
  author = {Xu Ming},
  title = {GraphRAG-Lite: Minimal GraphRAG Implementation},
  year = {2025},
  url = {https://github.com/shibing624/graphrag-lite}
}
