GraphRAG SDK
The most accurate Graph RAG framework. Build knowledge graphs and query them with natural language. Built on FalkorDB.
GraphRAG SDK builds knowledge graphs from documents and answers questions over them using retrieval-augmented generation. Every algorithmic concern (chunking, extraction, resolution, retrieval, reranking) is a swappable strategy behind an abstract interface. The default pipeline scores ~85% accuracy on a 100-question benchmark using GPT-4.1.
Quick Start
```python
import asyncio

from graphrag_sdk import GraphRAG, ConnectionConfig, LiteLLM, LiteLLMEmbedder


async def main():
    async with GraphRAG(
        connection=ConnectionConfig(host="localhost", graph_name="my_graph"),
        llm=LiteLLM(model="openai/gpt-4o"),
        embedder=LiteLLMEmbedder(model="openai/text-embedding-3-small"),
    ) as rag:
        result = await rag.ingest("my_document.txt")
        print(f"Created {result.nodes_created} nodes, {result.relationships_created} edges")

        answer = await rag.completion("What is the main theme?")
        print(answer.answer)


asyncio.run(main())
```
Installation
```bash
pip install graphrag-sdk[litellm]     # OpenAI, Azure, Anthropic, 100+ models
pip install graphrag-sdk[openrouter]  # OpenRouter models
pip install graphrag-sdk[pdf]         # PDF ingestion
pip install graphrag-sdk[all]         # Everything
```
Prerequisites
- Python >= 3.10
- FalkorDB, e.g. via Docker:

  ```bash
  docker run -p 6379:6379 falkordb/falkordb
  ```

- An LLM API key (OpenAI, Azure OpenAI, OpenRouter, etc.)
Usage
Ingest & Query
```python
import asyncio

from graphrag_sdk import GraphRAG, ConnectionConfig, LiteLLM, LiteLLMEmbedder


async def main():
    async with GraphRAG(
        connection=ConnectionConfig(host="localhost", graph_name="my_graph"),
        llm=LiteLLM(model="openai/gpt-4o"),
        embedder=LiteLLMEmbedder(model="openai/text-embedding-3-small"),
    ) as rag:
        await rag.ingest("report.pdf")                              # PDF
        await rag.ingest("source_id", text="Alice works at Acme.")  # Raw text
        await rag.finalize()                                        # Dedup + index

        # Retrieve context only
        context = await rag.retrieve("Where does Alice work?")

        # Full RAG: retrieve + generate answer
        result = await rag.completion("Where does Alice work?")
        print(result.answer)


asyncio.run(main())
```
Multi-Turn Conversations
completion() supports multi-turn conversations. With the built-in providers (LiteLLM, OpenRouterLLM), messages are passed natively to the LLM's chat API. Custom providers that only implement invoke() get automatic fallback via message concatenation.
```python
from graphrag_sdk import ChatMessage

answer = await rag.completion(
    "What happened next?",
    history=[
        ChatMessage(role="user", content="Who is Alice?"),
        ChatMessage(role="assistant", content="Alice is an engineer at Acme Corp."),
    ],
)
```
Supported roles: "system", "user", "assistant". Invalid roles raise ValueError.
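For example, a system message can be included in the same history list; the instruction text below is illustrative.

```python
from graphrag_sdk import ChatMessage

answer = await rag.completion(
    "Summarize what we know so far.",
    history=[
        ChatMessage(role="system", content="Answer concisely and name graph entities explicitly."),
        ChatMessage(role="user", content="Who is Alice?"),
        ChatMessage(role="assistant", content="Alice is an engineer at Acme Corp."),
    ],
)
# Any role outside "system" / "user" / "assistant" raises ValueError, per the note above.
```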
Schema Definition
```python
from graphrag_sdk import GraphSchema, EntityType, RelationType

schema = GraphSchema(
    entities=[
        EntityType(label="Person", description="A human being"),
        EntityType(label="Organization", description="A company or institution"),
    ],
    relations=[
        RelationType(
            label="WORKS_AT",
            description="Is employed by",
            patterns=[("Person", "Organization")],
        ),
    ],
)

rag = GraphRAG(connection=conn, llm=llm, embedder=embedder, schema=schema)  # conn, llm, embedder from above
```
Strategy Customization
Override any pipeline step by passing a strategy:
```python
from graphrag_sdk import GraphExtraction, LLMExtractor
from graphrag_sdk.ingestion.chunking_strategies.fixed_size import FixedSizeChunking
from graphrag_sdk.ingestion.resolution_strategies import SemanticResolution

# Custom chunking
await rag.ingest("doc.txt", chunker=FixedSizeChunking(chunk_size=1500, chunk_overlap=200))

# LLM-based entity extraction instead of GLiNER
await rag.ingest("doc.txt", extractor=GraphExtraction(llm=llm, entity_extractor=LLMExtractor(llm)))
```
Strategy Reference
Every algorithmic concern is a swappable strategy behind an abstract base class:
| Concern | ABC | Built-in Options | Default |
|---|---|---|---|
| Loading | LoaderStrategy | TextLoader, PdfLoader | Auto-detect by extension |
| Chunking | ChunkingStrategy | FixedSizeChunking, SentenceTokenCapChunking, ContextualChunking, CallableChunking | FixedSizeChunking |
| Extraction | ExtractionStrategy | GraphExtraction (GLiNER2 + LLM) | GraphExtraction |
| Resolution | ResolutionStrategy | ExactMatchResolution, DescriptionMergeResolution, SemanticResolution, LLMVerifiedResolution | ExactMatch |
| Retrieval | RetrievalStrategy | LocalRetrieval, MultiPathRetrieval | MultiPath (5-path) |
| Reranking | RerankingStrategy | CosineReranker | Cosine |
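Any row above can be swapped for your own implementation by subclassing the ABC and passing the instance into the pipeline. The sketch below is illustrative only: the import path, method name, and signature of ChunkingStrategy are assumptions (the real interface may be async or require additional methods); see the Strategies docs for the actual contract.

```python
from graphrag_sdk.ingestion.chunking_strategies import ChunkingStrategy  # import path assumed


class ParagraphChunking(ChunkingStrategy):
    """Toy strategy: split a document on blank lines."""

    def chunk(self, text: str) -> list[str]:  # method name/signature assumed, not documented here
        return [p.strip() for p in text.split("\n\n") if p.strip()]


# Hypothetical usage, mirroring the chunker kwarg shown under Strategy Customization:
# await rag.ingest("doc.txt", chunker=ParagraphChunking())
```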
LLM & Embedding Providers
| Provider | LLM Class | Embedder Class | Models |
|---|---|---|---|
| LiteLLM | LiteLLM | LiteLLMEmbedder | OpenAI, Azure, Anthropic, Cohere, 100+ |
| OpenRouter | OpenRouterLLM | OpenRouterEmbedder | All OpenRouter models |
| Custom | Subclass LLMInterface | Subclass Embedder | Anything |
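For providers not covered by LiteLLM or OpenRouter, subclass LLMInterface; per the multi-turn note above, a provider that implements only invoke() still works for chat via message concatenation. The sketch below assumes invoke() is async and takes a single prompt string, and that LLMInterface is importable from the package root; the ABC may require additional methods, so see 04_custom_provider.py for the real contract.

```python
from graphrag_sdk import LLMInterface  # import path assumed


class CallableLLM(LLMInterface):
    """Toy provider: delegates to any function mapping a prompt string to a completion string."""

    def __init__(self, generate):
        self.generate = generate  # e.g. a call into a local model server

    async def invoke(self, prompt: str) -> str:  # invoke() is named in the docs; exact signature assumed
        return self.generate(prompt)


# Hypothetical usage with the conn/embedder objects defined earlier:
# rag = GraphRAG(connection=conn, llm=CallableLLM(my_model_fn), embedder=embedder)
```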
Benchmark
#1 on GraphRAG-Bench Novel — 63.73 ACC, ahead of MS-GraphRAG (50.93) and LightRAG (45.09).
| Metric | Value |
|---|---|
| Novel ACC | 63.73 (#1) |
| Fact retrieval | 65.22 |
| Complex reasoning | 58.63 |
| Contextual summarization | 69.54 |
| Creative generation | 57.08 |
| Questions | 2,010 across 20 novels |
See docs/benchmark.md for methodology and reproduction.
Examples
| # | Example | Description |
|---|---|---|
| 1 | 01_quickstart.py | Minimal ingest & query |
| 2 | 02_pdf_with_schema.py | PDF with custom schema |
| 3 | 03_custom_strategies.py | Benchmark-winning pipeline |
| 4 | 04_custom_provider.py | Custom LLM/Embedder |
| 5 | 05_notebook_demo.ipynb | Interactive notebook walkthrough |
Documentation
- Getting Started -- Install to first query
- Architecture -- Pipeline design and graph schema
- Configuration -- Connection and provider reference
- Strategies -- All ABCs and built-in implementations
- Providers -- LLM & embedder configuration
- Benchmark -- Methodology and reproduction
- API Reference -- Full API documentation
License
File details
Details for the file graphrag_sdk-1.0.1.tar.gz.
File metadata
- Download URL: graphrag_sdk-1.0.1.tar.gz
- Upload date:
- Size: 158.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1702a1059b285dfdc9cbf97eeeb786bae69e84230ee47c942c123fe95f6adc9f |
| MD5 | 6024cf48e1bc27c42fe1b20675191a9e |
| BLAKE2b-256 | f6feeff9c4c56605cc9aa45283da2af7599f5f16d0aec2d8b642348afad146df |
Provenance
The following attestation bundles were made for graphrag_sdk-1.0.1.tar.gz:
Publisher: pypi-publish.yaml on FalkorDB/GraphRAG-SDK

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: graphrag_sdk-1.0.1.tar.gz
- Subject digest: 1702a1059b285dfdc9cbf97eeeb786bae69e84230ee47c942c123fe95f6adc9f
- Sigstore transparency entry: 1399465087
- Sigstore integration time:
- Permalink: FalkorDB/GraphRAG-SDK@6f9b706af00545d96773b01a9af92d36c3784496
- Branch / Tag: refs/tags/v1.0.1
- Owner: https://github.com/FalkorDB
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-publish.yaml@6f9b706af00545d96773b01a9af92d36c3784496
- Trigger Event: release
File details
Details for the file graphrag_sdk-1.0.1-py3-none-any.whl.
File metadata
- Download URL: graphrag_sdk-1.0.1-py3-none-any.whl
- Upload date:
- Size: 130.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 957d69fa3145bfcec7d8170bcb036fa14435131aa2bb28d8fe5a419088d21ff5 |
| MD5 | 0bc91941989b9c068c23021fe4ae3aec |
| BLAKE2b-256 | 12dcaf4baa428af077c1a258d28dfc05bfb324c303931e083ea47d1b2d90e2b9 |
Provenance
The following attestation bundles were made for graphrag_sdk-1.0.1-py3-none-any.whl:
Publisher: pypi-publish.yaml on FalkorDB/GraphRAG-SDK

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: graphrag_sdk-1.0.1-py3-none-any.whl
- Subject digest: 957d69fa3145bfcec7d8170bcb036fa14435131aa2bb28d8fe5a419088d21ff5
- Sigstore transparency entry: 1399465098
- Sigstore integration time:
- Permalink: FalkorDB/GraphRAG-SDK@6f9b706af00545d96773b01a9af92d36c3784496
- Branch / Tag: refs/tags/v1.0.1
- Owner: https://github.com/FalkorDB
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-publish.yaml@6f9b706af00545d96773b01a9af92d36c3784496
- Trigger Event: release