
GraphRAG SDK

The most accurate Graph RAG framework. Built on FalkorDB.

Python 3.10+ · License: Apache 2.0 · Version: 1.0.0 · Tests: 582 passing

GraphRAG SDK builds knowledge graphs from documents and answers questions over them using retrieval-augmented generation. Every algorithmic concern (chunking, extraction, resolution, retrieval, reranking) is a swappable strategy behind an abstract interface. The default pipeline scores ~85% accuracy on a 100-question benchmark using GPT-4.1.

Quick Start

import asyncio
from graphrag_sdk import GraphRAG, ConnectionConfig, LiteLLM, LiteLLMEmbedder

async def main():
    async with GraphRAG(
        connection=ConnectionConfig(host="localhost", graph_name="my_graph"),
        llm=LiteLLM(model="openai/gpt-4o"),
        embedder=LiteLLMEmbedder(model="openai/text-embedding-3-small"),
    ) as rag:
        result = await rag.ingest("my_document.txt")
        print(f"Created {result.nodes_created} nodes, {result.relationships_created} edges")

        answer = await rag.completion("What is the main theme?")
        print(answer.answer)

asyncio.run(main())

Installation

pip install graphrag-sdk[litellm]       # OpenAI, Azure, Anthropic, 100+ models
pip install graphrag-sdk[openrouter]    # OpenRouter models
pip install graphrag-sdk[pdf]           # PDF ingestion
pip install graphrag-sdk[all]           # Everything

Prerequisites

  • Python >= 3.10
  • FalkorDB: docker run -p 6379:6379 falkordb/falkordb
  • An LLM API key (OpenAI, Azure OpenAI, OpenRouter, etc.), set as an environment variable as sketched below
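
Hosted providers read their credentials from environment variables. A minimal sketch, assuming the LiteLLM convention of reading OPENAI_API_KEY for the openai/ models used in the examples below (Azure, Anthropic, and OpenRouter each use their own variable names):

import os

# Assumes LiteLLM's convention of reading OPENAI_API_KEY from the environment
# for openai/ models; substitute your provider's own variable if needed.
os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder; never hard-code real keys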

Usage

Ingest & Query

import asyncio
from graphrag_sdk import GraphRAG, ConnectionConfig, LiteLLM, LiteLLMEmbedder

async def main():
    async with GraphRAG(
        connection=ConnectionConfig(host="localhost", graph_name="my_graph"),
        llm=LiteLLM(model="openai/gpt-4o"),
        embedder=LiteLLMEmbedder(model="openai/text-embedding-3-small"),
    ) as rag:
        await rag.ingest("report.pdf")                              # PDF
        await rag.ingest("source_id", text="Alice works at Acme.")  # Raw text
        await rag.finalize()                                         # Dedup + index

        # Retrieve context only
        context = await rag.retrieve("Where does Alice work?")

        # Full RAG: retrieve + generate answer
        result = await rag.completion("Where does Alice work?")
        print(result.answer)

asyncio.run(main())

Multi-Turn Conversations

completion() supports multi-turn conversations. With the built-in providers (LiteLLM, OpenRouterLLM), messages are passed natively to the LLM's chat API. Custom providers that only implement invoke() get automatic fallback via message concatenation.

from graphrag_sdk import ChatMessage

answer = await rag.completion(
    "What happened next?",
    history=[
        ChatMessage(role="user", content="Who is Alice?"),
        ChatMessage(role="assistant", content="Alice is an engineer at Acme Corp."),
    ],
)

Supported roles: "system", "user", "assistant". Invalid roles raise ValueError.
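
For example, an unsupported role is rejected (a minimal sketch; whether validation happens when the ChatMessage is constructed or when completion() is called is an assumption):

from graphrag_sdk import ChatMessage

# "tool" is not among the supported roles, so a ValueError is raised.
# Validation is shown at construction time here; the exact point is an assumption.
try:
    ChatMessage(role="tool", content="lookup result")
except ValueError as exc:
    print(f"Rejected: {exc}")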

Schema Definition

from graphrag_sdk import GraphSchema, EntityType, RelationType

schema = GraphSchema(
    entities=[
        EntityType(label="Person", description="A human being"),
        EntityType(label="Organization", description="A company or institution"),
    ],
    relations=[
        RelationType(
            label="WORKS_AT",
            description="Is employed by",
            patterns=[("Person", "Organization")],
        ),
    ],
)

rag = GraphRAG(connection=conn, llm=llm, embedder=embedder, schema=schema)  # conn, llm, embedder from above

Strategy Customization

Override any pipeline step by passing a strategy:

from graphrag_sdk.ingestion.chunking_strategies.fixed_size import FixedSizeChunking
from graphrag_sdk import GraphExtraction, LLMExtractor
from graphrag_sdk.ingestion.resolution_strategies import SemanticResolution

# Custom chunking
await rag.ingest("doc.txt", chunker=FixedSizeChunking(chunk_size=1500, chunk_overlap=200))

# LLM-based entity extraction instead of GLiNER
await rag.ingest("doc.txt", extractor=GraphExtraction(llm=llm, entity_extractor=LLMExtractor(llm)))

Strategy Reference

Every algorithmic concern is a swappable strategy behind an abstract base class:

| Concern | ABC | Built-in Options | Default |
| --- | --- | --- | --- |
| Loading | LoaderStrategy | TextLoader, PdfLoader | Auto-detect by extension |
| Chunking | ChunkingStrategy | FixedSizeChunking, SentenceTokenCapChunking, ContextualChunking, CallableChunking | FixedSizeChunking |
| Extraction | ExtractionStrategy | GraphExtraction (GLiNER2 + LLM) | GraphExtraction |
| Resolution | ResolutionStrategy | ExactMatchResolution, DescriptionMergeResolution, SemanticResolution, LLMVerifiedResolution | ExactMatch |
| Retrieval | RetrievalStrategy | LocalRetrieval, MultiPathRetrieval | MultiPath (5-path) |
| Reranking | RerankingStrategy | CosineReranker | Cosine |
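
Query-time strategies can presumably be swapped the same way as the ingestion-time ones shown under Strategy Customization. A hedged sketch, assuming retrieve() accepts a retriever= keyword mirroring chunker=/extractor=, and assuming LocalRetrieval is importable from graphrag_sdk.retrieval (the class name comes from the table; the keyword and import path are guesses, so check examples/03_custom_strategies.py for the actual wiring):

# Hypothetical sketch: the retriever= keyword and the import path are assumptions,
# mirroring the chunker=/extractor= pattern shown under Strategy Customization.
from graphrag_sdk.retrieval import LocalRetrieval  # import path assumed

context = await rag.retrieve("Where does Alice work?", retriever=LocalRetrieval())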

LLM & Embedding Providers

| Provider | LLM Class | Embedder Class | Models |
| --- | --- | --- | --- |
| LiteLLM | LiteLLM | LiteLLMEmbedder | OpenAI, Azure, Anthropic, Cohere, 100+ |
| OpenRouter | OpenRouterLLM | OpenRouterEmbedder | All OpenRouter models |
| Custom | Subclass LLMInterface | Subclass Embedder | Anything |
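
The Custom row above works by subclassing the two base classes. A hedged sketch of a minimal custom provider, assuming LLMInterface exposes the invoke() method mentioned under Multi-Turn Conversations and that Embedder exposes an embed() method (both signatures and the fixed 1536-dimension output are guesses; examples/04_custom_provider.py shows the real interface):

from graphrag_sdk import LLMInterface, Embedder  # base classes named in the table above

class EchoLLM(LLMInterface):
    # invoke() is referenced under "Multi-Turn Conversations"; its exact
    # signature (prompt in, text out) is an assumption.
    async def invoke(self, prompt: str) -> str:
        return f"echo: {prompt}"

class ZeroEmbedder(Embedder):
    # The embed() name, signature, and fixed 1536-dim vectors are assumptions.
    async def embed(self, texts: list[str]) -> list[list[float]]:
        return [[0.0] * 1536 for _ in texts]

rag = GraphRAG(connection=conn, llm=EchoLLM(), embedder=ZeroEmbedder())  # conn from above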

Benchmark

#1 on GraphRAG-Bench Novel — 63.73 ACC, ahead of MS-GraphRAG (50.93) and LightRAG (45.09).

| Metric | Value |
| --- | --- |
| Novel ACC | 63.73 (#1) |
| Fact retrieval | 65.22 |
| Complex reasoning | 58.63 |
| Contextual summarization | 69.54 |
| Creative generation | 57.08 |
| Questions | 2,010 across 20 novels |

See docs/benchmark.md for methodology and reproduction.

Examples

| # | Example | Description |
| --- | --- | --- |
| 1 | 01_quickstart.py | Minimal ingest & query |
| 2 | 02_pdf_with_schema.py | PDF with custom schema |
| 3 | 03_custom_strategies.py | Benchmark-winning pipeline |
| 4 | 04_custom_provider.py | Custom LLM/Embedder |
| 5 | 05_notebook_demo.ipynb | Interactive notebook walkthrough |

Documentation

License

Apache License 2.0
