
GraphRAG SDK

The most accurate Graph RAG framework. Built on FalkorDB.

Python 3.10+ License: Apache 2.0 Version: 1.0.1 Tests: 582 passing

GraphRAG SDK builds knowledge graphs from documents and answers questions over them using retrieval-augmented generation. Every algorithmic concern (chunking, extraction, resolution, retrieval, reranking) is a swappable strategy behind an abstract interface. The default pipeline scores ~85% accuracy on a 100-question benchmark using GPT-4.1.
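The "swappable strategy behind an abstract interface" design is the classic strategy pattern. A minimal, SDK-independent sketch of the idea (class and function names here are illustrative, not the SDK's own):

```python
from abc import ABC, abstractmethod


class ChunkingStrategy(ABC):
    """Abstract interface: any chunker maps one text to a list of chunks."""

    @abstractmethod
    def chunk(self, text: str) -> list[str]: ...


class ParagraphChunking(ChunkingStrategy):
    """Illustrative concrete strategy: split on blank lines."""

    def chunk(self, text: str) -> list[str]:
        return [p.strip() for p in text.split("\n\n") if p.strip()]


def ingest(text: str, chunker: ChunkingStrategy) -> list[str]:
    # The pipeline depends only on the abstract interface, so any
    # strategy can be swapped in without touching pipeline code.
    return chunker.chunk(text)


print(ingest("First paragraph.\n\nSecond paragraph.", ParagraphChunking()))
```

Because every stage is coded against an interface like this, replacing (say) the chunker never requires changes to the extraction or retrieval stages.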

Quick Start

import asyncio
from graphrag_sdk import GraphRAG, ConnectionConfig, LiteLLM, LiteLLMEmbedder

async def main():
    async with GraphRAG(
        connection=ConnectionConfig(host="localhost", graph_name="my_graph"),
        llm=LiteLLM(model="openai/gpt-4o"),
        embedder=LiteLLMEmbedder(model="openai/text-embedding-3-small"),
    ) as rag:
        result = await rag.ingest("my_document.txt")
        print(f"Created {result.nodes_created} nodes, {result.relationships_created} edges")

        answer = await rag.completion("What is the main theme?")
        print(answer.answer)

asyncio.run(main())

Installation

pip install graphrag-sdk[litellm]       # OpenAI, Azure, Anthropic, 100+ models
pip install graphrag-sdk[openrouter]    # OpenRouter models
pip install graphrag-sdk[pdf]           # PDF ingestion
pip install graphrag-sdk[all]           # Everything

Prerequisites

  • Python >= 3.10
  • FalkorDB: docker run -p 6379:6379 falkordb/falkordb
  • An LLM API key (OpenAI, Azure OpenAI, OpenRouter, etc.)

Usage

Ingest & Query

import asyncio
from graphrag_sdk import GraphRAG, ConnectionConfig, LiteLLM, LiteLLMEmbedder

async def main():
    async with GraphRAG(
        connection=ConnectionConfig(host="localhost", graph_name="my_graph"),
        llm=LiteLLM(model="openai/gpt-4o"),
        embedder=LiteLLMEmbedder(model="openai/text-embedding-3-small"),
    ) as rag:
        await rag.ingest("report.pdf")                              # PDF
        await rag.ingest("source_id", text="Alice works at Acme.")  # Raw text
        await rag.finalize()                                         # Dedup + index

        # Retrieve context only
        context = await rag.retrieve("Where does Alice work?")

        # Full RAG: retrieve + generate answer
        result = await rag.completion("Where does Alice work?")
        print(result.answer)

asyncio.run(main())

Multi-Turn Conversations

completion() supports multi-turn conversations. With the built-in providers (LiteLLM, OpenRouterLLM), messages are passed natively to the LLM's chat API. Custom providers that only implement invoke() get automatic fallback via message concatenation.

from graphrag_sdk import ChatMessage

answer = await rag.completion(
    "What happened next?",
    history=[
        ChatMessage(role="user", content="Who is Alice?"),
        ChatMessage(role="assistant", content="Alice is an engineer at Acme Corp."),
    ],
)

Supported roles: "system", "user", "assistant". Invalid roles raise ValueError.
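The concatenation fallback for invoke()-only providers can be pictured as folding the history into a single prompt string. A sketch of the idea, not the SDK's actual implementation:

```python
def flatten_history(messages: list[dict[str, str]], question: str) -> str:
    """Fold a chat history into one prompt string for providers that
    only accept a single text input (no native chat API)."""
    lines = [f"{m['role']}: {m['content']}" for m in messages]
    lines.append(f"user: {question}")
    return "\n".join(lines)


prompt = flatten_history(
    [
        {"role": "user", "content": "Who is Alice?"},
        {"role": "assistant", "content": "Alice is an engineer at Acme Corp."},
    ],
    "What happened next?",
)
print(prompt)
```

Native chat APIs are preferable when available, since role structure survives intact instead of being encoded into prose.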

Schema Definition

from graphrag_sdk import GraphSchema, EntityType, RelationType

schema = GraphSchema(
    entities=[
        EntityType(label="Person", description="A human being"),
        EntityType(label="Organization", description="A company or institution"),
    ],
    relations=[
        RelationType(
            label="WORKS_AT",
            description="Is employed by",
            patterns=[("Person", "Organization")],
        ),
    ],
)

rag = GraphRAG(connection=conn, llm=llm, embedder=embedder, schema=schema)  # conn, llm, embedder from above
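One way to read `patterns`: it whitelists which (source, target) label pairs a relation may connect, so extraction can discard triples that violate the schema. A pure-Python sketch of that filtering step (illustrative only; the SDK's internals may differ):

```python
# Allowed (source_label, target_label) pairs per relation, as declared
# in the schema above.
ALLOWED = {"WORKS_AT": {("Person", "Organization")}}


def is_valid_triple(source_label: str, relation: str, target_label: str) -> bool:
    """Keep only extracted triples whose endpoint labels match a declared pattern."""
    return (source_label, target_label) in ALLOWED.get(relation, set())


print(is_valid_triple("Person", "WORKS_AT", "Organization"))  # True
print(is_valid_triple("Organization", "WORKS_AT", "Person"))  # False: direction matters
```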

Strategy Customization

Override any pipeline step by passing a strategy:

from graphrag_sdk.ingestion.chunking_strategies.fixed_size import FixedSizeChunking
from graphrag_sdk import GraphExtraction, LLMExtractor

# Custom chunking
await rag.ingest("doc.txt", chunker=FixedSizeChunking(chunk_size=1500, chunk_overlap=200))

# LLM-based entity extraction instead of GLiNER
await rag.ingest("doc.txt", extractor=GraphExtraction(llm=llm, entity_extractor=LLMExtractor(llm)))
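Fixed-size chunking with overlap is conceptually simple: slide a window and step it forward by chunk_size minus overlap, so neighboring chunks share context. A character-based sketch (the SDK's FixedSizeChunking may count tokens rather than characters):

```python
def fixed_size_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Slide a window of `chunk_size` characters, advancing by
    `chunk_size - chunk_overlap` so adjacent chunks overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    step = chunk_size - chunk_overlap
    return [text[i : i + chunk_size] for i in range(0, len(text), step)]


print(fixed_size_chunks("abcdefghij", chunk_size=4, chunk_overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij', 'ij']
```

Overlap trades storage for recall: entities mentioned near a chunk boundary appear in two chunks instead of being split across one.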

Strategy Reference

Every algorithmic concern is a swappable strategy behind an abstract base class:

  • Loading (LoaderStrategy): TextLoader, PdfLoader; default: auto-detect by extension
  • Chunking (ChunkingStrategy): FixedSizeChunking, SentenceTokenCapChunking, ContextualChunking, CallableChunking; default: FixedSizeChunking
  • Extraction (ExtractionStrategy): GraphExtraction (GLiNER2 + LLM); default: GraphExtraction
  • Resolution (ResolutionStrategy): ExactMatchResolution, DescriptionMergeResolution, SemanticResolution, LLMVerifiedResolution; default: ExactMatch
  • Retrieval (RetrievalStrategy): LocalRetrieval, MultiPathRetrieval; default: MultiPath (5-path)
  • Reranking (RerankingStrategy): CosineReranker; default: Cosine
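Cosine reranking orders retrieved chunks by embedding similarity to the query vector. A minimal stdlib sketch of the math (the SDK's CosineReranker operates on real embedding vectors; the vectors below are toy values):

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product over the product of vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def rerank(query_vec: list[float], candidates: list[tuple[str, list[float]]]):
    """candidates: (chunk_text, embedding) pairs; most similar first."""
    return sorted(candidates, key=lambda c: cosine(query_vec, c[1]), reverse=True)


ranked = rerank([1.0, 0.0], [("off-topic", [0.0, 1.0]), ("on-topic", [0.9, 0.1])])
print([text for text, _ in ranked])  # ['on-topic', 'off-topic']
```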

LLM & Embedding Providers

  • LiteLLM: LiteLLM / LiteLLMEmbedder (OpenAI, Azure, Anthropic, Cohere, 100+ models)
  • OpenRouter: OpenRouterLLM / OpenRouterEmbedder (all OpenRouter models)
  • Custom: subclass LLMInterface / subclass Embedder (anything)
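A custom provider is an adapter around whatever backend you have. The stand-in below shows the shape of that adapter without importing the SDK; the real base classes are graphrag_sdk's LLMInterface and Embedder, whose exact method signatures may differ from these illustrative ones:

```python
import asyncio


class EchoLLM:
    """Illustrative stand-in for a custom LLM provider."""

    async def invoke(self, prompt: str) -> str:
        # A real provider would call its backend API here.
        return f"echo: {prompt}"


class ConstantEmbedder:
    """Illustrative stand-in for a custom embedder."""

    def embed(self, text: str) -> list[float]:
        # Toy deterministic "embedding" for demonstration only.
        return [float(len(text)), 0.0]


print(asyncio.run(EchoLLM().invoke("hello")))  # echo: hello
```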

Benchmark

#1 on GraphRAG-Bench Novel — 63.73 ACC, ahead of MS-GraphRAG (50.93) and LightRAG (45.09).

  • Novel ACC: 63.73 (#1)
  • Fact retrieval: 65.22
  • Complex reasoning: 58.63
  • Contextual summarization: 69.54
  • Creative generation: 57.08
  • Questions: 2,010 across 20 novels

See docs/benchmark.md for methodology and reproduction.

Examples

  • 01_quickstart.py: Minimal ingest & query
  • 02_pdf_with_schema.py: PDF with custom schema
  • 03_custom_strategies.py: Benchmark-winning pipeline
  • 04_custom_provider.py: Custom LLM/Embedder
  • 05_notebook_demo.ipynb: Interactive notebook walkthrough

License

Apache License 2.0
