The most accurate Graph RAG framework. Build knowledge graphs and query them with natural language. Built on FalkorDB.

GraphRAG SDK

Python 3.10+ | License: Apache 2.0 | Version: 1.0.0 | Tests: 582 passing

GraphRAG SDK builds knowledge graphs from documents and answers questions over them using retrieval-augmented generation. Every algorithmic concern (chunking, extraction, resolution, retrieval, reranking) is a swappable strategy behind an abstract interface. The default pipeline scores ~85% accuracy on a 100-question benchmark using GPT-4.1.

Quick Start

import asyncio
from graphrag_sdk import GraphRAG, ConnectionConfig, LiteLLM, LiteLLMEmbedder

async def main():
    async with GraphRAG(
        connection=ConnectionConfig(host="localhost", graph_name="my_graph"),
        llm=LiteLLM(model="openai/gpt-4o"),
        embedder=LiteLLMEmbedder(model="openai/text-embedding-3-small"),
    ) as rag:
        result = await rag.ingest("my_document.txt")
        print(f"Created {result.nodes_created} nodes, {result.relationships_created} edges")

        answer = await rag.completion("What is the main theme?")
        print(answer.answer)

asyncio.run(main())

Installation

pip install graphrag-sdk[litellm]       # OpenAI, Azure, Anthropic, 100+ models
pip install graphrag-sdk[openrouter]    # OpenRouter models
pip install graphrag-sdk[pdf]           # PDF ingestion
pip install graphrag-sdk[all]           # Everything

Prerequisites

  • Python >= 3.10
  • FalkorDB: docker run -p 6379:6379 falkordb/falkordb
  • An LLM API key (OpenAI, Azure OpenAI, OpenRouter, etc.)
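The prerequisites above can be checked before a first run. The following is a minimal sketch using only the standard library; the helper names (python_ok, installed_version, port_open) are hypothetical and not part of the SDK, and the port check only confirms that something accepts TCP connections on 6379, not that it is FalkorDB.

```python
import socket
import sys
from importlib import metadata


def python_ok(minimum=(3, 10)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


def installed_version(package="graphrag-sdk"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def port_open(host="localhost", port=6379, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("Python >= 3.10:", python_ok())
    print("graphrag-sdk version:", installed_version())
    print("Port 6379 reachable on localhost:", port_open())
```

The API key is not checked here because the environment variable name depends on the provider you choose.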

Usage

Ingest & Query

import asyncio
from graphrag_sdk import GraphRAG, ConnectionConfig, LiteLLM, LiteLLMEmbedder

async def main():
    async with GraphRAG(
        connection=ConnectionConfig(host="localhost", graph_name="my_graph"),
        llm=LiteLLM(model="openai/gpt-4o"),
        embedder=LiteLLMEmbedder(model="openai/text-embedding-3-small"),
    ) as rag:
        await rag.ingest("report.pdf")                              # PDF
        await rag.ingest("source_id", text="Alice works at Acme.")  # Raw text
        await rag.finalize()                                         # Dedup + index

        # Retrieve context only
        context = await rag.retrieve("Where does Alice work?")

        # Full RAG: retrieve + generate answer
        result = await rag.completion("Where does Alice work?")
        print(result.answer)

asyncio.run(main())

Multi-Turn Conversations

completion() supports multi-turn conversations. With the built-in providers (LiteLLM, OpenRouterLLM), messages are passed natively to the LLM's chat API. Custom providers that only implement invoke() get automatic fallback via message concatenation.

from graphrag_sdk import ChatMessage

answer = await rag.completion(
    "What happened next?",
    history=[
        ChatMessage(role="user", content="Who is Alice?"),
        ChatMessage(role="assistant", content="Alice is an engineer at Acme Corp."),
    ],
)

Supported roles: "system", "user", "assistant". Invalid roles raise ValueError.
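To make the two behaviors above concrete, here is a small self-contained sketch of role validation and of a concatenation-style fallback. It does not use the SDK: Message is a hypothetical stand-in for ChatMessage, and the "role: content" prompt layout is an assumption, since the exact format the SDK uses when concatenating is not documented here.

```python
from dataclasses import dataclass

ALLOWED_ROLES = {"system", "user", "assistant"}


@dataclass
class Message:
    """Hypothetical stand-in for the SDK's ChatMessage."""

    role: str
    content: str

    def __post_init__(self):
        # Reject anything outside the three supported roles.
        if self.role not in ALLOWED_ROLES:
            raise ValueError(f"invalid role: {self.role!r}")


def flatten_history(history, question):
    """Concatenate prior turns and the new question into one prompt string."""
    lines = [f"{m.role}: {m.content}" for m in history]
    lines.append(f"user: {question}")
    return "\n".join(lines)


history = [
    Message("user", "Who is Alice?"),
    Message("assistant", "Alice is an engineer at Acme Corp."),
]
prompt = flatten_history(history, "What happened next?")
print(prompt)
```

A provider that only accepts a single string could be handed a prompt like this one, while chat-native providers would receive the message list unchanged.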

Schema Definition

from graphrag_sdk import GraphSchema, EntityType, RelationType

schema = GraphSchema(
    entities=[
        EntityType(label="Person", description="A human being"),
        EntityType(label="Organization", description="A company or institution"),
    ],
    relations=[
        RelationType(
            label="WORKS_AT",
            description="Is employed by",
            patterns=[("Person", "Organization")],
        ),
    ],
)

rag = GraphRAG(connection=conn, llm=llm, embedder=embedder, schema=schema)  # conn, llm, embedder from above
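Each pattern in a relation names a (source, target) pair of entity labels, so a typo in a label can go unnoticed. The sketch below shows one way to check a schema for labels that no entity declares. It uses plain dataclasses (EntitySketch, RelationSketch) as hypothetical stand-ins rather than the SDK classes, and whether the SDK performs a similar check itself is not stated here.

```python
from dataclasses import dataclass, field


@dataclass
class EntitySketch:
    """Hypothetical stand-in for EntityType."""

    label: str
    description: str = ""


@dataclass
class RelationSketch:
    """Hypothetical stand-in for RelationType."""

    label: str
    description: str = ""
    patterns: list = field(default_factory=list)  # (source_label, target_label) pairs


def undeclared_labels(entities, relations):
    """Return labels used in relation patterns that no entity declares."""
    declared = {e.label for e in entities}
    missing = set()
    for rel in relations:
        for source, target in rel.patterns:
            missing.update({source, target} - declared)
    return sorted(missing)


entities = [EntitySketch("Person"), EntitySketch("Organization")]
good = [RelationSketch("WORKS_AT", patterns=[("Person", "Organization")])]
bad = [RelationSketch("LIVES_IN", patterns=[("Person", "City")])]

print(undeclared_labels(entities, good))
print(undeclared_labels(entities, bad))
```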

Strategy Customization

Override any pipeline step by passing a strategy:

from graphrag_sdk.ingestion.chunking_strategies.fixed_size import FixedSizeChunking
from graphrag_sdk import GraphExtraction, LLMExtractor
from graphrag_sdk.ingestion.resolution_strategies import SemanticResolution

# Custom chunking
await rag.ingest("doc.txt", chunker=FixedSizeChunking(chunk_size=1500, chunk_overlap=200))

# LLM-based entity extraction instead of GLiNER
await rag.ingest("doc.txt", extractor=GraphExtraction(llm=llm, entity_extractor=LLMExtractor(llm)))
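For intuition about what chunk_size and chunk_overlap control, here is a minimal sketch of fixed-size chunking with overlap. It is not the SDK's FixedSizeChunking implementation: the function name is hypothetical and sizes are counted in characters, which is an assumption (the SDK may count tokens or handle boundaries differently).

```python
def fixed_size_chunks(text, chunk_size=1500, chunk_overlap=200):
    """Split text into windows of chunk_size that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


print(fixed_size_chunks("abcdefghij", chunk_size=4, chunk_overlap=1))
```

A larger overlap repeats more text between neighboring chunks, which can help keep sentences that straddle a boundary retrievable at the cost of more chunks to embed.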

Strategy Reference

Every algorithmic concern is a swappable strategy behind an abstract base class:

| Concern | ABC | Built-in Options | Default |
|---|---|---|---|
| Loading | LoaderStrategy | TextLoader, PdfLoader | Auto-detect by extension |
| Chunking | ChunkingStrategy | FixedSizeChunking, SentenceTokenCapChunking, ContextualChunking, CallableChunking | FixedSizeChunking |
| Extraction | ExtractionStrategy | GraphExtraction (GLiNER2 + LLM) | GraphExtraction |
| Resolution | ResolutionStrategy | ExactMatchResolution, DescriptionMergeResolution, SemanticResolution, LLMVerifiedResolution | ExactMatch |
| Retrieval | RetrievalStrategy | LocalRetrieval, MultiPathRetrieval | MultiPath (5-path) |
| Reranking | RerankingStrategy | CosineReranker | Cosine |
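The "strategy behind an abstract base class" pattern can be illustrated with the reranking concern. The sketch below is self-contained and does not import the SDK: RerankerSketch and CosineRerankerSketch are hypothetical names, the rerank signature is an assumption, and the real RerankingStrategy and CosineReranker may be asynchronous or take different arguments.

```python
import math
from abc import ABC, abstractmethod


class RerankerSketch(ABC):
    """Hypothetical abstract interface; the SDK's RerankingStrategy may differ."""

    @abstractmethod
    def rerank(self, query_vec, candidates):
        """Return candidates ordered from most to least relevant."""


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class CosineRerankerSketch(RerankerSketch):
    """Order (text, vector) candidates by cosine similarity to the query vector."""

    def rerank(self, query_vec, candidates):
        return sorted(candidates, key=lambda c: cosine(query_vec, c[1]), reverse=True)


candidates = [("a", [0.0, 1.0]), ("b", [1.0, 0.0]), ("c", [1.0, 1.0])]
ranked = CosineRerankerSketch().rerank([1.0, 0.0], candidates)
print([text for text, _ in ranked])
```

Because callers depend only on the abstract class, a different scoring rule can be swapped in by writing another subclass with the same method.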

LLM & Embedding Providers

| Provider | LLM Class | Embedder Class | Models |
|---|---|---|---|
| LiteLLM | LiteLLM | LiteLLMEmbedder | OpenAI, Azure, Anthropic, Cohere, 100+ |
| OpenRouter | OpenRouterLLM | OpenRouterEmbedder | All OpenRouter models |
| Custom | Subclass LLMInterface | Subclass Embedder | Anything |
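As a rough illustration of the "Custom" row, the sketch below shows the shape of a custom embedder: a subclass that turns text into a fixed-length vector. It is written against a hypothetical stand-in base class (EmbedderSketch) with an assumed embed method, not the SDK's Embedder, whose actual method names and sync/async behavior should be taken from 04_custom_provider.py. The hash-based vectors are deterministic but carry no semantic meaning, so this is only suitable for offline tests.

```python
import hashlib
from abc import ABC, abstractmethod


class EmbedderSketch(ABC):
    """Hypothetical stand-in; the SDK's Embedder base class may define other methods."""

    @abstractmethod
    def embed(self, text):
        """Return a list of floats for the given text."""


class HashEmbedder(EmbedderSketch):
    """Deterministic toy embedder for offline tests (no semantic meaning)."""

    def __init__(self, dim=8):
        # A SHA-256 digest has 32 bytes, so that is the largest supported size.
        if not 1 <= dim <= 32:
            raise ValueError("dim must be between 1 and 32")
        self.dim = dim

    def embed(self, text):
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        # Map the first `dim` bytes to floats in [0, 1].
        return [b / 255 for b in digest[: self.dim]]


embedder = HashEmbedder(dim=8)
vector = embedder.embed("Alice works at Acme.")
print(len(vector))
```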

Benchmark

#1 on GraphRAG-Bench Novel — 63.73 ACC, ahead of MS-GraphRAG (50.93) and LightRAG (45.09).

| Metric | Value |
|---|---|
| Novel ACC | 63.73 (#1) |
| Fact retrieval | 65.22 |
| Complex reasoning | 58.63 |
| Contextual summarization | 69.54 |
| Creative generation | 57.08 |
| Questions | 2,010 across 20 novels |

See docs/benchmark.md for methodology and reproduction.

Examples

| # | Example | Description |
|---|---|---|
| 1 | 01_quickstart.py | Minimal ingest & query |
| 2 | 02_pdf_with_schema.py | PDF with custom schema |
| 3 | 03_custom_strategies.py | Benchmark-winning pipeline |
| 4 | 04_custom_provider.py | Custom LLM/Embedder |
| 5 | 05_notebook_demo.ipynb | Interactive notebook walkthrough |

License

Apache License 2.0
