Graph RAG with Advanced Integration and Learning — an open-source GraphRAG library with incremental updates, file-level provenance, and pluggable LLM/storage backends.
Project description
One graph engine, two write paths: a queryable knowledge base over your documents, and an agentic memory layer for Claude Code, Codex, and OpenCode.
Vectors: FAISS · LanceDB · ChromaDB | Storage: Local · S3 | LLM endpoints: OpenAI · Anthropic · DeepInfra · Together · Groq · OpenRouter · Ollama · vLLM · SGLang · LM Studio
Install · Quickstart · Modes · CLI · SDK · Backends · Benchmarks · Docs
🇬🇧 English · 🇪🇸 Español
Table of contents
- What is GRAIL
- Two modes, one engine
- Why GRAIL
- How it works
- Installation
- Quickstart
- Python SDK
- Agent skill
- CLI reference
- Search modes
- Storage & vector backends
- Benchmarks
- Documentation
- Acknowledgements
- Author
- License
What is GRAIL
GRAIL (Graph RAG with Advanced Integrations and Learning) is an open-source Python framework that turns text into a queryable knowledge graph — and serves it back through six retrieval modes, including an agentic search loop and a structural recall mode.
What makes GRAIL different from other GraphRAG systems is that the same engine powers two completely different use cases through the same artefacts (parquet + FAISS + NetworkX):
- Knowledge-base mode — point it at a folder of documents, run
grail index, query through any of six search modes. Like vanilla GraphRAG, but with cascade retrieval, agentic search, incremental updates, and honest cost tracking. - Agentic memory mode — embed GRAIL inside an agent (Claude Code, Codex, OpenCode, or any framework that supports the skill format) and let it write observations, entities, and relationships directly. The agent owns the write path; GRAIL keeps the graph coherent and searchable. A ready-to-install agent skill ships with the repo.
Both modes share one schema, one storage layer, and the same five search modes — plus a memory-only recall mode that runs without any LLM call. You can run both write paths against the same project simultaneously.
GRAIL ships with provider-agnostic endpoints: OpenAI, Anthropic, DeepInfra, Together, Groq, OpenRouter, Ollama, vLLM, SGLang, and LM Studio are first-class. Swap providers with a one-line config change.
Two modes, one engine
flowchart TB
subgraph WP["Write paths"]
direction TB
KB["**Knowledge-base mode**<br/>grail index<br/>LLM extracts entities + relationships<br/>from input/ documents"]
MM["**Agentic memory mode**<br/>MemoryProject SDK / skill tools<br/>Agent writes observations + entities<br/>under memories/"]
end
subgraph CORE["GRAIL Core (shared)"]
direction TB
ART["Parquet artefacts · FAISS index · NetworkX graph"]
COM["Communities · Reports · Embeddings · Provenance"]
end
subgraph SM["Search modes (work on both)"]
direction TB
L[local]
C[cascade]
G[global]
D[document]
A[agent]
R["recall<br/><i>memory only</i>"]
end
KB --> ART
MM --> ART
ART --> COM
COM --> L & C & G & D & A & R
style KB fill:#e6f3ff,stroke:#0066cc,color:#000
style MM fill:#fff4e6,stroke:#cc6600,color:#000
style CORE fill:#f5f5f5,stroke:#666,color:#000
style SM fill:#f0f9ff,stroke:#0099cc,color:#000
When to use which
| You want to… | Use mode | Entry point |
|---|---|---|
| Build a Q&A bot over a document corpus | knowledge_base |
grail init my-kb |
| Embed long-term memory in a coding agent (Claude Code, Codex, OpenCode) | memory |
grail init my-mem --memory |
| Mix both — agent writes memories and you index reference docs | both | grail init my-mixed --memory then drop docs into input/ and run grail index |
The contract is binding: everything that works in knowledge-base mode works in memory mode — same parquets, same search code. Only the write path differs.
Why GRAIL
| Capability | What it means | Where it lives |
|---|---|---|
| Two write paths, one schema | Knowledge-base mode and agentic memory mode produce the same parquet artefacts. Every search mode works on either. | grail/core.py, grail/memory/project.py |
| Agent-driven writes | MemoryProject.add_observation(...) lets agents persist memories with explicit entities and relationships. No LLM extraction step — the caller already knows what they meant. |
grail/memory/project.py:190 |
| Folders as communities | In memory mode, the folder path under memories/ (e.g. work/clients/acme/) declares the entity's community. Multi-membership is first-class. |
grail/memory/project.py |
consolidate() as a proposal generator |
Runs Leiden + edge-density + alias-detection + folder-split analyses and emits a structured proposal file. The agent reviews; nothing mutates without consent. | grail/memory/consolidate.py, grail/memory/analyses/ |
| Cascade search | Hybrid entity-gate + BM25/cosine text rescue. Recovers facts that live in chunks where no entity was extracted. | grail/query/cascade_search.py |
| Agent search | LLM-driven tool-calling loop that picks between local / cascade / global / document per question, with forced-synthesis fallback. | grail/query/agent.py |
| Recall mode | Pure pandas filter over observed_at / category / tag / entity / confidence — zero LLM, zero embedding. Built for memory replay. |
grail/query/recall_search.py, grail/query/recall_filter.py |
| Incremental updates | append / edit / delete re-extract only affected text units. A change-ratio scheduler (threshold 0.3) updates communities without a full rebuild. |
grail/indexing/incremental_community.py |
| Single-pass extraction (KB mode) | Entities + relationships + descriptions + 2–3 anticipated retrieval queries in one LLM call per chunk — significantly fewer round-trips than systems that run separate passes. | grail/indexing/entities_relationships.py |
| Retrieval queries on entities | Each entity stores 2–3 anticipated user questions in its embedding text. Improves cross-lingual and intent-based matching. | grail/indexing/entities_relationships.py |
| Typed relationships | Optional LLM classification of edges (REGULATES, FUNDS, …). Three modes: disabled, free, constrained vocabulary. |
grail/config.py:IndexingConfig |
| File-level provenance | Every text unit retains back-pointers to source files. Citations reference real documents. | grail/indexing/loader.py |
| Honest cost tracking | Distinguishes complete / partial / undefined pricing. Never reports a fake $0.00 when pricing is unknown. |
grail/llm/cost.py |
Unified Reply envelope |
Every MemoryProject method returns Reply(ok, data, warnings, next_steps, error) — same contract for the SDK and the agent-skill scripts. |
grail/memory/types.py |
| Multi-backend storage & vectors | Local or S3 storage; FAISS (default cosine), LanceDB, or ChromaDB vectors. | grail/storage/, grail/vectorstores/ |
| Two reference UIs | Textual terminal chat and FastAPI + React web chat — both embed GRAIL as a library. | grail/apps/cli_chat/, grail/apps/chat/ |
How it works
Knowledge-base indexing pipeline
flowchart LR
A["input/<br/>.txt .md .pdf .docx code"] --> B[preprocess<br/><i>PDF/DOCX → markdown</i>]
B --> C[FileLoader<br/><i>chunk + provenance</i>]
C --> D[EntityRelationshipExtractor<br/><i>single-pass LLM</i>]
D --> E[(entities.parquet<br/>relationships.parquet)]
E --> F[Leiden<br/><i>hierarchical + DBSCAN merge</i>]
F --> G[(communities.parquet)]
G --> H[CommunityReportGenerator<br/><i>LLM narratives + JSON correction</i>]
H --> I[(community_reports.parquet)]
E --> J[Embedder<br/><i>FAISS cosine</i>]
J --> K[(vector index)]
style A fill:#e6f3ff,color:#000
style E fill:#fffce6,color:#000
style G fill:#fffce6,color:#000
style I fill:#fffce6,color:#000
style K fill:#fffce6,color:#000
Memory-mode write path
flowchart LR
A["Agent<br/>(Claude Code, Codex, OpenCode)"] -->|tool call| B[MemoryProject<br/>.add_observation]
B --> C[Write markdown<br/>memories/<category>/slug.md]
B --> D[FileLoader chunks<br/>this file only]
B --> E[Merge agent-supplied<br/>entities + relationships]
D --> F[(text_units.parquet)]
E --> G[(entities.parquet<br/>relationships.parquet)]
F --> H[Vector index]
G --> H
I["Agent calls<br/>.consolidate()"] --> J[Proposal generator<br/><i>Leiden + alias + density + split</i>]
J --> K[(proposal_set.json)]
K --> L["Agent reviews<br/>accept / reject"]
L --> G
style A fill:#fff4e6,color:#000
style I fill:#fff4e6,color:#000
style L fill:#fff4e6,color:#000
style F fill:#fffce6,color:#000
style G fill:#fffce6,color:#000
style H fill:#fffce6,color:#000
style K fill:#fffce6,color:#000
Query flow (works on both modes)
flowchart TB
Q[Query] --> M{mode?}
M -->|local| L[Embed → top-k entities<br/>→ assemble context → LLM]
M -->|cascade| C[Entity gate<br/>+ BM25/cosine text rescue<br/>→ combined rank → LLM]
M -->|global| G[Map-reduce over<br/>community reports]
M -->|document| D[Scope to one file<br/>→ context → LLM]
M -->|agent| A[LLM tool loop:<br/>local / cascade / global / document<br/>up to N iterations]
M -->|recall| R[Pandas filter<br/><i>no LLM, no embedding</i>]
L & C & G & D & A & R --> O[SearchResult]
style Q fill:#f0f9ff,color:#000
style O fill:#e6ffe6,color:#000
Installation
git clone git@github.com:CAMARA-CHILENA-INTELIGENCIA-ARTIFICIAL/GRAIL.git
cd GRAIL
# Recommended: uv (https://github.com/astral-sh/uv)
uv venv --python 3.12
uv pip install -e ".[dev]" # add ",s3" for S3, ",ui" for the web chat
# Or plain pip
python3.12 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
Set the API keys for whichever endpoint(s) you use:
cp .env.example .env
# edit .env — OPENAI_API_KEY, DEEPINFRA_API_KEY, ANTHROPIC_API_KEY, …
Built-in endpoints cover openai, anthropic, deepinfra, together, groq, openrouter, ollama, vllm, sglang, lmstudio, local. Add your own:
# endpoints.yaml
endpoints:
my-vllm:
base_url: http://my-vllm.local:8000/v1
api_key_env: MY_VLLM_KEY
requires_key: false
Quickstart
Knowledge-base mode
# 1. Scaffold a project (low_cost_setup pre-tunes DeepInfra-friendly defaults)
uv run grail init ./my-kb --name my-kb --template low_cost_setup
# 2. Drop .txt / .md / .pdf / .docx / code files into ./my-kb/input/
# 3. Index
uv run grail index ./my-kb
📸 Place
assets/indexing.pnghere — terminal output from the four-step indexing run.
# 4. Query — six modes
uv run grail query ./my-kb "What are the main themes?" --mode global
uv run grail query ./my-kb "Who is Alice?" --mode local
uv run grail query ./my-kb "What dosage does protocol X cite?" --mode cascade
uv run grail query ./my-kb "Summarise law-21250.pdf" --mode document -d law-21250.pdf
uv run grail query ./my-kb "Compare early vs advanced treatment." --mode agent
# 5. Chat with it
uv run grail chat ./my-kb # Textual TUI
uv run grail ui ./my-kb # FastAPI + React, http://127.0.0.1:8765
📸 Place
assets/query.pnghere — formatted answer with cited context. 📸 Placeassets/chat_ui.pnghere — web chat with streaming responses.
Agentic memory mode
# 1. Scaffold a memory project
uv run grail init ./my-mem --memory
# 2. Drive it from your agent (Python SDK example below). The agent writes
# markdown observations under ./my-mem/memories/<category>/ and entities
# are merged into the parquet on every write.
# 3. Recall — pure pandas filter, no LLM
uv run grail query ./my-mem --mode recall --since 7d --tag "decision"
uv run grail query ./my-mem --mode recall --category "work/clients/acme/**"
# 4. Cascade with a memory filter — combine semantic search and structural scope
uv run grail query ./my-mem "What did we decide about the migration?" \
--mode cascade --since 30d --tag "decision"
# 5. Consolidate — generate proposals (merge aliases, discover communities, …)
uv run grail consolidate ./my-mem
uv run grail proposals list ./my-mem
uv run grail proposals apply ./my-mem --accept <proposal_id>
📸 Place
assets/memory_consolidate.pnghere — proposals listing output.
Python SDK
GRAIL is library-first. The CLI is just a wrapper around the same Python API.
Knowledge-base mode
import asyncio
from grail import GRAIL, load_config
async def main():
grail = GRAIL.from_config(load_config("./my-kb"))
# Full pipeline.
await grail.index()
# Six search modes return the same SearchResult shape.
answer = await grail.search("What are the main themes?", mode="global")
print(answer.response)
answer = await grail.agent_search("Compare early vs advanced treatment.")
print(answer.response)
# Incremental updates touch only affected text units.
await grail.append(["new_doc.pdf"])
await grail.edit({"existing.md": "/path/to/updated.md"})
await grail.delete(["obsolete.txt"])
# Honest cost ledger.
print(grail.cost_tracker.render_total_cost())
asyncio.run(main())
Agentic memory mode
import asyncio
from grail import MemoryProject
async def main():
mp = MemoryProject("./my-mem")
# The agent writes an observation with explicit entities + relationships.
# No LLM extraction — the caller already knows what they meant.
reply = await mp.add_observation(
title="Acme picked Postgres over DynamoDB",
content="In the architecture review on Tuesday, Acme committed to "
"Postgres for the order-history service because of "
"transactional needs across the inventory + payments tables.",
category="work/clients/acme",
tags=["decision", "architecture"],
entities=[
{"name": "Acme", "type": "ORGANIZATION", "description": "Client"},
{"name": "Postgres", "type": "TECHNOLOGY", "description": "Chosen DB"},
{"name": "DynamoDB", "type": "TECHNOLOGY", "description": "Rejected alternative"},
],
relationships=[
{"source": "Acme", "target": "Postgres", "relationship_type": "CHOSE",
"description": "for transactional order-history"},
{"source": "Acme", "target": "DynamoDB", "relationship_type": "REJECTED",
"description": "lacked cross-table transactions"},
],
confidence=0.95,
)
print(reply.ok, reply.data["observation_id"])
# Recall — no LLM, structural filter only.
recall = await mp.recall(
mode="recall",
category="work/clients/acme/**",
tags=["decision"],
since="30d",
)
for obs in recall.data["observations"]:
print(obs["observed_at"], obs["title"])
# Cascade with the same filter, so semantic search is scoped to recent
# Acme decisions.
answer = await mp.recall(
"Why did Acme rule out DynamoDB?",
mode="cascade",
category="work/clients/acme/**",
since="30d",
)
print(answer.data["response"])
# Consolidate — runs proposal analyses; the agent reviews each item.
proposals = mp.consolidate()
for p in proposals.data.get("proposals", []):
print(p["kind"], p["confidence"], p["rationale"])
asyncio.run(main())
Every MemoryProject method returns a Reply(ok, data, warnings, next_steps, error) envelope — the same shape the agent-skill scripts emit, so SDK callers and tool-calling agents read the same keys.
The memory SDK is designed to be wrapped by agent skills (SKILL.md folder format) that run inside Claude Code, OpenAI Codex, OpenCode, or any other framework that supports the standard skill convention. Discovery paths differ per framework; the skill body itself is portable.
See docs/python_api.md, the runnable examples/quickstart/quickstart.py, and the canonical embedding reference at grail/apps/chat/server.py.
Agent skill
GRAIL ships as a portable agent skill under skills/grail/ that lets a coding agent drive a knowledge base or maintain its own memory directly through tool calls — no Python harness on the agent side.
The skill follows the standard SKILL.md folder convention shared across Claude Code, OpenAI Codex, and other frameworks that adopt the same skill format, so the same folder works everywhere with framework-specific install paths.
What's inside
skills/grail/
├── SKILL.md ← skill description + agent-facing trigger language
├── INSTALL.md ← per-framework install paths
├── requirements.txt
├── scripts/ ← one CLI script per primitive operation
│ ├── setup.sh ← idempotent: installs grail on first call
│ ├── init_project.py
│ ├── list_grail_projects.py
│ ├── status.py
│ ├── index.py · append.py · edit.py · delete.py · explore.py · query.py · env_check.py
│ └── memory/ ← memory-mode tools
│ ├── add_observation.py
│ ├── add_entity.py · add_relationship.py · add_community.py
│ ├── find_similar_entity.py
│ ├── recall.py
│ ├── consolidate.py · list_proposals.py · apply_proposal.py
├── agents/openai.yaml ← Codex sidecar (additive metadata)
├── references/ ← long-form context the agent reads on demand
│ ├── kb_mode.md · memory_mode.md · memory_tools.md
│ ├── proposals.md · search_modes.md · query_optimization.md
│ └── config_reference.md · troubleshooting.md
└── assets/
Install
# Claude Code — user scope (symlink so `git pull` updates the skill)
mkdir -p ~/.claude/skills
ln -s "$(pwd)/skills/grail" ~/.claude/skills/grail
# OpenAI Codex — user scope
mkdir -p ~/.agents/skills
ln -s "$(pwd)/skills/grail" ~/.agents/skills/grail
# Project scope — bundle the skill inside a repo where you want it auto-discovered
mkdir -p .claude/skills
ln -s "$(realpath skills/grail)" .claude/skills/grail
| Framework | User-scope install | Project-scope install |
|---|---|---|
| Claude Code (CLI + claude.ai) | ~/.claude/skills/grail/ |
<repo>/.claude/skills/grail/ |
| OpenAI Codex | ~/.agents/skills/grail/ |
<repo>/.agents/skills/grail/ |
Other frameworks supporting SKILL.md |
framework-specific skills directory | same |
The skill auto-installs GRAIL via scripts/setup.sh on first call. It is idempotent — safe to invoke every session. See skills/grail/INSTALL.md for full details, including how to verify the install (scripts/env_check.py).
How an agent uses it
Every primitive is a single CLI script that returns a JSON Reply envelope ({ok, data, warnings, next_steps, error}) — the same shape the Python SDK uses, so the agent reads identical keys regardless of how it invokes GRAIL.
A typical memory-mode session inside the agent:
python scripts/list_grail_projects.py
# → which projects exist, their mode (knowledge_base or memory)
python scripts/status.py --project my-mem
# → mode, artefact counts, last_indexed_at; the agent routes on `mode`
python scripts/memory/add_observation.py \
--project my-mem \
--title "Acme picked Postgres over DynamoDB" \
--content "..." \
--category "work/clients/acme" \
--tag decision --tag architecture \
--entities '[{"name":"Acme","type":"ORGANIZATION"},{"name":"Postgres","type":"TECHNOLOGY"}]' \
--relationships '[{"source":"Acme","target":"Postgres","relationship_type":"CHOSE"}]'
python scripts/memory/recall.py --project my-mem --since 7d --tag decision
python scripts/memory/consolidate.py --project my-mem
python scripts/memory/list_proposals.py --project my-mem
python scripts/memory/apply_proposal.py --project my-mem --accept <proposal_id>
The skill body (the SKILL.md description, references, and scripts) is fully portable across frameworks; only the discovery path differs per framework. See skills/grail/SKILL.md for the agent-facing trigger language and routing logic.
CLI reference
All commands take a project directory as the first positional argument.
grail init — scaffold a project
grail init <project_dir> [--name NAME] [--memory]
[--template TEMPLATE] [--templates-dir DIR]
[--overwrite] [--git/--no-git] [--list-templates]
| Flag | Description |
|---|---|
--memory |
Scaffold a memory-mode project (memories/ folder, agent-driven write path). Default is knowledge_base mode (input/ folder, grail index workflow). |
--name |
Project name (defaults to directory name). |
--template, -t |
Template pack (e.g. low_cost_setup) or your own under --templates-dir. Mutually exclusive with --memory. |
--templates-dir |
Extra directory to search for templates. |
--overwrite |
Overwrite an existing scaffold. |
--git/--no-git |
Initialise a git repo. Default: on for memory mode, off for KB mode. |
--list-templates |
List available templates and exit. |
grail index — full pipeline (KB mode)
grail index <project_dir> [--discover-entities/--no-discover-entities]
[--vectorstore lancedb|faiss|chromadb]
grail query — answer a question
grail query <project_dir> "<question>" [--mode MODE] [--document NAME]
[--rerank/--no-rerank] [--trace DIR]
[--output text|json]
[--vectorstore lancedb|faiss|chromadb]
# memory recall filters:
[--since DELTA] [--before DELTA]
[--category GLOB] [--tag TAG ...]
[--entity-name NAME ...] [--type TYPE ...]
[--min-confidence FLOAT]
| Flag | Description |
|---|---|
--mode, -m |
local (default) · cascade · global · document · agent · recall. |
--document, -d |
Required for --mode document. |
--rerank/--no-rerank |
Override the reranker config for this query. |
--trace, -t |
Dump full prompts, responses, and context to JSON. |
--since / --before |
Restrict to observations newer/older than ISO-8601 timestamp or relative (1h, 7d). |
--category |
Folder-glob filter (e.g. work/clients/**). |
--tag |
Tag filter; repeat for any-match. |
--entity-name, --type |
Restrict candidate pool to specific entities or types. |
--min-confidence |
Drop entities / text units below this confidence. |
grail append / edit / delete — incremental KB updates
grail append <project_dir> file1 [file2 ...]
grail edit <project_dir> --name FILENAME --src /path/to/new
grail delete <project_dir> filename1 [filename2 ...]
grail consolidate — generate memory proposals
Runs Leiden + edge-density + alias-detection + folder-split analyses and writes a proposal set. Pure read pass — nothing mutates until the agent reviews.
grail consolidate <project_dir> [--output text|json]
grail proposals list / grail proposals apply
grail proposals list <project_dir> [--status pending|accepted|rejected]
grail proposals apply <project_dir> [--accept ID] [--reject ID]
grail create-entities — LLM entity-type discovery (KB mode)
grail create-entities <project_dir> [--write]
grail chat / grail ui — interactive interfaces
grail chat <project_dir> [--mode agent|local|cascade|global|document]
[--session ID] [--db PATH]
grail ui <project_dir> [--host HOST] [--port PORT] [--dev] [--debug]
grail viz / grail explore / grail export-neo4j
grail viz <project_dir> [--output FILE.html] [--open-browser]
[--layout-seed N] [--layout-iterations N]
grail explore <project_dir> [--output text|json]
grail export-neo4j <project_dir> [--uri URI] [--username USER] [--password PW]
[--database DB] [--clear] [--no-apoc]
[--batch-size N]
grail status / grail config show / grail prompt list / grail prompt show
Introspection commands for artefacts, resolved config, and the prompt registry.
Search modes
| Mode | Best for | How it works | LLM? |
|---|---|---|---|
local |
Named concepts, entities, "who/what is X" | Embed query → top-k similar entities → assemble linked text units, relationships, reports → answer. | Yes |
cascade |
Factual questions, specific details | Entity-gate + BM25/cosine text rescue → combined ranking → answer. Solves the classic GraphRAG fact-retrieval gap. | Yes |
global |
Broad / thematic questions | Map-reduce over community reports; hierarchical reduce above 100K tokens. | Yes |
document |
Single-file questions | Scope retrieval to one source document. | Yes |
agent |
Multi-step / comparative questions | LLM picks 1–3 of the above per question, with forced synthesis if iterations run out. | Yes |
recall |
Memory replay, "what did we say last week about X" | Pure pandas filter over observed_at / category / tag / entity / confidence. Zero LLM cost. Combine with any other mode as a filter modifier. |
No |
Query-shape tip for local / cascade: structure as [WHO does it] + [WHAT is the process] + [SPECIFIC TERMS from entity descriptions]. This matches entity embeddings ~3× better than keyword-only queries.
Full details: docs/search_modes.md.
Storage & vector backends
Both layers are pluggable. Pick the defaults to ship today and swap when the deployment changes.
Vector stores
| Backend | Default? | Distance | Best for | Config |
|---|---|---|---|---|
| FAISS | ✓ | cosine (IndexFlatIP on L2-normalised vectors) |
In-memory speed; ships in the wheel; no separate service; ideal for projects up to ~1M vectors | vectorstore.backend: faiss |
| LanceDB | cosine / L2 | On-disk columnar; lazy-loaded; good for >1M vectors and multi-process readers | vectorstore.backend: lancedb |
|
| ChromaDB | cosine / L2 | Long-running services; built-in metadata filtering; existing Chroma deployments | vectorstore.backend: chromadb |
Override per-run from the CLI without editing config: grail index --vectorstore faiss|lancedb|chromadb and the same flag on grail query. Custom backends plug in by subclassing BaseVectorStore in grail/vectorstores/base.py.
Details: docs/vectorstores.md.
Storage backends
| Backend | Best for | Config |
|---|---|---|
| Local filesystem | Default · single machine · tests · embedded CLI | storage.backend: local, storage.root: ... |
| S3 (and S3-compatible) | Production · multi-machine · MinIO · Cloudflare R2 · any S3 API | storage.backend: s3 + s3_bucket, s3_prefix, s3_region, s3_endpoint_url |
S3 reads/writes the same parquet + GraphML artefacts — the search code is backend-agnostic. Install with the s3 extra: uv pip install -e ".[s3]". Custom backends implement the seven required methods on StorageBackend in grail/storage/base.py.
Details: docs/storage.md.
LLM endpoints
Built-in: openai, anthropic, deepinfra, together, groq, openrouter, ollama, vllm, sglang, lmstudio, local. Endpoint (base URL + key env var) and model are separate fields, so changing providers is a one-line config edit. Add your own by appending to endpoints.yaml:
endpoints:
my-vllm:
base_url: http://my-vllm.local:8000/v1
api_key_env: MY_VLLM_KEY
requires_key: false
Details: docs/llm.md.
Benchmarks
Internal: Chilean Oncology Laws (benchmark_laws)
30 patient-language questions across 7 categories against three Chilean oncology laws (~58 pages, 35 chunks). Designed as a worst-case scenario for graph-enhanced retrieval — a corpus small enough that vanilla RAG has no retrieval failures.
| Metric | GRAIL Agent | RAG Agent |
|---|---|---|
| Average score (5-dim rubric) | 4.80 / 5.00 | 4.14 / 5.00 |
| Win – Loss – Tie | 27 – 0 – 3 | 0 – 27 – 3 |
| Avg response time | ~25 s | ~35 s |
| Avg LLM calls per question | 2.6 | 3.1 |
| Empty responses | 0 | 0 |
Where the advantage comes from:
| Category | Δ score | Why |
|---|---|---|
| Cross-Source (answer spans documents) | +0.80 | Graph links across documents. |
| Comparative | +0.80 | Agent runs two local_search calls in parallel. |
| Global Synthesis | +0.90 | Community reports give bird's-eye context. |
| Multi-Chunk | +0.72 | Entity graph connects chunks. |
| Procedural | +0.64 | Relationship chains preserve step order. |
| Single Fact | +0.52 | Entity descriptions answer directly. |
| Negation / Boundary | +0.08 | Near tie. |
Methodology: Qwen3.6-35B-A3B (DeepInfra) for inference, Qwen3-Embedding-8B for retrieval, FAISS cosine, 3-iteration agent budget, Claude Opus 4.6 judge with a 5-dimension rubric (Correctness 35%, Completeness 25%, Source Grounding 15%, Coherence 10%, No Hallucination 15%).
uv run python benchmarks/run_benchmark.py
📸 Place
assets/benchmark_chart.pnghere — bar chart of GRAIL vs RAG by category.
Full report and per-question detail: benchmarks/results/.
External (roadmap)
- GraphRAG-Bench — 4,072 questions, Medical + Novel domains, 4 difficulty levels.
- LongMemEval — 500 questions on chat-session memory; tracked under memory mode.
See docs/benchmarks.md.
Documentation
| Topic | File |
|---|---|
| Getting started | docs/getting-started.md |
| Python API | docs/python_api.md |
| Config glossary | docs/glossary.md |
| Search modes | docs/search_modes.md |
| Indexing pipeline | docs/indexing.md |
| Incremental updates | docs/incremental_pipeline.md |
| Query layer | docs/query.md |
| Prompt customisation | docs/prompt_customization.md |
| Prompt internals | docs/prompts.md |
| LLM & cost tracking | docs/llm.md |
| Reranker | docs/reranker.md |
| Vector stores | docs/vectorstores.md |
| Storage backends | docs/storage.md |
| Preprocessing (PDF/DOCX) | docs/preprocessing.md |
| Visualisation | docs/viz.md |
| Chat UIs | docs/cli_chat.md |
| Benchmarks | docs/benchmarks.md |
| Comparison vs alternatives | docs/comparison.md |
A Docusaurus site is in progress; this README and the files above are the current source of truth.
Acknowledgements
- The single-pass LLM extraction of entities and relationships from text chunks in GRAIL's knowledge-base mode draws inspiration from Microsoft GraphRAG.
- Nirvai, partner of CCHIA for the code base and testing results on agentic memory.
GRAIL is developed under the open-source commission of the Cámara Chilena de Inteligencia Artificial (Chilean Chamber of Artificial Intelligence).
Author
Benjamín González Guerrero — Founder at Nirvai (Nirvana). Contact: ben@nirvana-ai.com
GRAIL is shipped under the Nirvai (Nirvana) umbrella.
License
MIT © 2025 Cámara Chilena de Inteligencia Artificial.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file graphgrail-0.1.1.tar.gz.
File metadata
- Download URL: graphgrail-0.1.1.tar.gz
- Upload date:
- Size: 615.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.26 {"installer":{"name":"uv","version":"0.9.26","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
15659877e0cd1ad978bb0bbd0013c4d9e20758b80dfab7fa075f076a2f466b9a
|
|
| MD5 |
068f0c0c7bb31644e0378fa206c49b56
|
|
| BLAKE2b-256 |
d1842a4c9b0638c9815abf3a34acf11e5afc4293f4d7afd3d47369a5f7130048
|
File details
Details for the file graphgrail-0.1.1-py3-none-any.whl.
File metadata
- Download URL: graphgrail-0.1.1-py3-none-any.whl
- Upload date:
- Size: 539.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.26 {"installer":{"name":"uv","version":"0.9.26","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
ae0d0ea48b19b01cdcb1ca2c3b4e342d80ec6aa4bc7964e0ce2926237f7c7d13
|
|
| MD5 |
d8064334c3fab4ef04139304f059f066
|
|
| BLAKE2b-256 |
919a32221f77bb24f33b8a9e5074c2efbf6ff981b2e7e8b9a2faf849779ceb60
|