Helix: Temporal GraphRAG combining LightRAG and Graphiti for time-aware knowledge graphs
What is Helix?
Helix combines LightRAG's dual-level retrieval with Graphiti's bi-temporal knowledge graph to build a RAG system with the following capabilities:
| Feature | Capability |
|---|---|
| Temporal Awareness | Point-in-time queries, automatic edge invalidation |
| Multi-Hop Reasoning | BFS-based path exploration with scoring |
| Hallucination Detection | Composite Fidelity Index (CFI) verification |
| Incremental Updates | No full graph rebuild required |
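To make "point-in-time queries" and "automatic edge invalidation" concrete, here is a minimal, self-contained sketch of the idea. It is not Helix's or Graphiti's implementation: the `Edge` class, `add_fact`, and `facts_at` are hypothetical names, the sketch models only the valid-time axis of a bi-temporal graph, and it assumes one value per (subject, relation) pair.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Edge:
    """A fact with a validity interval (all names here are illustrative)."""
    subject: str
    relation: str
    obj: str
    valid_at: datetime
    invalid_at: Optional[datetime] = None  # None means "still valid"


def add_fact(edges: list[Edge], new: Edge) -> None:
    """Append a fact, closing any open edge it supersedes."""
    for edge in edges:
        same_slot = (edge.subject, edge.relation) == (new.subject, new.relation)
        if same_slot and edge.invalid_at is None and edge.valid_at <= new.valid_at:
            edge.invalid_at = new.valid_at  # invalidate instead of deleting
    edges.append(new)


def facts_at(edges: list[Edge], when: datetime) -> list[Edge]:
    """Return the edges whose validity interval contains `when`."""
    return [
        e for e in edges
        if e.valid_at <= when and (e.invalid_at is None or when < e.invalid_at)
    ]


edges: list[Edge] = []
add_fact(edges, Edge("Acme", "has_ceo", "Alice", datetime(2010, 1, 1)))
add_fact(edges, Edge("Acme", "has_ceo", "Bob", datetime(2018, 6, 1)))

print([e.obj for e in facts_at(edges, datetime(2015, 1, 1))])  # ['Alice']
print([e.obj for e in facts_at(edges, datetime(2020, 1, 1))])  # ['Bob']
```

Because the superseded edge is closed rather than deleted, older questions can still be answered, and adding a fact touches only the edges it supersedes rather than rebuilding the graph.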
Benchmark Targets
| Category | Datasets | Metrics | Target | Baseline |
|---|---|---|---|---|
| Temporal | TSQA, Time-LongQA, ECT-QA, MultiTQ | Hit@1, Hit@5, Acc | 70-75% | 45-55% |
| Hallucination | Legal QA, Medical QA, FEVER | AUC, CFI | >0.95 | 0.84-0.94 |
| Multi-Hop | MuSiQue, 2WikiMHQA, HotpotQA | F1, EM | 70-75 | 54-59 |
| Scalability | UltraDomain (all) | Tokens, Latency | <600K | 14M |
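For readers unfamiliar with the metrics in the table, the sketch below shows the usual definitions of Hit@k, exact match (EM), and token-level F1. It is a simplified illustration, not the Helix evaluation code: the function names are hypothetical, and official benchmark scripts typically apply heavier normalization (punctuation and article stripping) than the lowercasing used here.

```python
from collections import Counter


def hit_at_k(ranked: list[str], gold: set[str], k: int) -> float:
    """1.0 if any of the top-k ranked answers is a gold answer, else 0.0."""
    return float(any(answer in gold for answer in ranked[:k]))


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the strings are identical after trimming and lowercasing."""
    return float(prediction.strip().lower() == gold.strip().lower())


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


ranked = ["Steve Jobs", "Tim Cook", "John Sculley"]
print(hit_at_k(ranked, {"Tim Cook"}, k=1))   # 0.0
print(hit_at_k(ranked, {"Tim Cook"}, k=5))   # 1.0
print(exact_match("Tim Cook ", "tim cook"))  # 1.0
print(token_f1("Timothy Cook", "Tim Cook"))  # 0.5
```

Dataset-level scores are the mean of these per-question values.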
Installation
From PyPI
pip install helix-rag
From Source (Development)
git clone https://github.com/YashNuhash/Helix.git
cd Helix
# Install with Helix dependencies
pip install -e ".[helix]"
Dependencies
Helix requires:
- Neo4j (for Graphiti Knowledge Graph)
- Supabase (optional, for vector storage)
- LLM API (any provider - configured via environment)
Configuration
Copy .env.example to .env and configure:
cp .env.example .env
Required Environment Variables
# Neo4j Configuration (for Graphiti)
NEO4J_URI=bolt://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=your_password
# LLM Configuration (model-agnostic)
LLM_MODEL_NAME=your_model_name
LLM_API_KEY=your_api_key
# Supabase (optional)
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_KEY=your_key
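If you want to check the configuration before starting Helix, a small validation step can catch missing values early. The sketch below is an assumption about how you might do this in your own code, not part of the Helix package; `load_settings` is a hypothetical helper that only reads the variable names listed above.

```python
import os

REQUIRED_VARS = (
    "NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD",
    "LLM_MODEL_NAME", "LLM_API_KEY",
)
OPTIONAL_VARS = ("SUPABASE_URL", "SUPABASE_KEY")


def load_settings(env=None) -> dict:
    """Collect settings from a mapping, failing fast on missing required keys."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    settings = {name: env[name] for name in REQUIRED_VARS}
    # Optional values are included only when they are set.
    settings.update({name: env[name] for name in OPTIONAL_VARS if env.get(name)})
    return settings


example_env = {
    "NEO4J_URI": "bolt://localhost:7687",
    "NEO4J_USERNAME": "neo4j",
    "NEO4J_PASSWORD": "your_password",
    "LLM_MODEL_NAME": "your_model_name",
    "LLM_API_KEY": "your_api_key",
}
print(sorted(load_settings(example_env)))
```

Calling `load_settings()` with no argument reads from `os.environ`, so it also works after the `.env` file has been loaded by whatever mechanism you use.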
Supabase Setup (Optional)
Run scripts/supabase_schema.sql in your Supabase SQL Editor to create the vector storage table.
Quick Start
Basic Usage
import asyncio
from helix import Helix

async def main():
    # Initialize Helix
    async with Helix() as helix:
        # Insert document with temporal tracking
        result = await helix.insert(
            "Alan Turing was born on June 23, 1912. "
            "He is considered the father of computer science.",
            source_description="Wikipedia"
        )
        print(f"Extracted {result['entities_extracted']} entities")

        # Query with temporal awareness
        answer = await helix.query(
            "When was Alan Turing born?",
            mode="hybrid"
        )
        print(answer["answer"])

asyncio.run(main())
Temporal Queries
import asyncio
from datetime import datetime
from helix import Helix
from helix.utils import is_temporal_query, extract_temporal_params

async def temporal_example():
    async with Helix() as helix:
        # Detect temporal intent
        query = "Who was the CEO of Apple in 2015?"
        if is_temporal_query(query):
            params = extract_temporal_params(query)
            print(f"Temporal query detected: {params.temporal_keywords}")

        # Query with point-in-time context
        result = await helix.query(
            query,
            valid_at=datetime(2015, 1, 1),
            include_temporal_context=True
        )
        print(result)

asyncio.run(temporal_example())
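The example above relies on `is_temporal_query` and `extract_temporal_params` from `helix.utils`, whose internals are not shown in this README. As a rough illustration of what temporal-intent detection can involve, the sketch below uses a keyword list and a year regex. Every name in it (`looks_temporal`, `find_temporal_keywords`, `extract_year_anchor`) is hypothetical, and the real helpers may work quite differently.

```python
import re
from datetime import datetime
from typing import Optional

# Four-digit years from 1000 to 2099; an assumption made for this sketch.
YEAR_PATTERN = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")
TEMPORAL_KEYWORDS = ("when", "before", "after", "during", "since", "until")


def find_temporal_keywords(query: str) -> list[str]:
    """Return the temporal keywords that occur as whole words in the query."""
    words = re.findall(r"[a-z]+", query.lower())
    return [word for word in words if word in TEMPORAL_KEYWORDS]


def extract_year_anchor(query: str) -> Optional[datetime]:
    """Map the first year mentioned to January 1 of that year, if any."""
    match = YEAR_PATTERN.search(query)
    return datetime(int(match.group(1)), 1, 1) if match else None


def looks_temporal(query: str) -> bool:
    """A query is treated as temporal if it has a keyword or a year."""
    return bool(find_temporal_keywords(query)) or extract_year_anchor(query) is not None


query = "Who was the CEO of Apple in 2015?"
print(looks_temporal(query))       # True
print(extract_year_anchor(query))  # 2015-01-01 00:00:00
print(looks_temporal("Tell me about Alan Turing"))  # False
```

A date derived this way could be passed as `valid_at` in the query call shown above; anchoring to January 1 is a simplification that ignores changes later in the year.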
Hallucination Detection
import asyncio
from helix import Helix
from helix.hallucination import HallucinationDetector

async def verify_response():
    async with Helix() as helix:
        detector = HallucinationDetector(graphiti=helix.graphiti)
        query = "Tell me about Alan Turing"

        # Get response
        result = await helix.query(query)

        # Verify against knowledge graph
        verification = await detector.verify_response(
            response=result["answer"],
            query=query,
            context=result.get("temporal_context")
        )
        print(f"Grounded: {verification.is_grounded}")
        print(f"CFI Score: {verification.confidence_score:.2f}")
        print(f"Entity Coverage: {verification.entity_coverage:.2%}")

asyncio.run(verify_response())
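This README does not define how the CFI or `entity_coverage` is computed. To give an intuition for the coverage idea only, the sketch below measures what share of the entities mentioned in a response also exist in a set of known graph entities. It is an assumption-laden toy: `extract_entities` is a naive capitalized-phrase matcher standing in for real entity extraction, and none of these names come from Helix.

```python
import re


def extract_entities(text: str) -> set[str]:
    """Naive stand-in for entity extraction: runs of capitalized words."""
    return set(re.findall(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*", text))


def entity_coverage(response: str, graph_entities: set[str]) -> float:
    """Share of entities in the response that also exist in the graph."""
    mentioned = extract_entities(response)
    if not mentioned:
        return 1.0  # nothing to verify, so nothing is unsupported
    return len(mentioned & graph_entities) / len(mentioned)


graph_entities = {"Alan Turing", "Bletchley Park"}
response = "Alan Turing worked at Bletchley Park and later at Stanford University."
print(f"{entity_coverage(response, graph_entities):.2%}")  # 66.67%
```

Here "Stanford University" is not in the graph, so one of three mentioned entities is unsupported. The regex also treats any capitalized sentence-initial word as an entity, which a real extractor would avoid.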
Multi-Hop Reasoning
import asyncio
from helix import Helix
from helix.multihop import MultiHopRetriever

async def multihop_example():
    async with Helix() as helix:
        retriever = MultiHopRetriever(graphiti=helix.graphiti)

        # Find reasoning paths
        paths = await retriever.find_paths(
            query="How is Alan Turing connected to modern AI?",
            max_hops=3
        )

        # Format as context
        context = retriever.format_paths_as_context(paths)
        print(context)

asyncio.run(multihop_example())
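The feature table describes multi-hop reasoning as BFS-based path exploration with scoring. The sketch below shows that general technique on an in-memory toy graph. It is not the `MultiHopRetriever` implementation: the graph format, the standalone `find_paths` function, and the inverse-length `score` are all assumptions chosen for illustration.

```python
from collections import deque

# node -> list of (relation, neighbor); a toy stand-in for a knowledge graph
Graph = dict[str, list[tuple[str, str]]]


def find_paths(graph: Graph, start: str, goal: str, max_hops: int = 3) -> list[list[str]]:
    """Breadth-first search for cycle-free paths from start to goal."""
    paths = []
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        hops = len(path) // 2  # path alternates node, relation, node, ...
        if node == goal and hops > 0:
            paths.append(path)
            continue
        if hops == max_hops:
            continue  # hop budget exhausted on this branch
        for relation, neighbor in graph.get(node, []):
            if neighbor not in path[::2]:  # skip nodes already on the path
                queue.append(path + [relation, neighbor])
    return paths


def score(path: list[str]) -> float:
    """Shorter paths score higher: 1 / number of hops."""
    return 1.0 / (len(path) // 2)


graph: Graph = {
    "Alan Turing": [("proposed", "Turing Test"), ("founded", "Computer Science")],
    "Turing Test": [("influenced", "Modern AI")],
    "Computer Science": [("includes", "Machine Learning")],
    "Machine Learning": [("underpins", "Modern AI")],
}

for path in find_paths(graph, "Alan Turing", "Modern AI", max_hops=3):
    print(f"{score(path):.2f}", " -> ".join(path))
```

Because BFS expands paths level by level, shorter paths are found first, so the result list is already ordered by hop count. A real scorer would likely also weigh relation relevance to the query.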
Evaluation
Running Benchmarks
Helix includes evaluation scripts for academic benchmarks. Use these in Google Colab or Kaggle:
# Install Helix
!pip install helix-rag
# Run temporal benchmark
from helix.eval import TemporalBenchmark
benchmark = TemporalBenchmark(dataset="time-longqa")
results = await benchmark.run()
print(f"Hit@1: {results['hit_at_1']:.2%}")
Supported Benchmarks
| Benchmark | Dataset | Command |
|---|---|---|
| Temporal | TSQA | helix eval --dataset tsqa |
| Temporal | Time-LongQA | helix eval --dataset time-longqa |
| Temporal | ECT-QA | helix eval --dataset ect-qa |
| Multi-Hop | MuSiQue | helix eval --dataset musique |
| Multi-Hop | HotpotQA | helix eval --dataset hotpotqa |
| Hallucination | FEVER | helix eval --dataset fever |
| Scalability | UltraDomain | helix eval --dataset ultradomain |
Colab/Kaggle Notebook
# Quick evaluation notebook
import os
os.environ["LLM_API_KEY"] = "your_key"
os.environ["LLM_MODEL_NAME"] = "your_model"
os.environ["NEO4J_URI"] = "bolt://localhost:7687"
os.environ["NEO4J_PASSWORD"] = "password"
from helix import Helix
from helix.eval import run_all_benchmarks
# Run all benchmarks
results = await run_all_benchmarks()
print(results.to_dataframe())
Architecture
┌─────────────────────────────────────────────────────────────┐
│                            Helix                            │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌──────────────┐  ┌───────────────────┐   │
│  │  LightRAG   │  │   Graphiti   │  │   Helix Modules   │   │
│  │  (Retrieval)│  │ (Temporal KG)│  │                   │   │
│  ├─────────────┤  ├──────────────┤  ├───────────────────┤   │
│  │ - Chunking  │  │ - Episodes   │  │ - TemporalHandler │   │
│  │ - Embedding │  │ - Bi-temporal│  │ - Hallucination   │   │
│  │ - Vector DB │  │ - Resolution │  │ - MultiHop        │   │
│  │ - Dual-level│  │ - Invalidate │  │ - CFI Scoring     │   │
│  └──────┬──────┘  └──────┬───────┘  └─────────┬─────────┘   │
│         │                │                    │             │
│         └────────────────┼────────────────────┘             │
│                          ▼                                  │
│  ┌────────────────────────────────────────────────────────┐   │
│  │                     Storage Layer                      │   │
│  ├────────────────────────────────────────────────────────┤   │
│  │      Neo4j (Graph) │ Supabase (Vector) │ Local KV      │   │
│  └────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────┘
Project Structure
helix/
├── __init__.py              # Package entry (v0.1.1)
├── core/
│   └── helix.py             # Main Helix class
├── storage/
│   ├── graphiti_impl.py     # GraphitiStorage
│   └── supabase_impl.py     # SupabaseVectorStorage
├── temporal/
│   └── query_handler.py     # TemporalQueryHandler
├── hallucination/
│   └── detector.py          # HallucinationDetector (CFI)
├── multihop/
│   └── retriever.py         # MultiHopRetriever (BFS)
└── utils/
    └── temporal_utils.py    # Temporal parsing
Research Goals
Helix is designed to reach the following targets:
- Temporal GraphRAG: 70-75% accuracy on temporal QA benchmarks
- Hallucination Detection: AUC >0.95 using graph-aligned verification
- Multi-Hop Reasoning: F1 70-75 on complex reasoning benchmarks
- Scalability: <600K tokens for indexing (vs 14M baseline)
See PLAN.md for detailed research methodology.
Citation
If you use Helix in your research, please cite:
@software{helix2024,
title = {Helix: Temporal GraphRAG with LightRAG and Graphiti},
author = {Your Name},
year = {2024},
url = {https://github.com/YashNuhash/Helix}
}
Contributing
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
License
MIT License - see LICENSE for details.
Built with 🧬 Helix
LightRAG + Graphiti = Temporal GraphRAG