Helix: Temporal GraphRAG combining LightRAG and Graphiti for time-aware knowledge graphs

🧬 Helix: Temporal GraphRAG

LightRAG + Graphiti = Temporal Knowledge Graphs for RAG


🎯 What is Helix?

Helix fuses LightRAG's proven dual-level retrieval with Graphiti's bi-temporal Knowledge Graph to create a next-generation RAG system with:

| Feature | Capability |
| --- | --- |
| Temporal Awareness | Point-in-time queries, automatic edge invalidation |
| Multi-Hop Reasoning | BFS-based path exploration with scoring |
| Hallucination Detection | Composite Fidelity Index (CFI) verification |
| Incremental Updates | No full graph rebuild required |
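
To make "point-in-time queries" and "automatic edge invalidation" concrete, here is a minimal sketch in plain Python. It is not Helix's or Graphiti's implementation: `TemporalEdge`, `invalidate_conflicts`, and `facts_at` are hypothetical names, the data is a toy example, and only the valid-time axis is modelled (a bi-temporal graph also tracks when each fact was recorded).

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class TemporalEdge:
    """A fact with a validity interval (hypothetical structure, not Helix's schema)."""
    subject: str
    relation: str
    obj: str
    valid_at: datetime                      # when the fact became true
    invalid_at: Optional[datetime] = None   # when it stopped being true (None = still true)


def invalidate_conflicts(edges: list[TemporalEdge], new_edge: TemporalEdge) -> None:
    """Close any open edge with the same subject and relation, then append the new edge."""
    for edge in edges:
        if (edge.subject == new_edge.subject
                and edge.relation == new_edge.relation
                and edge.invalid_at is None):
            edge.invalid_at = new_edge.valid_at
    edges.append(new_edge)


def facts_at(edges: list[TemporalEdge], when: datetime) -> list[TemporalEdge]:
    """Return the edges whose validity interval contains `when`."""
    return [
        e for e in edges
        if e.valid_at <= when and (e.invalid_at is None or when < e.invalid_at)
    ]


# Toy data: the second insert supersedes the first without rebuilding anything.
edges: list[TemporalEdge] = []
invalidate_conflicts(edges, TemporalEdge("Acme", "has_ceo", "Alice", datetime(2010, 1, 1)))
invalidate_conflicts(edges, TemporalEdge("Acme", "has_ceo", "Bob", datetime(2018, 6, 1)))

print([e.obj for e in facts_at(edges, datetime(2015, 1, 1))])  # ['Alice']
print([e.obj for e in facts_at(edges, datetime(2020, 1, 1))])  # ['Bob']
```

The older edge is closed rather than deleted, which is what lets a query about 2015 still return the fact that held at the time.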

📊 Benchmark Targets

| Category | Datasets | Metrics | Target | Baseline |
| --- | --- | --- | --- | --- |
| Temporal | TSQA, Time-LongQA, ECT-QA, MultiTQ | Hit@1, Hit@5, Acc | 70-75% | 45-55% |
| Hallucination | Legal QA, Medical QA, FEVER | AUC, CFI | >0.95 | 0.84-0.94 |
| Multi-Hop | MuSiQue, 2WikiMHQA, HotpotQA | F1, EM | 70-75 | 54-59 |
| Scalability | UltraDomain (all) | Tokens, Latency | <600K | 14M |
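
For readers unfamiliar with the metrics in the table, the sketch below shows common definitions of Hit@k, exact match (EM), and token-level F1. These are simplified illustrations with hypothetical function names, not the scorers shipped with Helix or with the benchmarks; official scorers usually apply extra normalization (punctuation, articles) that is omitted here.

```python
from collections import Counter


def hit_at_k(ranked_answers: list[str], gold: str, k: int) -> bool:
    """True if the gold answer appears among the top-k ranked answers."""
    return gold in ranked_answers[:k]


def exact_match(prediction: str, gold: str) -> bool:
    """Case-insensitive string equality after trimming whitespace."""
    return prediction.strip().lower() == gold.strip().lower()


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall (whitespace tokenization)."""
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


ranked = ["1911", "1912", "1913"]
print(hit_at_k(ranked, "1912", k=1))   # False
print(hit_at_k(ranked, "1912", k=5))   # True
print(exact_match(" June 23, 1912 ", "june 23, 1912"))  # True
print(round(token_f1("born in 1912", "1912"), 2))
```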

📦 Installation

From PyPI

pip install helix-rag

From Source (Development)

git clone https://github.com/YashNuhash/Helix.git
cd Helix

# Install with Helix dependencies
pip install -e ".[helix]"

Dependencies

Helix requires:

  • Neo4j (for Graphiti Knowledge Graph)
  • Supabase (optional, for vector storage)
  • LLM API (any provider - configured via environment)

โš™๏ธ Configuration

Copy .env.example to .env and configure:

cp .env.example .env

Required Environment Variables

# Neo4j Configuration (for Graphiti)
NEO4J_URI=bolt://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=your_password

# LLM Configuration (model-agnostic)
LLM_MODEL_NAME=your_model_name
LLM_API_KEY=your_api_key

# Supabase (optional)
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_KEY=your_key
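
A missing variable is easier to diagnose before Helix starts than after a connection fails. The sketch below is one way to check the variables listed above with the standard library only. `load_settings` is a hypothetical helper, not part of Helix, and it reads the process environment rather than parsing the `.env` file itself.

```python
import os

# Variable names taken from the listing above; the grouping mirrors its comments.
REQUIRED_VARS = ("NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD",
                 "LLM_MODEL_NAME", "LLM_API_KEY")
OPTIONAL_VARS = ("SUPABASE_URL", "SUPABASE_KEY")


def load_settings(env=None) -> dict:
    """Collect settings from a mapping (default: os.environ); fail fast if one is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED_VARS}
    # Optional values are included only when they are actually set.
    settings.update({name: env[name] for name in OPTIONAL_VARS if env.get(name)})
    return settings


# Example with an explicit mapping so the sketch runs without touching the real environment.
example_env = {
    "NEO4J_URI": "bolt://localhost:7687",
    "NEO4J_USERNAME": "neo4j",
    "NEO4J_PASSWORD": "your_password",
    "LLM_MODEL_NAME": "your_model_name",
    "LLM_API_KEY": "your_api_key",
}
settings = load_settings(example_env)
print(sorted(settings))
```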

Supabase Setup (Optional)

Run scripts/supabase_schema.sql in your Supabase SQL Editor to create the vector storage table.


🚀 Quick Start

Basic Usage

import asyncio
from helix import Helix

async def main():
    # Initialize Helix
    async with Helix() as helix:
        # Insert document with temporal tracking
        result = await helix.insert(
            "Alan Turing was born on June 23, 1912. "
            "He is considered the father of computer science.",
            source_description="Wikipedia"
        )
        print(f"Extracted {result['entities_extracted']} entities")
        
        # Query with temporal awareness
        answer = await helix.query(
            "When was Alan Turing born?",
            mode="hybrid"
        )
        print(answer["answer"])

asyncio.run(main())

Temporal Queries

import asyncio
from datetime import datetime
from helix import Helix
from helix.utils import is_temporal_query, extract_temporal_params

async def temporal_example():
    async with Helix() as helix:
        # Detect temporal intent
        query = "Who was the CEO of Apple in 2015?"
        
        if is_temporal_query(query):
            params = extract_temporal_params(query)
            print(f"Temporal query detected: {params.temporal_keywords}")
        
        # Query with point-in-time context
        result = await helix.query(
            query,
            valid_at=datetime(2015, 1, 1),
            include_temporal_context=True
        )
        print(result)

asyncio.run(temporal_example())
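
The example above relies on `is_temporal_query` and `extract_temporal_params`, whose internals are not described here. As a rough intuition for what temporal-intent detection can look like, the following is a minimal keyword-and-year heuristic. All names (`TEMPORAL_KEYWORDS`, `find_temporal_keywords`, `extract_year_as_datetime`, `looks_temporal`) are hypothetical and are not Helix's actual logic.

```python
import re
from datetime import datetime
from typing import Optional

# Assumed keyword list for illustration only.
TEMPORAL_KEYWORDS = ("when", "before", "after", "during", "until", "since")
# Four-digit years from 1000 to 2099.
YEAR_PATTERN = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")


def find_temporal_keywords(query: str) -> list[str]:
    """Return the temporal keywords that occur as whole words in the query."""
    words = re.findall(r"[a-z]+", query.lower())
    return [kw for kw in TEMPORAL_KEYWORDS if kw in words]


def extract_year_as_datetime(query: str) -> Optional[datetime]:
    """Map the first year mentioned to January 1 of that year, or None if no year is found."""
    match = YEAR_PATTERN.search(query)
    return datetime(int(match.group(1)), 1, 1) if match else None


def looks_temporal(query: str) -> bool:
    """A query counts as temporal if it has a temporal keyword or an explicit year."""
    return bool(find_temporal_keywords(query)) or extract_year_as_datetime(query) is not None


print(looks_temporal("Who was the CEO of Apple in 2015?"))   # True
print(looks_temporal("Tell me about Alan Turing"))           # False
print(find_temporal_keywords("When was Alan Turing born?"))  # ['when']
```

A value produced by `extract_year_as_datetime` could then be passed as the `valid_at` argument shown in the example above.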

Hallucination Detection

import asyncio

from helix import Helix
from helix.hallucination import HallucinationDetector

async def verify_response():
    async with Helix() as helix:
        detector = HallucinationDetector(graphiti=helix.graphiti)
        
        # Get response
        result = await helix.query("Tell me about Alan Turing")
        
        # Verify against knowledge graph
        verification = await detector.verify_response(
            response=result["answer"],
            query="Tell me about Alan Turing",
            context=result.get("temporal_context")
        )
        
        print(f"Grounded: {verification.is_grounded}")
        print(f"CFI Score: {verification.confidence_score:.2f}")
        print(f"Entity Coverage: {verification.entity_coverage:.2%}")

asyncio.run(verify_response())
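
The CFI formula itself is not given here, so the sketch below illustrates only one plausible ingredient, the entity coverage value printed above: the share of entities named in a response that also exist in the knowledge graph. The function and its inputs are assumptions for illustration, not Helix's scoring code, and entity extraction is taken as already done.

```python
def entity_coverage(response_entities: set[str], graph_entities: set[str]) -> float:
    """Share of response entities that also appear in the graph (case-insensitive).

    Returns 1.0 when the response names no entities, since nothing is unsupported.
    """
    if not response_entities:
        return 1.0
    normalized_graph = {name.casefold() for name in graph_entities}
    supported = [name for name in response_entities if name.casefold() in normalized_graph]
    return len(supported) / len(response_entities)


# Toy data: "Princeton" is mentioned in the response but absent from the graph.
coverage = entity_coverage(
    response_entities={"Alan Turing", "Enigma", "Princeton"},
    graph_entities={"alan turing", "Enigma"},
)
print(f"Entity Coverage: {coverage:.2%}")  # Entity Coverage: 66.67%
```

Treating an entity-free response as fully covered is a design choice; a stricter checker might return 0.0 or flag the response for review instead.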

Multi-Hop Reasoning

import asyncio

from helix import Helix
from helix.multihop import MultiHopRetriever

async def multihop_example():
    async with Helix() as helix:
        retriever = MultiHopRetriever(graphiti=helix.graphiti)
        
        # Find reasoning paths
        paths = await retriever.find_paths(
            query="How is Alan Turing connected to modern AI?",
            max_hops=3
        )
        
        # Format as context
        context = retriever.format_paths_as_context(paths)
        print(context)

asyncio.run(multihop_example())
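
To show what BFS-based path exploration with scoring can mean in practice, here is a self-contained sketch over an in-memory adjacency list. It is not the `MultiHopRetriever` implementation: the graph layout, the edge weights, and the choice to score a path by the product of its edge weights are all assumptions made for illustration.

```python
from collections import deque

# node -> list of (relation, neighbor, edge_weight); toy data, weights are made up.
graph = {
    "Alan Turing": [("proposed", "Turing Test", 0.9), ("founded", "Computer Science", 0.8)],
    "Turing Test": [("evaluates", "Modern AI", 0.7)],
    "Computer Science": [("underpins", "Machine Learning", 0.9)],
    "Machine Learning": [("drives", "Modern AI", 0.9)],
}


def find_paths(graph, start, goal, max_hops=3):
    """Breadth-first search for cycle-free paths from start to goal within max_hops.

    Returns (path, score) pairs sorted by score, highest first.
    """
    results = []
    queue = deque([([start], 1.0)])
    while queue:
        path, score = queue.popleft()
        node = path[-1]
        if node == goal:
            results.append((path, score))
            continue
        if len(path) - 1 >= max_hops:  # hop budget used up
            continue
        for _relation, neighbor, weight in graph.get(node, []):
            if neighbor not in path:   # skip nodes already on this path
                queue.append((path + [neighbor], score * weight))
    return sorted(results, key=lambda item: item[1], reverse=True)


for path, score in find_paths(graph, "Alan Turing", "Modern AI", max_hops=3):
    print(" -> ".join(path), round(score, 3))
```

With a product score, a longer path made of strong edges can outrank a shorter path with one weak edge, as happens in this toy graph. Lowering `max_hops` to 2 drops the three-hop path entirely.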

📈 Evaluation

Running Benchmarks

Helix includes evaluation scripts for academic benchmarks. Use these in Google Colab or Kaggle:

# Install Helix
!pip install helix-rag

# Run temporal benchmark
from helix.eval import TemporalBenchmark

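benchmark = TemporalBenchmark(dataset="time-longqa")
# Top-level await works in notebooks; in a plain script, wrap this in an async function and call asyncio.run()
results = await benchmark.run()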
print(f"Hit@1: {results['hit_at_1']:.2%}")

Supported Benchmarks

| Benchmark | Dataset | Command |
| --- | --- | --- |
| Temporal | TSQA | helix eval --dataset tsqa |
| Temporal | Time-LongQA | helix eval --dataset time-longqa |
| Temporal | ECT-QA | helix eval --dataset ect-qa |
| Multi-Hop | MuSiQue | helix eval --dataset musique |
| Multi-Hop | HotpotQA | helix eval --dataset hotpotqa |
| Hallucination | FEVER | helix eval --dataset fever |
| Scalability | UltraDomain | helix eval --dataset ultradomain |

Colab/Kaggle Notebook

# Quick evaluation notebook
import os
os.environ["LLM_API_KEY"] = "your_key"
os.environ["LLM_MODEL_NAME"] = "your_model"
os.environ["NEO4J_URI"] = "bolt://localhost:7687"
os.environ["NEO4J_USERNAME"] = "neo4j"
os.environ["NEO4J_PASSWORD"] = "password"

from helix import Helix
from helix.eval import run_all_benchmarks

# Run all benchmarks
results = await run_all_benchmarks()
print(results.to_dataframe())

๐Ÿ—๏ธ Architecture

┌──────────────────────────────────────────────────────────────┐
│                         Helix                                │
├──────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌──────────────┐  ┌───────────────────┐    │
│  │   LightRAG  │  │   Graphiti   │  │  Helix Modules    │    │
│  │  (Retrieval)│  │ (Temporal KG)│  │                   │    │
│  ├─────────────┤  ├──────────────┤  ├───────────────────┤    │
│  │ - Chunking  │  │ - Episodes   │  │ - TemporalHandler │    │
│  │ - Embedding │  │ - Bi-temporal│  │ - Hallucination   │    │
│  │ - Vector DB │  │ - Resolution │  │ - MultiHop        │    │
│  │ - Dual-level│  │ - Invalidate │  │ - CFI Scoring     │    │
│  └──────┬──────┘  └──────┬───────┘  └─────────┬─────────┘    │
│         │                │                    │              │
│         └────────────────┼────────────────────┘              │
│                          ▼                                   │
│  ┌──────────────────────────────────────────────────────┐    │
│  │                    Storage Layer                     │    │
│  ├──────────────────────────────────────────────────────┤    │
│  │  Neo4j (Graph)  │  Supabase (Vector)  │  Local KV    │    │
│  └──────────────────────────────────────────────────────┘    │
└──────────────────────────────────────────────────────────────┘

๐Ÿ“ Project Structure

helix/
├── __init__.py           # Package entry (v0.1.1)
├── core/
│   └── helix.py          # Main Helix class
├── storage/
│   ├── graphiti_impl.py  # GraphitiGraphStorage
│   └── supabase_impl.py  # SupabaseVectorStorage
├── temporal/
│   └── query_handler.py  # TemporalQueryHandler
├── hallucination/
│   └── detector.py       # HallucinationDetector (CFI)
├── multihop/
│   └── retriever.py      # MultiHopRetriever (BFS)
└── utils/
    └── temporal_utils.py # Temporal parsing

🔬 Research Goals

Helix is designed to reach state-of-the-art performance in four areas, with these targets:

  1. Temporal GraphRAG: 70-75% accuracy on temporal QA benchmarks
  2. Hallucination Detection: AUC >0.95 using graph-aligned verification
  3. Multi-Hop Reasoning: F1 70-75 on complex reasoning benchmarks
  4. Scalability: <600K tokens for indexing (vs 14M baseline)

See PLAN.md for detailed research methodology.


📚 Citation

If you use Helix in your research, please cite:

@software{helix2024,
  title = {Helix: Temporal GraphRAG with LightRAG and Graphiti},
  author = {Your Name},
  year = {2024},
  url = {https://github.com/YashNuhash/Helix}
}

๐Ÿค Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.


📄 License

MIT License - see LICENSE for details.


Built with 🧬 Helix

LightRAG + Graphiti = Temporal GraphRAG
