Helix: Temporal GraphRAG combining LightRAG and Graphiti for time-aware knowledge graphs

🧬 Helix: Temporal GraphRAG

LightRAG + Graphiti = Temporal Knowledge Graphs for RAG


🎯 What is Helix?

Helix combines LightRAG's dual-level retrieval with Graphiti's bi-temporal knowledge graph to provide a RAG system with:

| Feature | Capability |
|---|---|
| Temporal Awareness | Point-in-time queries, automatic edge invalidation |
| Multi-Hop Reasoning | BFS-based path exploration with scoring |
| Hallucination Detection | Composite Fidelity Index (CFI) verification |
| Incremental Updates | No full graph rebuild required |
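
To make "point-in-time queries" and "automatic edge invalidation" concrete, here is a minimal sketch of a fact edge with a validity interval. It is an illustration only: the `Edge` class, `add_edge`, and `facts_at` are hypothetical names, not Graphiti's or Helix's schema, and the sketch models only the valid-time axis of a bi-temporal graph (it omits the time at which a fact was recorded).

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Edge:
    """Hypothetical fact edge with a validity interval (not Graphiti's schema)."""
    subject: str
    relation: str
    obj: str
    valid_at: datetime                     # when the fact became true
    invalid_at: Optional[datetime] = None  # when it stopped being true, if known


def add_edge(edges: List[Edge], new: Edge) -> None:
    """Append `new` and close any still-open edge it supersedes.

    Assumes edges arrive in chronological order and that a
    (subject, relation) pair holds a single value at a time.
    """
    for edge in edges:
        same_slot = (edge.subject, edge.relation) == (new.subject, new.relation)
        if same_slot and edge.invalid_at is None and edge.obj != new.obj:
            edge.invalid_at = new.valid_at  # invalidate instead of deleting
    edges.append(new)


def facts_at(edges: List[Edge], when: datetime) -> List[Edge]:
    """Return the edges whose validity interval contains `when`."""
    return [
        e for e in edges
        if e.valid_at <= when and (e.invalid_at is None or when < e.invalid_at)
    ]


edges: List[Edge] = []
add_edge(edges, Edge("Apple", "ceo", "Steve Jobs", datetime(1997, 9, 16)))
add_edge(edges, Edge("Apple", "ceo", "Tim Cook", datetime(2011, 8, 24)))

print([e.obj for e in facts_at(edges, datetime(2005, 1, 1))])  # ['Steve Jobs']
print([e.obj for e in facts_at(edges, datetime(2015, 1, 1))])  # ['Tim Cook']
```

Because superseded edges are closed rather than removed, adding a new fact touches only the edges in the same slot, which is one way incremental updates can avoid a full graph rebuild.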

📊 Benchmark Targets

| Category | Datasets | Metrics | Target | Baseline |
|---|---|---|---|---|
| Temporal | TSQA, Time-LongQA, ECT-QA, MultiTQ | Hit@1, Hit@5, Acc | 70-75% | 45-55% |
| Hallucination | Legal QA, Medical QA, FEVER | AUC, CFI | >0.95 | 0.84-0.94 |
| Multi-Hop | MuSiQue, 2WikiMHQA, HotpotQA | F1, EM | 70-75 | 54-59 |
| Scalability | UltraDomain (all) | Tokens, Latency | <600K tokens | 14M tokens |

📦 Installation

From PyPI

pip install helix-rag

From Source (Development)

git clone https://github.com/YashNuhash/Helix.git
cd Helix

# Install with Helix dependencies
pip install -e ".[helix]"

Dependencies

Helix relies on the following services:

  • Neo4j (for Graphiti Knowledge Graph)
  • Supabase (optional, for vector storage)
  • LLM API (any provider - configured via environment)

⚙️ Configuration

Copy .env.example to .env and configure:

cp .env.example .env

Required Environment Variables

# Neo4j Configuration (for Graphiti)
NEO4J_URI=bolt://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=your_password

# LLM Configuration (model-agnostic)
LLM_MODEL_NAME=your_model_name
LLM_API_KEY=your_api_key

# Supabase (optional)
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_KEY=your_key
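
As an illustration of how these variables might be validated before starting a session, the sketch below reads them from a mapping and fails early when a required one is missing. The `Settings` class and `load_settings` function are hypothetical helpers written for this example; they are not part of the Helix API, which reads its own configuration from the environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Variables the section above lists without an "optional" marker.
REQUIRED = (
    "NEO4J_URI",
    "NEO4J_USERNAME",
    "NEO4J_PASSWORD",
    "LLM_MODEL_NAME",
    "LLM_API_KEY",
)


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the values in .env (not a Helix class)."""
    neo4j_uri: str
    neo4j_username: str
    neo4j_password: str
    llm_model_name: str
    llm_api_key: str
    supabase_url: Optional[str] = None
    supabase_key: Optional[str] = None


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from `env`, raising if a required variable is unset."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        neo4j_uri=env["NEO4J_URI"],
        neo4j_username=env["NEO4J_USERNAME"],
        neo4j_password=env["NEO4J_PASSWORD"],
        llm_model_name=env["LLM_MODEL_NAME"],
        llm_api_key=env["LLM_API_KEY"],
        supabase_url=env.get("SUPABASE_URL") or None,  # optional
        supabase_key=env.get("SUPABASE_KEY") or None,  # optional
    )


# Example with an explicit mapping; pass nothing to read os.environ instead.
example_env = {
    "NEO4J_URI": "bolt://localhost:7687",
    "NEO4J_USERNAME": "neo4j",
    "NEO4J_PASSWORD": "your_password",
    "LLM_MODEL_NAME": "your_model_name",
    "LLM_API_KEY": "your_api_key",
}
settings = load_settings(example_env)
print(settings.neo4j_uri)     # bolt://localhost:7687
print(settings.supabase_url)  # None
```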

Supabase Setup (Optional)

Run scripts/supabase_schema.sql in your Supabase SQL Editor to create the vector storage table.


🚀 Quick Start

Basic Usage

import asyncio
from helix import Helix

async def main():
    # Initialize Helix
    async with Helix() as helix:
        # Insert document with temporal tracking
        result = await helix.insert(
            "Alan Turing was born on June 23, 1912. "
            "He is considered the father of computer science.",
            source_description="Wikipedia"
        )
        print(f"Extracted {result['entities_extracted']} entities")
        
        # Query with temporal awareness
        answer = await helix.query(
            "When was Alan Turing born?",
            mode="hybrid"
        )
        print(answer["answer"])

asyncio.run(main())

Temporal Queries

import asyncio
from datetime import datetime
from helix import Helix
from helix.utils import is_temporal_query, extract_temporal_params

async def temporal_example():
    async with Helix() as helix:
        # Detect temporal intent
        query = "Who was the CEO of Apple in 2015?"
        
        if is_temporal_query(query):
            params = extract_temporal_params(query)
            print(f"Temporal query detected: {params.temporal_keywords}")
        
        # Query with point-in-time context
        result = await helix.query(
            query,
            valid_at=datetime(2015, 1, 1),
            include_temporal_context=True
        )
        print(result)

asyncio.run(temporal_example())
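
For intuition about what temporal intent detection involves, here is a small rule-based sketch that looks for a four-digit year and a handful of time-related keywords. It is an assumption-laden stand-in, not the implementation behind `is_temporal_query` or `extract_temporal_params`; the names `TemporalHints`, `detect_temporal_hints`, and `looks_temporal` and the keyword list are invented for this example.

```python
import re
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional

# Four-digit years from 1000 to 2099.
YEAR_RE = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")
# A deliberately small, hypothetical keyword list.
KEYWORD_RE = re.compile(
    r"\b(when|before|after|during|since|until|currently|previously|formerly)\b",
    re.IGNORECASE,
)


@dataclass
class TemporalHints:
    keywords: List[str] = field(default_factory=list)
    valid_at: Optional[datetime] = None  # start of the first year mentioned


def detect_temporal_hints(query: str) -> TemporalHints:
    """Collect time-related keywords and the first explicit year in `query`."""
    keywords = [m.group(0).lower() for m in KEYWORD_RE.finditer(query)]
    year = YEAR_RE.search(query)
    valid_at = datetime(int(year.group(0)), 1, 1) if year else None
    return TemporalHints(keywords=keywords, valid_at=valid_at)


def looks_temporal(query: str) -> bool:
    """Treat a query as temporal if it has a keyword or an explicit year."""
    hints = detect_temporal_hints(query)
    return bool(hints.keywords) or hints.valid_at is not None


hints = detect_temporal_hints("Who was the CEO of Apple in 2015?")
print(hints.valid_at)                               # 2015-01-01 00:00:00
print(looks_temporal("Tell me about Alan Turing"))  # False
```

A detected year could then be passed as the `valid_at` argument shown in the example above. Rules like these miss relative expressions such as "last year", which is one reason a real system may prefer an LLM or a date-parsing library.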

Hallucination Detection

import asyncio
from helix import Helix
from helix.hallucination import HallucinationDetector

async def verify_response():
    async with Helix() as helix:
        detector = HallucinationDetector(graphiti=helix.graphiti)
        
        # Get response
        result = await helix.query("Tell me about Alan Turing")
        
        # Verify against knowledge graph
        verification = await detector.verify_response(
            response=result["answer"],
            query="Tell me about Alan Turing",
            context=result.get("temporal_context")
        )
        
        print(f"Grounded: {verification.is_grounded}")
        print(f"CFI Score: {verification.confidence_score:.2f}")
        print(f"Entity Coverage: {verification.entity_coverage:.2%}")

asyncio.run(verify_response())
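
This README does not define how the Composite Fidelity Index is computed. Purely as an illustration of what a composite, graph-aligned score can look like, the sketch below combines two signals: the share of entities in a response that exist in the graph, and the share of claimed triples the graph supports. The weighting, the threshold, and every name here (`Verification`, `score_response`, `triple_support`) are assumptions for this example, not Helix's formula.

```python
from dataclasses import dataclass
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


@dataclass
class Verification:
    entity_coverage: float
    triple_support: float
    confidence_score: float
    is_grounded: bool


def _fraction(found: int, total: int) -> float:
    # A response that makes no claims has nothing unsupported in it.
    return 1.0 if total == 0 else found / total


def score_response(
    response_entities: Set[str],
    response_triples: Set[Triple],
    graph_entities: Set[str],
    graph_triples: Set[Triple],
    entity_weight: float = 0.4,  # assumed weight, not taken from Helix
    threshold: float = 0.7,      # assumed cut-off, not taken from Helix
) -> Verification:
    """Weighted mix of entity coverage and triple support."""
    entity_coverage = _fraction(
        len(response_entities & graph_entities), len(response_entities)
    )
    triple_support = _fraction(
        len(response_triples & graph_triples), len(response_triples)
    )
    score = entity_weight * entity_coverage + (1.0 - entity_weight) * triple_support
    return Verification(entity_coverage, triple_support, score, score >= threshold)


graph_entities = {"Alan Turing", "Bletchley Park"}
graph_triples = {
    ("Alan Turing", "born_in", "1912"),
    ("Alan Turing", "worked_at", "Bletchley Park"),
}
# The response adds an unsupported claim about MIT.
result = score_response(
    response_entities={"Alan Turing", "Bletchley Park", "MIT"},
    response_triples={
        ("Alan Turing", "worked_at", "Bletchley Park"),
        ("Alan Turing", "studied_at", "MIT"),
    },
    graph_entities=graph_entities,
    graph_triples=graph_triples,
)
print(f"{result.confidence_score:.2f}", result.is_grounded)  # 0.57 False
```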

Multi-Hop Reasoning

import asyncio
from helix import Helix
from helix.multihop import MultiHopRetriever

async def multihop_example():
    async with Helix() as helix:
        retriever = MultiHopRetriever(graphiti=helix.graphiti)
        
        # Find reasoning paths
        paths = await retriever.find_paths(
            query="How is Alan Turing connected to modern AI?",
            max_hops=3
        )
        
        # Format as context
        context = retriever.format_paths_as_context(paths)
        print(context)

asyncio.run(multihop_example())
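
The idea of BFS-based path exploration with scoring can be sketched on a small in-memory graph. The version below is an assumption about the general technique, not the code in `MultiHopRetriever`: it expands paths breadth-first up to `max_hops`, multiplies edge weights along each path, and applies a per-hop decay so that shorter paths rank higher. The graph contents, weights, and the `hop_decay` parameter are made up for the example.

```python
from collections import deque
from typing import Dict, List, Tuple

# node -> list of (relation, neighbor, edge weight in (0, 1])
Graph = Dict[str, List[Tuple[str, str, float]]]
Path = List[Tuple[str, str, str]]  # (source, relation, target) per hop


def find_paths(
    graph: Graph,
    start: str,
    goal: str,
    max_hops: int = 3,
    hop_decay: float = 0.8,
) -> List[Tuple[float, Path]]:
    """Breadth-first search for paths from start to goal, best score first."""
    results: List[Tuple[float, Path]] = []
    queue = deque([(start, [], 1.0, {start})])
    while queue:
        node, path, score, seen = queue.popleft()
        if node == goal and path:
            results.append((score, path))
            continue
        if len(path) == max_hops:
            continue  # hop budget exhausted
        for relation, neighbor, weight in graph.get(node, []):
            if neighbor in seen:
                continue  # avoid cycles within a single path
            queue.append((
                neighbor,
                path + [(node, relation, neighbor)],
                score * weight * hop_decay,
                seen | {neighbor},
            ))
    return sorted(results, key=lambda item: item[0], reverse=True)


graph: Graph = {
    "Alan Turing": [
        ("proposed", "Turing Test", 0.9),
        ("formalized", "Turing machine", 0.9),
    ],
    "Turing Test": [("benchmark_for", "Artificial intelligence", 0.8)],
    "Turing machine": [("foundation_of", "Computer science", 0.9)],
    "Computer science": [("parent_field_of", "Artificial intelligence", 0.7)],
}

paths = find_paths(graph, "Alan Turing", "Artificial intelligence", max_hops=3)
for score, path in paths:
    hops = " -> ".join([path[0][0]] + [target for _, _, target in path])
    print(f"{score:.3f}  {hops}")
```

Here the two-hop path through "Turing Test" outranks the three-hop path through "Computer science" because every extra hop multiplies the score by another weight and decay factor.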

📈 Evaluation

Running Benchmarks

Helix includes evaluation scripts for academic benchmarks. The snippet below is written for a notebook environment such as Google Colab or Kaggle, where !pip and top-level await are available:

# Install Helix
!pip install helix-rag

# Run temporal benchmark
from helix.eval import TemporalBenchmark

benchmark = TemporalBenchmark(dataset="time-longqa")
results = await benchmark.run()
print(f"Hit@1: {results['hit_at_1']:.2%}")
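
For reference, the metrics named in the benchmark targets (Hit@k, EM, F1) are commonly computed along the following lines. This is a generic sketch with simple whitespace tokenization, not the scoring code in `helix.eval`; the function names are chosen for this example, and official benchmark scripts usually apply extra normalization such as stripping punctuation and articles.

```python
from collections import Counter
from typing import List, Sequence


def normalize(text: str) -> List[str]:
    """Lowercase and split on whitespace (a simplified normalization)."""
    return text.lower().split()


def hit_at_k(ranked: Sequence[str], gold: str, k: int) -> float:
    """1.0 if the gold answer appears among the top-k candidates, else 0.0."""
    gold_tokens = normalize(gold)
    return float(any(normalize(c) == gold_tokens for c in ranked[:k]))


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if prediction and gold are equal after normalization."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred, ref = normalize(prediction), normalize(gold)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


ranked = ["Steve Jobs", "Tim Cook", "John Sculley"]
print(hit_at_k(ranked, "Tim Cook", k=1))     # 0.0
print(hit_at_k(ranked, "Tim Cook", k=5))     # 1.0
print(exact_match("tim cook", "Tim Cook"))   # 1.0
print(token_f1("Timothy Cook", "Tim Cook"))  # 0.5
```

Averaging these per-question values over a dataset gives the aggregate figures reported for each benchmark.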

Supported Benchmarks

| Benchmark | Dataset | Command |
|---|---|---|
| Temporal | TSQA | helix eval --dataset tsqa |
| Temporal | Time-LongQA | helix eval --dataset time-longqa |
| Temporal | ECT-QA | helix eval --dataset ect-qa |
| Multi-Hop | MuSiQue | helix eval --dataset musique |
| Multi-Hop | HotpotQA | helix eval --dataset hotpotqa |
| Hallucination | FEVER | helix eval --dataset fever |
| Scalability | UltraDomain | helix eval --dataset ultradomain |

Colab/Kaggle Notebook

# Quick evaluation notebook
import os
os.environ["LLM_API_KEY"] = "your_key"
os.environ["LLM_MODEL_NAME"] = "your_model"
os.environ["NEO4J_URI"] = "bolt://localhost:7687"
os.environ["NEO4J_PASSWORD"] = "password"

from helix import Helix
from helix.eval import run_all_benchmarks

# Run all benchmarks
results = await run_all_benchmarks()
print(results.to_dataframe())

🏗️ Architecture

┌──────────────────────────────────────────────────────────────┐
│                         Helix                                │
├──────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌──────────────┐  ┌───────────────────┐    │
│  │   LightRAG  │  │   Graphiti   │  │  Helix Modules    │    │
│  │  (Retrieval)│  │ (Temporal KG)│  │                   │    │
│  ├─────────────┤  ├──────────────┤  ├───────────────────┤    │
│  │ - Chunking  │  │ - Episodes   │  │ - TemporalHandler │    │
│  │ - Embedding │  │ - Bi-temporal│  │ - Hallucination   │    │
│  │ - Vector DB │  │ - Resolution │  │ - MultiHop        │    │
│  │ - Dual-level│  │ - Invalidate │  │ - CFI Scoring     │    │
│  └──────┬──────┘  └──────┬───────┘  └─────────┬─────────┘    │
│         │                │                    │              │
│         └────────────────┼────────────────────┘              │
│                          ▼                                   │
│  ┌──────────────────────────────────────────────────────┐    │
│  │                    Storage Layer                     │    │
│  ├──────────────────────────────────────────────────────┤    │
│  │  Neo4j (Graph)  │  Supabase (Vector)  │  Local KV    │    │
│  └──────────────────────────────────────────────────────┘    │
└──────────────────────────────────────────────────────────────┘

📁 Project Structure

helix/
├── __init__.py           # Package entry (v0.1.1)
├── core/
│   └── helix.py          # Main Helix class
├── storage/
│   ├── graphiti_impl.py  # GraphitiStorage
│   └── supabase_impl.py  # SupabaseVectorStorage
├── temporal/
│   └── query_handler.py  # TemporalQueryHandler
├── hallucination/
│   └── detector.py       # HallucinationDetector (CFI)
├── multihop/
│   └── retriever.py      # MultiHopRetriever (BFS)
└── utils/
    └── temporal_utils.py # Temporal parsing

🔬 Research Goals

Helix aims for state-of-the-art performance in four areas, with the following targets:

  1. Temporal GraphRAG: 70-75% accuracy on temporal QA benchmarks
  2. Hallucination Detection: AUC >0.95 using graph-aligned verification
  3. Multi-Hop Reasoning: F1 70-75 on complex reasoning benchmarks
  4. Scalability: <600K tokens for indexing (vs 14M baseline)

See PLAN.md for detailed research methodology.


📚 Citation

If you use Helix in your research, please cite:

@software{helix2024,
  title = {Helix: Temporal GraphRAG with LightRAG and Graphiti},
  author = {Your Name},
  year = {2024},
  url = {https://github.com/YashNuhash/Helix}
}

🤝 Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.


📄 License

MIT License - see LICENSE for details.


Built with 🧬 Helix

LightRAG + Graphiti = Temporal GraphRAG
