
Lightweight RAG evaluation framework with Korean language support, BGE-M3 embeddings, and HCX-005/Gemini LLMs


RAGTrace Lite

A lightweight RAG (Retrieval-Augmented Generation) evaluation framework with Korean language support

Korean version: README_KO.md


Overview

RAGTrace Lite is a lightweight framework for evaluating RAG system performance. It is built on the RAGAS framework and optimized for Korean-language environments.

Key Features:

  • Intelligent Metric Selection: Automatically selects 5 metrics when ground truth data is available and 4 metrics when it is not
  • Local BGE-M3 Embeddings: Offline embedding support for air-gapped environments
  • Multi-LLM Support: HCX-005 (Naver CLOVA Studio) and Gemini (Google)
  • Offline Deployment: Complete air-gapped deployment for closed networks
  • Korean Language Optimized: Native Korean language support

Quick Start

Installation from PyPI (Recommended)

# Basic installation
pip install ragtrace-lite

# Full installation (LLM + Embeddings + Enhanced features)
pip install "ragtrace-lite[all]"

# Optional installations
pip install "ragtrace-lite[llm]"        # LLM support only
pip install "ragtrace-lite[embeddings]" # Local embeddings only

Development Installation

# Clone repository and install in development mode
git clone https://github.com/ntts9990/ragtrace-lite.git
cd ragtrace-lite

# Using uv (recommended)
uv sync

# Or using pip
pip install -e ".[all]"

API Key Configuration

Create a .env file and add your API keys:

CLOVA_STUDIO_API_KEY=nv-your-hcx-api-key
GEMINI_API_KEY=your-gemini-api-key
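
Before running an evaluation, it can help to confirm that the key for the chosen LLM is actually set. The following is a minimal sketch and not part of RAGTrace Lite: the helper `missing_api_keys` and the `REQUIRED_KEYS` mapping are hypothetical, and only the two variable names come from this README. It inspects a mapping such as `os.environ`; loading the .env file itself is outside the sketch.

```python
import os

# Variable names are taken from this README; the mapping itself is illustrative.
REQUIRED_KEYS = {
    "hcx": "CLOVA_STUDIO_API_KEY",
    "gemini": "GEMINI_API_KEY",
}


def missing_api_keys(llms, env=None):
    """Return the variable names that are unset or blank for the given LLMs."""
    env = os.environ if env is None else env
    return [
        REQUIRED_KEYS[name]
        for name in llms
        if not env.get(REQUIRED_KEYS[name], "").strip()
    ]


if __name__ == "__main__":
    missing = missing_api_keys(["hcx", "gemini"])
    print("Missing keys:", ", ".join(missing) if missing else "none")
```

Passing a plain dict as `env` makes the check easy to try without touching the real environment.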

Run Sample Evaluation

# Run evaluation with BGE-M3 + HCX
ragtrace-lite evaluate data/sample_data.json --llm hcx

# Generate web dashboard
ragtrace-lite dashboard --open

Platform Support

  • Windows 10+ (PowerShell/CMD)
  • macOS 10.15+ (Intel/Apple Silicon)
  • Linux (Ubuntu 18.04+)
  • Python 3.9, 3.10, 3.11, 3.12

GPU Support: CUDA (Linux), MPS (Apple Silicon), CPU (All platforms)

Detailed Setup Guide: SETUP.md
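
To see which of the backends listed above is usable on a given machine, a check along the following lines may help. This is a sketch under assumptions: `pick_device` is a hypothetical helper, it relies on PyTorch being installed (it falls back to CPU if it is not), and it does not claim to mirror how RAGTrace Lite itself chooses a device.

```python
def pick_device():
    """Return "cuda", "mps", or "cpu" depending on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to probe; assume CPU.
        return "cpu"

    if torch.cuda.is_available():
        return "cuda"

    # torch.backends.mps is missing on older PyTorch builds, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    return "cpu"


if __name__ == "__main__":
    print("Selected device:", pick_device())
```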

Offline Deployment

RAGTrace Lite supports complete offline execution in air-gapped environments.

Quick Offline Deployment

# 1. Create deployment package (internet-connected environment)
python scripts/prepare_offline_deployment.py

# 2. Copy generated ZIP file to air-gapped PC
# dist/ragtrace-lite-offline-YYYYMMDD-HHMMSS.zip

# 3. Extract the ZIP and run the installer in the air-gapped environment
scripts\install.bat

# 4. Run evaluation
scripts\run_evaluation.bat

Offline Support Features

  • Python 3.11 Auto-Install: Windows installer included
  • BGE-M3 Local Model: 2.3GB embedding model pre-downloaded
  • All Dependencies Included: Complete offline installation with wheel files
  • Automated Install Scripts: One-click installation with Windows batch files
  • Complete Manual Guide: Manual installation support when scripts fail

Air-gapped Requirements

  • OS: Windows 10+ (64bit)
  • CPU: x86_64 architecture
  • Memory: Minimum 4GB RAM (for BGE-M3 loading)
  • Storage: Minimum 5GB (Python + model + dependencies)
  • LLM: HCX-005 API (internal network host)

Offline Deployment Guide: OFFLINE_DEPLOYMENT.md
Manual Installation Guide: MANUAL_INSTALLATION_GUIDE.md

Key Features

  • Fast Installation & Execution: Quick start with minimal dependencies
  • Multi-LLM Support: HCX-005 (Naver CLOVA Studio) & Gemini (Google)
  • Local Embeddings: Offline embedding support via BGE-M3
  • Intelligent Metric Selection: Automatically selects 5 metrics when ground truth data is available and 4 metrics when it is not
  • Complete Offline Support: Full air-gapped execution for closed networks
  • Data Storage: SQLite-based evaluation result storage and history management
  • Enhanced Reports: JSON, CSV, Markdown, Elasticsearch NDJSON format support
  • Security: Environment variable-based API key management
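
For readers unfamiliar with the Elasticsearch NDJSON format mentioned above, the sketch below shows the layout expected by the Elasticsearch bulk API: one action line followed by one document line, each terminated by a newline. It is an assumption that RAGTrace Lite's export follows this bulk layout; `to_bulk_ndjson`, the index name, and the record fields are all hypothetical.

```python
import json


def to_bulk_ndjson(records, index_name):
    """Serialize dicts as bulk-API NDJSON: an action line, then the document."""
    lines = []
    for record in records:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # ensure_ascii=False keeps Korean text readable in the output.
        lines.append(json.dumps(record, ensure_ascii=False))
    # Every line, including the last, must end with a newline.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    sample = [{"metric": "faithfulness", "score": 0.5}]  # made-up record
    print(to_bulk_ndjson(sample, "ragtrace-results"), end="")
```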

License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

Usage

CLI Commands

# Run evaluation
ragtrace-lite evaluate data.json --llm hcx

# List available datasets
ragtrace-lite list-datasets

# Generate web dashboard
ragtrace-lite dashboard --open

# Check version
ragtrace-lite version

Python API

from ragtrace_lite import RAGTraceLite
from ragtrace_lite.config_loader import load_config

# Load configuration
config = load_config()

# Initialize RAGTraceLite
rag_trace = RAGTraceLite(config)

# Run evaluation
results = rag_trace.evaluate("your_data.json")

Environment Configuration

Create a .env file and set your API keys:

# HCX-005 (Naver CLOVA Studio)
CLOVA_STUDIO_API_KEY=nv-your-hcx-api-key

# Gemini (Google)
GEMINI_API_KEY=your-gemini-api-key

Supported Metrics

With Ground Truth Data (5 Metrics)

  • Context Recall: Recall of retrieved contexts
  • Context Precision: Precision of retrieved contexts
  • Answer Correctness: Correctness of the answer
  • Answer Relevancy: Relevance of the answer to the question
  • Answer Similarity: Similarity between generated and ground truth answers

Without Ground Truth Data (4 Metrics)

  • Context Relevancy: Relevance of context to the question
  • Answer Relevancy: Relevance of answer to the question
  • Faithfulness: How faithful the answer is to the context
  • Coherence: Logical coherence of the answer
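
The two metric sets above can be pictured as a simple switch on whether ground truth is present. The following is a minimal sketch of that idea, not the library's implementation: the field name `ground_truth`, the snake_case metric identifiers, and the rule that every sample must carry ground truth are all assumptions made for illustration.

```python
WITH_GROUND_TRUTH = [
    "context_recall",
    "context_precision",
    "answer_correctness",
    "answer_relevancy",
    "answer_similarity",
]
WITHOUT_GROUND_TRUTH = [
    "context_relevancy",
    "answer_relevancy",
    "faithfulness",
    "coherence",
]


def select_metrics(samples, key="ground_truth"):
    """Pick the 5-metric set only if every sample has a non-empty ground truth."""
    has_ground_truth = bool(samples) and all(
        str(sample.get(key) or "").strip() for sample in samples
    )
    return WITH_GROUND_TRUTH if has_ground_truth else WITHOUT_GROUND_TRUTH


if __name__ == "__main__":
    data = [{"question": "q", "answer": "a", "contexts": ["c"]}]  # no ground truth
    print(select_metrics(data))
```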

Project Structure

ragtrace-lite/
├── src/ragtrace_lite/          # Source code
│   ├── __init__.py
│   ├── cli.py                  # CLI interface
│   ├── config_loader.py        # Configuration management
│   ├── evaluator.py            # Evaluation engine
│   ├── llm_factory.py          # LLM integration
│   ├── db_manager.py           # Database management
│   └── report_generator.py     # Report generation
├── tests/                      # Tests
├── scripts/                    # Utility scripts
├── data/                       # Sample data
├── config.yaml                 # Default configuration
└── pyproject.toml              # Project configuration

Contributing

Contributions are welcome! See CONTRIBUTING.md for details.

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Support

Having issues or questions?

Acknowledgments

This project is based on the following open-source projects:

  • RAGAS - RAG evaluation framework
  • BGE-M3 - Multilingual embedding model
  • LangChain - LLM application framework

Changelog

See CHANGELOG.md for a full history of changes.


Made with ❤️ by ntts9990

