
Lightweight RAG evaluation framework with Korean language support, BGE-M3 embeddings, and HCX-005/Gemini LLMs


RAGTrace Lite

A lightweight RAG (Retrieval-Augmented Generation) evaluation framework with Korean language support

Korean version: README_KO.md


Overview

RAGTrace Lite is a lightweight framework for evaluating RAG system performance. It is built on the RAGAS framework and optimized for Korean-language environments.

Key Features:

  • Intelligent Metric Selection: Automatically selects 5 metrics when ground truth data is available and 4 metrics when it is not
  • Local BGE-M3 Embeddings: Offline embedding support for air-gapped environments
  • Multi-LLM Support: HCX-005 (Naver CLOVA Studio) and Gemini (Google)
  • Offline Deployment: Complete air-gapped deployment for closed networks
  • Korean Language Optimized: Native Korean language support

Quick Start

Installation from PyPI (Recommended)

# Basic installation
pip install ragtrace-lite

# Full installation (LLM + Embeddings + Enhanced features)
pip install "ragtrace-lite[all]"

# Optional installations
pip install "ragtrace-lite[llm]"        # LLM support only
pip install "ragtrace-lite[embeddings]" # Local embeddings only

Development Installation

# Clone repository and install in development mode
git clone https://github.com/ntts9990/ragtrace-lite.git
cd ragtrace-lite

# Using uv (recommended)
uv sync

# Or using pip
pip install -e ".[all]"   # quoted so shells such as zsh do not expand the brackets

API Key Configuration

Create a .env file and add your API keys:

CLOVA_STUDIO_API_KEY=nv-your-hcx-api-key
GEMINI_API_KEY=your-gemini-api-key
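
Before running an evaluation it can help to confirm that the key for the chosen LLM is actually set. The following is a minimal sketch using only the standard library; it is not part of RAGTrace Lite. The helper names `parse_env_file` and `missing_key` are hypothetical, and treating `gemini` as the LLM name paired with `GEMINI_API_KEY` is an assumption (only `hcx` appears in the commands in this README).

```python
# Hypothetical helpers, not the RAGTrace Lite API.
REQUIRED_KEYS = {
    "hcx": "CLOVA_STUDIO_API_KEY",
    "gemini": "GEMINI_API_KEY",  # assumed LLM name for the Gemini backend
}


def parse_env_file(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_key(llm, env):
    """Return the name of the unset variable for this LLM, or None if it is set."""
    name = REQUIRED_KEYS[llm]
    return None if env.get(name) else name
```

For example, `missing_key("hcx", parse_env_file(open(".env").read()))` returns `None` when the HCX key is present and non-empty.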

Run Sample Evaluation

# Run evaluation with BGE-M3 + HCX
ragtrace-lite evaluate data/sample_data.json --llm hcx

# Generate web dashboard
ragtrace-lite dashboard --open

Platform Support

  • Windows 10+ (PowerShell/CMD)
  • macOS 10.15+ (Intel/Apple Silicon)
  • Linux Ubuntu 18.04+
  • Python 3.9, 3.10, 3.11, 3.12

GPU Support: CUDA (Linux), MPS (Apple Silicon), CPU (All platforms)
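
To see which of these backends a given machine would use, a quick check like the one below may be useful. It is a sketch that assumes the embedding model runs on PyTorch, which this README does not state explicitly; `pick_device` is a hypothetical helper, not a RAGTrace Lite function. It falls back to CPU when PyTorch is not installed.

```python
def pick_device():
    """Return "cuda", "mps", or "cpu" depending on what PyTorch reports."""
    try:
        import torch  # assumption: embeddings run on PyTorch
    except ImportError:
        return "cpu"  # PyTorch is not installed, so only CPU is possible
    if torch.cuda.is_available():
        return "cuda"
    # Older PyTorch builds have no MPS backend, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(pick_device())
```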

Detailed Setup Guide: SETUP.md

Offline Deployment

RAGTrace Lite supports complete offline execution in air-gapped environments.

Quick Offline Deployment

# 1. Create deployment package (internet-connected environment)
python scripts/prepare_offline_deployment.py

# 2. Copy generated ZIP file to air-gapped PC
# dist/ragtrace-lite-offline-YYYYMMDD-HHMMSS.zip

# 3. Extract and install in air-gapped environment
scripts/install.bat

# 4. Run evaluation
scripts/run_evaluation.bat

Offline Support Features

  • Python 3.11 Auto-Install: Windows installer included
  • BGE-M3 Local Model: 2.3GB embedding model pre-downloaded
  • All Dependencies Included: Complete offline installation with wheel files
  • Automated Install Scripts: One-click installation with Windows batch files
  • Complete Manual Guide: Manual installation support when scripts fail

Air-gapped Requirements

  • OS: Windows 10+ (64bit)
  • CPU: x86_64 architecture
  • Memory: Minimum 4GB RAM (for BGE-M3 loading)
  • Storage: Minimum 5GB (Python + model + dependencies)
  • LLM: HCX-005 API (internal network host)
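
Some of the requirements above can be checked before copying the package over. The sketch below uses only the standard library and is not shipped with RAGTrace Lite; `find_problems` and `check_this_machine` are hypothetical names. It covers OS family, CPU architecture, 64-bit Python, and free disk space. It does not check the Windows version or RAM, because the standard library has no portable way to read total memory.

```python
import platform
import shutil
import sys

# 5GB storage minimum from the list above (interpreted here as GiB).
MIN_FREE_BYTES = 5 * 1024**3


def find_problems(system, machine, is_64bit, free_bytes):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if system != "Windows":
        problems.append(f"OS is {system}, expected Windows")
    if machine.lower() not in ("amd64", "x86_64"):
        problems.append(f"CPU architecture is {machine}, expected x86_64")
    if not is_64bit:
        problems.append("Python interpreter is not 64-bit")
    if free_bytes < MIN_FREE_BYTES:
        problems.append("less than 5GB of free disk space")
    return problems


def check_this_machine(path="."):
    """Collect the values for the current machine and run the checks."""
    return find_problems(
        platform.system(),
        platform.machine(),
        sys.maxsize > 2**32,
        shutil.disk_usage(path).free,
    )
```

Keeping `find_problems` free of system calls makes it easy to test with made-up values, while `check_this_machine` supplies the real ones.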

Offline Deployment Guide: OFFLINE_DEPLOYMENT.md
Manual Installation Guide: MANUAL_INSTALLATION_GUIDE.md

Key Features

  • Fast Installation & Execution: Quick start with minimal dependencies
  • Multi-LLM Support: HCX-005 (Naver CLOVA Studio) & Gemini (Google)
  • Local Embeddings: Offline embedding support via BGE-M3
  • Intelligent Metric Selection: Automatically selects 5 metrics when ground truth is available and 4 metrics when it is not
  • Complete Offline Support: Full air-gapped execution for closed networks
  • Data Storage: SQLite-based evaluation result storage and history management
  • Enhanced Reports: JSON, CSV, Markdown, Elasticsearch NDJSON format support
  • Security: Environment variable-based API key management
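
For readers unfamiliar with NDJSON, the sketch below shows the layout that the Elasticsearch bulk API expects: one action line followed by one document line per record, each terminated by a newline. Whether RAGTrace Lite's own export uses exactly this layout is not stated here, so treat this as an illustration only. The function name `to_bulk_ndjson` and the record fields are hypothetical.

```python
import json


def to_bulk_ndjson(rows, index):
    """Serialize dict rows as Elasticsearch bulk-API NDJSON for one index."""
    lines = []
    for row in rows:
        # Action line: tells Elasticsearch to index the next line as a document.
        lines.append(json.dumps({"index": {"_index": index}}))
        # Document line: ensure_ascii=False keeps Korean text readable.
        lines.append(json.dumps(row, ensure_ascii=False))
    # The bulk API requires the payload to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical record shape, for illustration only.
print(to_bulk_ndjson([{"metric": "faithfulness", "score": 0.5}], "ragtrace"))
```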

License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

Usage

CLI Commands

# Run evaluation
ragtrace-lite evaluate data.json --llm hcx

# List available datasets
ragtrace-lite list-datasets

# Generate web dashboard
ragtrace-lite dashboard --open

# Check version
ragtrace-lite version

Python API

from ragtrace_lite import RAGTraceLite
from ragtrace_lite.config_loader import load_config

# Load configuration
config = load_config()

# Initialize RAGTraceLite
rag_trace = RAGTraceLite(config)

# Run evaluation
results = rag_trace.evaluate("your_data.json")

Environment Configuration

Create a .env file and set your API keys:

# HCX-005 (Naver CLOVA Studio)
CLOVA_STUDIO_API_KEY=nv-your-hcx-api-key

# Gemini (Google)
GEMINI_API_KEY=your-gemini-api-key

Supported Metrics

With Ground Truth Data (5 Metrics)

  • Context Recall: Recall of retrieved contexts
  • Context Precision: Precision of retrieved contexts
  • Answer Correctness: Correctness of the answer
  • Answer Relevancy: Relevance of the answer to the question
  • Answer Similarity: Similarity between generated and ground truth answers

Without Ground Truth Data (4 Metrics)

  • Context Relevancy: Relevance of context to the question
  • Answer Relevancy: Relevance of answer to the question
  • Faithfulness: How faithful the answer is to the context
  • Coherence: Logical coherence of the answer
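
The selection between the two metric sets can be pictured with the sketch below. It is not the framework's actual code. The record field name `ground_truth` and the rule that every record must carry a non-empty value are assumptions made for illustration; the metric names mirror the two lists above.

```python
WITH_GROUND_TRUTH = [
    "context_recall",
    "context_precision",
    "answer_correctness",
    "answer_relevancy",
    "answer_similarity",
]
WITHOUT_GROUND_TRUTH = [
    "context_relevancy",
    "answer_relevancy",
    "faithfulness",
    "coherence",
]


def has_ground_truth(records, field="ground_truth"):
    """True only if every record has a non-empty ground-truth value (assumed rule)."""
    return bool(records) and all(
        str(record.get(field) or "").strip() for record in records
    )


def select_metrics(records):
    """Pick the 5-metric set when ground truth is present, else the 4-metric set."""
    return WITH_GROUND_TRUTH if has_ground_truth(records) else WITHOUT_GROUND_TRUTH
```

Under this rule a dataset with even one empty ground-truth entry falls back to the four reference-free metrics, which avoids scoring some rows against a missing reference.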

Project Structure

ragtrace-lite/
├── src/ragtrace_lite/          # Source code
│   ├── __init__.py
│   ├── cli.py                  # CLI interface
│   ├── config_loader.py        # Configuration management
│   ├── evaluator.py            # Evaluation engine
│   ├── llm_factory.py          # LLM integration
│   ├── db_manager.py           # Database management
│   └── report_generator.py     # Report generation
├── tests/                      # Tests
├── scripts/                    # Utility scripts
├── data/                       # Sample data
├── config.yaml                 # Default configuration
└── pyproject.toml             # Project configuration

Contributing

Contributions are welcome! See CONTRIBUTING.md for details.

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Support

Having issues or questions?

Acknowledgments

This project is based on the following open-source projects:

  • RAGAS - RAG evaluation framework
  • BGE-M3 - Multilingual embedding model
  • LangChain - LLM application framework

Changelog

See CHANGELOG.md for a full history of changes.


Made with ❤️ by ntts9990
