Privacy-first QA dataset generator for RAG evaluation. Works with local LLMs (Ollama, vLLM) or cloud providers.
Generate high-quality QA datasets to evaluate your RAG systems
Privacy-First • Lightweight • Multi-Provider • Local LLM Support
Why RAGScore?
Privacy-First Architecture
- No embeddings - Your documents never leave your machine
- Local LLM support - Use Ollama, vLLM, or any local model
- GDPR/HIPAA friendly - Suited to sensitive data, especially with local LLMs
- Zero external API calls during document processing
Lightweight & Fast
- 50 MB install - 90% smaller than alternatives (500 MB+)
- No heavy ML dependencies - No PyTorch, no TensorFlow
- Quick startup - Ready in seconds, not minutes
True Multi-Provider
- Auto-detection - Just set an API key and RAGScore picks the provider (see the sketch below)
- Switch instantly - Change providers without code changes
- Works with everything - OpenAI, Anthropic, Groq, Ollama, vLLM, and more
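A minimal sketch of how this style of auto-detection can work (illustrative only, not RAGScore's actual internals): look for known API keys first, then fall back to a reachable local Ollama server.

import os
import urllib.request

def detect_provider() -> str:
    """Illustrative detection order: explicit API keys win,
    then a local Ollama server on its default port."""
    if os.getenv("OPENAI_API_KEY"):
        return "openai"
    if os.getenv("ANTHROPIC_API_KEY"):
        return "anthropic"
    if os.getenv("DASHSCOPE_API_KEY"):
        return "dashscope"
    try:
        # Ollama listens on 11434 by default; a response means it's up.
        urllib.request.urlopen("http://localhost:11434", timeout=1)
        return "ollama"
    except OSError:
        raise RuntimeError("No provider found: set an API key or start a local LLM server")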
Developer-Friendly
- File or directory - Process single files, multiple files, or folders
- Zero configuration - No config files, no setup scripts
Local LLMs: 100% Private, 100% Free
Perfect for:
- Enterprises with sensitive data (financial, medical, legal)
- Researchers processing confidential papers
- Cost-conscious users who want zero API fees
- Offline environments without internet access
Option 1: Ollama (Recommended - Easiest)
# 1. Install Ollama
brew install ollama # or visit https://ollama.ai
# 2. Pull a model
ollama pull llama3.1 # 4.7 GB, great quality
# or
ollama pull qwen2.5:7b # 4.7 GB, excellent for QA
# or
ollama pull llama3.1:70b # 40 GB, best quality
# 3. Start Ollama
ollama serve
# 4. Use RAGScore (auto-detects Ollama!)
ragscore generate paper.pdf
That's it! No API keys, no configuration, 100% private.
Option 2: vLLM (For Production)
# 1. Install vLLM
pip install vllm
# 2. Start server with your model
vllm serve meta-llama/Llama-3.1-8B-Instruct \
    --host 0.0.0.0 \
    --port 8000
# 3. Point RAGScore to it
export LLM_BASE_URL="http://localhost:8000/v1"
ragscore generate paper.pdf
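Since vLLM exposes an OpenAI-compatible API, you can smoke-test the endpoint with the standard openai client before wiring it into RAGScore (assumes the server from step 2 is running and the openai package is installed):

from openai import OpenAI

# vLLM speaks the OpenAI API; any non-empty api_key is accepted.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Say hello in one word."}],
)
print(resp.choices[0].message.content)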
Option 3: LM Studio (GUI)
- Download LM Studio
- Load a model (llama-3.1, qwen-2.5, etc.)
- Start local server
- Use with RAGScore (auto-detected!)
Quick Start
Cloud LLMs (Fast, Requires API Key)
# 1. Install
pip install "ragscore[openai]" # or [anthropic], [dashscope]
# 2. Set API key
export OPENAI_API_KEY="sk-..."
# 3. Generate QA pairs
ragscore generate paper.pdf
Local LLMs (Private, No API Key)
# 1. Install
pip install ragscore
# 2. Start Ollama
ollama pull llama3.1 && ollama serve
# 3. Generate QA pairs (100% private!)
ragscore generate paper.pdf
Usage Examples
Single File
ragscore generate paper.pdf
Multiple Files
ragscore generate paper.pdf report.txt notes.md
Glob Patterns
ragscore generate *.pdf
ragscore generate docs/**/*.md
Directory
ragscore generate ./my_documents/
Mix Everything
ragscore generate paper.pdf ./more_docs/ *.txt
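The CLI's flexibility carries over to Python: run_pipeline (shown in the Python API section below) takes a list of paths, so you can assemble inputs however you like. A small sketch, assuming run_pipeline accepts plain file paths:

from pathlib import Path
from ragscore import run_pipeline

# Gather every PDF and Markdown file under ./my_documents
paths = [str(p) for p in Path("my_documents").rglob("*") if p.suffix in {".pdf", ".md"}]
run_pipeline(paths=paths)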
Supported Providers
Cloud Providers
| Provider | Setup | Notes |
|---|---|---|
| OpenAI | export OPENAI_API_KEY="sk-..." | Best quality, widely used |
| Anthropic | export ANTHROPIC_API_KEY="sk-ant-..." | Long context (200K tokens) |
| DashScope | export DASHSCOPE_API_KEY="..." | Qwen models |
See each provider's website for current pricing and features.
Local Providers (Private & Free!)
| Provider | Setup | Notes |
|---|---|---|
| Ollama | ollama serve | Easiest setup, great for getting started |
| vLLM | vllm serve <model> | Production-grade, high performance |
| LM Studio | GUI app | User-friendly interface |
| llama.cpp | ./server -m model.gguf | Lightweight, runs on CPU |
| LocalAI | Docker container | OpenAI-compatible API |
Switch Providers Instantly
# Monday: Use OpenAI
export OPENAI_API_KEY="sk-..."
ragscore generate paper.pdf
# Tuesday: Switch to local (more private!)
unset OPENAI_API_KEY
ollama serve
ragscore generate paper.pdf # Same command!
# Wednesday: Try Anthropic
export ANTHROPIC_API_KEY="sk-ant-..."
ragscore generate paper.pdf # Still same command!
Use Cases
Privacy-Sensitive Industries
Perfect for organizations handling confidential data:
- Healthcare - Process medical documents locally
- Legal - Analyze case files without cloud exposure
- Finance - Generate QA from internal reports
- Research - Work with unpublished papers
- Enterprise - Handle proprietary documentation
General Applications
- RAG Evaluation - Generate test datasets for your RAG system
- Documentation - Create QA pairs from technical docs
- Fine-tuning - Generate training data for model fine-tuning
- Knowledge Management - Extract Q&A from company knowledge bases
All use cases work with both cloud and local LLMs!
# Example: Process documents locally for privacy
ollama pull llama3.1
ragscore generate confidential_docs/*.pdf
# ✅ Data never leaves your infrastructure

# Example: Use a cloud LLM for best quality
export OPENAI_API_KEY="sk-..."
ragscore generate research_papers/*.pdf
# ✅ High-quality QA generation
Output Format
{
  "id": "abc123",
  "question": "What is RAG?",
  "answer": "RAG (Retrieval-Augmented Generation) combines...",
  "rationale": "This is explicitly stated in the introduction...",
  "support_span": "RAG systems retrieve relevant documents...",
  "difficulty": "easy",
  "doc_id": "xyz789",
  "source_path": "docs/rag_intro.pdf"
}
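Records in this shape are easy to consume downstream. For example, if your run writes one JSON object per line (the file name here is hypothetical; check your working directory for the actual output):

import json

# Hypothetical output path - adjust to wherever your run writes results.
with open("qa_dataset.jsonl", encoding="utf-8") as f:
    qa_pairs = [json.loads(line) for line in f]

hard = [qa for qa in qa_pairs if qa["difficulty"] == "hard"]
print(f"{len(qa_pairs)} QA pairs, {len(hard)} hard")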
From Generation to Audit (RAGScore Pro)
You've generated 1,000 QA pairs. Now what?
Generating the data is Step 1. Step 2 is proving to your auditors that your RAG system is safe.
RAGScore Pro (Enterprise) connects to your generated dataset to provide:
- Hallucination Detection - Did your RAG make things up?
- Regression Testing - Did your latest prompt change break 20% of your answers? (a do-it-yourself sketch follows this list)
- Team Dashboards - Share accuracy reports with stakeholders
- Multi-dimensional Scoring - Accuracy, relevance, completeness
- CI/CD Integration - Automated evaluation in your pipeline
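Even before reaching for Pro, the generated dataset supports a basic regression check. A minimal sketch, assuming the hypothetical qa_dataset.jsonl from the Output Format section and a hypothetical answer_with_my_rag() wrapping your own RAG system:

import json

def answer_with_my_rag(question: str) -> str:
    # Hypothetical: replace with a call into your RAG system.
    return "your RAG's answer"

with open("qa_dataset.jsonl", encoding="utf-8") as f:
    qa_pairs = [json.loads(line) for line in f]

# Crude exact-substring scoring; real setups use overlap metrics or an LLM judge.
hits = sum(
    1 for qa in qa_pairs
    if qa["answer"].lower() in answer_with_my_rag(qa["question"]).lower()
)
accuracy = hits / len(qa_pairs)
assert accuracy >= 0.8, f"RAG accuracy regressed: {accuracy:.1%}"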
Python API
from ragscore import run_pipeline, generate_qa_for_chunk
from ragscore.providers import get_provider

# Simple usage
run_pipeline(paths=["paper.pdf", "report.txt"])

# Use local Ollama
provider = get_provider("ollama", model="llama3.1")
qas = generate_qa_for_chunk(
    chunk_text="Your text here...",
    difficulty="hard",
    n=5,
    provider=provider,
)

# Use local vLLM
provider = get_provider(
    "openai",  # vLLM is OpenAI-compatible
    base_url="http://localhost:8000/v1",
    api_key="not-needed",
)
qas = generate_qa_for_chunk(
    chunk_text="Your text here...",
    difficulty="medium",
    n=3,
    provider=provider,
)
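The returned items can then be filtered or written out. A short follow-up, assuming each item is a dict shaped like the Output Format above (an assumption; check your version's return type):

import json

# Assuming each QA item is a dict shaped like the Output Format above.
for qa in qas:
    print(qa["difficulty"], "-", qa["question"])

with open("hard_qas.jsonl", "w", encoding="utf-8") as f:
    for qa in qas:
        if qa["difficulty"] == "hard":
            f.write(json.dumps(qa) + "\n")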
Configuration
RAGScore works with zero configuration, but you can customize:
# Optional: Customize chunk size
export RAGSCORE_CHUNK_SIZE=512
# Optional: Questions per chunk
export RAGSCORE_QUESTIONS_PER_CHUNK=5
# Optional: Working directory
export RAGSCORE_WORK_DIR=/path/to/workspace
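These are ordinary environment variables, so they can also be set from Python before a run; a minimal sketch, assuming RAGScore reads them at pipeline start:

import os
from ragscore import run_pipeline

# Same knobs as the exports above.
os.environ["RAGSCORE_CHUNK_SIZE"] = "512"
os.environ["RAGSCORE_QUESTIONS_PER_CHUNK"] = "5"

run_pipeline(paths=["paper.pdf"])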
Privacy & Security
What Data Stays Local?
- ✅ Your documents - Never sent to embedding APIs
- ✅ Document chunks - Processed locally
- ✅ File metadata - Stays on your machine
What Data Is Sent to the LLM?
- ⚠️ Text chunks only - Sent to the LLM for QA generation
- ✅ With local LLMs - Even these stay on your machine!
Compliance
- ✅ GDPR compliant - No data sent to third parties (with local LLMs)
- ✅ HIPAA friendly - Use local LLMs for PHI
- ✅ SOC 2 ready - Full data control with local deployment
Development
# Clone repository
git clone https://github.com/HZYAI/RagScore.git
cd RagScore
# Install with dev dependencies
pip install -e ".[dev,all]"
# Run tests
pytest
# Run linting
ruff check src/
black --check src/
Contributing
We welcome contributions! See CONTRIBUTING.md for guidelines.
License
Apache 2.0 License - see LICENSE for details.
⭐ Star us on GitHub if RAGScore helps you!
Made with ❤️ for the RAG community