Generate QA datasets & evaluate RAG systems in 2 commands. Privacy-first, works with any LLM (local or cloud). Async, fast, zero config.
🔒 Privacy-First • ⚡ Async & Fast • 🤖 Any LLM • 🏠 Local or Cloud
⚡ 2-Line RAG Evaluation
# Step 1: Generate QA pairs from your docs
ragscore generate docs/
# Step 2: Evaluate your RAG system
ragscore evaluate http://localhost:8000/query
That's it. The evaluate command prints an accuracy summary and lists the QA pairs your RAG system answered incorrectly:
============================================================
✅ EXCELLENT: 85/100 correct (85.0%)
Average Score: 4.20/5.0
============================================================
❌ 15 Incorrect Pairs:
1. Q: "What is RAG?"
Score: 2/5 - Factually incorrect
2. Q: "How does retrieval work?"
Score: 3/5 - Incomplete answer
🚀 Quick Start
Install
pip install ragscore # Core (works with Ollama)
pip install "ragscore[openai]" # + OpenAI support
pip install "ragscore[all]" # + All providers
Generate QA Pairs
# Set API key (or use local Ollama - no key needed!)
export OPENAI_API_KEY="sk-..."
# Generate from any document
ragscore generate paper.pdf
ragscore generate docs/*.pdf --concurrency 10
Evaluate Your RAG
# Point to your RAG endpoint
ragscore evaluate http://localhost:8000/query
# Custom options
ragscore evaluate http://api/ask --model gpt-4o --output results.json
🏠 100% Private with Local LLMs
# Use Ollama - no API keys, no cloud, 100% private
ollama pull llama3.1
ragscore generate confidential_docs/*.pdf
ragscore evaluate http://localhost:8000/query
Perfect for: Healthcare 🏥 • Legal ⚖️ • Finance 🏦 • Research 🔬
🔌 Supported LLMs
| Provider | Setup | Notes |
|---|---|---|
| Ollama | ollama serve | Local, free, private |
| OpenAI | export OPENAI_API_KEY="sk-..." | Best quality |
| Anthropic | export ANTHROPIC_API_KEY="..." | Long context |
| DashScope | export DASHSCOPE_API_KEY="..." | Qwen models |
| vLLM | export LLM_BASE_URL="..." | Production-grade |
| Any OpenAI-compatible | export LLM_BASE_URL="..." | Groq, Together, etc. |
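Before running `ragscore`, it can help to check which of the providers above the current shell is set up for. The sketch below only inspects the environment variables named in the table. `configured_providers` is a hypothetical helper written for this illustration, and it does not describe how RAGScore itself selects a provider.

```python
import os

# Environment variables from the table above, mapped to the providers they enable.
PROVIDER_ENV_VARS = {
    "OPENAI_API_KEY": ["OpenAI"],
    "ANTHROPIC_API_KEY": ["Anthropic"],
    "DASHSCOPE_API_KEY": ["DashScope"],
    "LLM_BASE_URL": ["vLLM", "Any OpenAI-compatible"],
}


def configured_providers(env=None):
    """Return providers whose environment variable is set to a non-empty value."""
    env = os.environ if env is None else env
    found = []
    for var, providers in PROVIDER_ENV_VARS.items():
        if env.get(var, "").strip():
            found.extend(providers)
    return found


if __name__ == "__main__":
    names = configured_providers()
    # Ollama needs no key, so it cannot be detected this way.
    print("Configured providers:", ", ".join(names) if names else "none (Ollama needs no key)")
```

Because vLLM and other OpenAI-compatible servers share `LLM_BASE_URL`, the helper reports both when that variable is set.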
📊 Output Formats
Generated QA Pairs (output/generated_qas.jsonl)
{
"id": "abc123",
"question": "What is RAG?",
"answer": "RAG (Retrieval-Augmented Generation) combines...",
"rationale": "This is explicitly stated in the introduction...",
"support_span": "RAG systems retrieve relevant documents...",
"difficulty": "medium",
"source_path": "docs/rag_intro.pdf"
}
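The generated file can be inspected with a few lines of Python. The sketch below assumes the usual JSONL layout of one JSON object per line (the record above is shown pretty-printed for readability) and uses only the field names from that record. `load_qa_pairs` and `count_by_difficulty` are hypothetical helpers, not part of the RAGScore API.

```python
import json
from collections import Counter
from pathlib import Path


def load_qa_pairs(path):
    """Read a JSONL file: one JSON object per non-empty line."""
    pairs = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                pairs.append(json.loads(line))
    return pairs


def count_by_difficulty(pairs):
    """Count QA pairs per value of the `difficulty` field."""
    return Counter(pair.get("difficulty", "unknown") for pair in pairs)


if __name__ == "__main__":
    qa_file = Path("output/generated_qas.jsonl")
    if qa_file.exists():
        pairs = load_qa_pairs(qa_file)
        print(f"{len(pairs)} QA pairs")
        for level, count in count_by_difficulty(pairs).most_common():
            print(f"  {level}: {count}")
    else:
        print(f"{qa_file} not found; run `ragscore generate` first")
```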
Evaluation Results (--output results.json)
{
"summary": {
"total": 100,
"correct": 85,
"incorrect": 15,
"accuracy": 0.85,
"avg_score": 4.2
},
"incorrect_pairs": [
{
"question": "What is RAG?",
"golden_answer": "RAG combines retrieval with generation...",
"rag_answer": "RAG is a database system.",
"score": 2,
"reason": "Factually incorrect - RAG is not a database"
}
]
}
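A results file shaped like the example above is easy to post-process. The following sketch recomputes accuracy from the `correct` and `total` counts and orders the incorrect pairs from lowest score upward. `summarize_results` is a hypothetical helper for illustration, and the sample data simply repeats the example record.

```python
import json


def summarize_results(results):
    """Return (accuracy, worst_pairs) from a results dict shaped like the example."""
    summary = results["summary"]
    # Recompute accuracy from the counts rather than trusting the stored value.
    accuracy = summary["correct"] / summary["total"] if summary["total"] else 0.0
    # Lowest-scoring answers first, so the worst failures are reviewed first.
    worst_pairs = sorted(results.get("incorrect_pairs", []), key=lambda p: p["score"])
    return accuracy, worst_pairs


if __name__ == "__main__":
    sample = json.loads("""
    {
      "summary": {"total": 100, "correct": 85, "incorrect": 15,
                  "accuracy": 0.85, "avg_score": 4.2},
      "incorrect_pairs": [
        {"question": "What is RAG?",
         "golden_answer": "RAG combines retrieval with generation...",
         "rag_answer": "RAG is a database system.",
         "score": 2,
         "reason": "Factually incorrect - RAG is not a database"}
      ]
    }
    """)
    accuracy, worst = summarize_results(sample)
    print(f"Accuracy: {accuracy:.1%}")
    for pair in worst:
        print(f"  [{pair['score']}/5] {pair['question']} ({pair['reason']})")
```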
🧪 Python API
from ragscore import run_pipeline, run_evaluation
# Generate QA pairs
run_pipeline(paths=["docs/"], concurrency=10)
# Evaluate RAG
results = run_evaluation(
endpoint="http://localhost:8000/query",
model="gpt-4o", # LLM for judging
)
print(f"Accuracy: {results.accuracy:.1%}")
🤖 AI Agent Integration
RAGScore is designed for AI agents and automation:
# Structured CLI with predictable output
ragscore generate docs/ --concurrency 5
ragscore evaluate http://api/query --output results.json
# Exit codes: 0 = success, 1 = error
# JSON output for programmatic parsing
CLI Reference:
| Command | Description |
|---|---|
| ragscore generate <paths> | Generate QA pairs from documents |
| ragscore evaluate <endpoint> | Evaluate RAG against golden QAs |
| ragscore --help | Show all commands and options |
| ragscore generate --help | Show generate options |
| ragscore evaluate --help | Show evaluate options |
⚙️ Configuration
Zero config required. Optional environment variables:
export RAGSCORE_CHUNK_SIZE=512 # Chunk size for documents
export RAGSCORE_QUESTIONS_PER_CHUNK=5 # QAs per chunk
export RAGSCORE_WORK_DIR=/path/to/dir # Working directory
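When RAGScore is launched from a script instead of an interactive shell, the same variables can be set per run without changing the parent environment. This is a minimal sketch: `ragscore_env` is a hypothetical helper, and only the three variable names come from the list above. Arguments left unset are not written, so whatever defaults the tool uses still apply.

```python
import os


def ragscore_env(chunk_size=None, questions_per_chunk=None, work_dir=None, base_env=None):
    """Return a copy of the environment with the given RAGScore variables set.

    Arguments left as None are not touched.
    """
    env = dict(os.environ if base_env is None else base_env)
    overrides = {
        "RAGSCORE_CHUNK_SIZE": chunk_size,
        "RAGSCORE_QUESTIONS_PER_CHUNK": questions_per_chunk,
        "RAGSCORE_WORK_DIR": work_dir,
    }
    for name, value in overrides.items():
        if value is not None:
            env[name] = str(value)  # environment values must be strings
    return env


if __name__ == "__main__":
    env = ragscore_env(chunk_size=512, questions_per_chunk=5)
    for name in sorted(env):
        if name.startswith("RAGSCORE_"):
            print(f"{name}={env[name]}")
    # Pass the mapping to a child process, e.g. subprocess.run([...], env=env).
```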
🔐 Privacy & Security
| Data | Cloud LLM | Local LLM |
|---|---|---|
| Documents | ✅ Local | ✅ Local |
| Text chunks | ⚠️ Sent to LLM | ✅ Local |
| Generated QAs | ✅ Local | ✅ Local |
| Evaluation results | ✅ Local | ✅ Local |
Compliance: GDPR ✅ • HIPAA ✅ (with local LLMs) • SOC 2 ✅
🧪 Development
git clone https://github.com/HZYAI/RagScore.git
cd RagScore
pip install -e ".[dev,all]"
pytest
🔗 Links
- GitHub • PyPI • Issues • Discussions
⭐ Star us on GitHub if RAGScore helps you!
Made with ❤️ for the RAG community
File details
Details for the file ragscore-0.5.1.tar.gz.
File metadata
- Download URL: ragscore-0.5.1.tar.gz
- Upload date:
- Size: 39.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f10520c576a89d5bca943d632ab9136c6d2c37efc4ea591dc989a7f812922a4c |
| MD5 | 9b593ae759eefaa42d214c9e6d7ce350 |
| BLAKE2b-256 | d832caefd069ff9073f6876cbd6a313dfe1394633f2d2fb09a74a761893d5fca |
Provenance
The following attestation bundles were made for ragscore-0.5.1.tar.gz:
Publisher: ci.yml on HZYAI/RagScore
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ragscore-0.5.1.tar.gz
- Subject digest: f10520c576a89d5bca943d632ab9136c6d2c37efc4ea591dc989a7f812922a4c
- Sigstore transparency entry: 842967833
- Permalink: HZYAI/RagScore@af8e8467fc01bc98320638f79c99bbe901570bbd
- Branch / Tag: refs/tags/v0.5.1
- Owner: https://github.com/HZYAI
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: ci.yml@af8e8467fc01bc98320638f79c99bbe901570bbd
- Trigger Event: push
File details
Details for the file ragscore-0.5.1-py3-none-any.whl.
File metadata
- Download URL: ragscore-0.5.1-py3-none-any.whl
- Upload date:
- Size: 39.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | cd19c93e4f5c714e6fe7d5702248740cb8883eedc4f0690897019bf6f51433ab |
| MD5 | d4148f26faccb80c4c3acf25c25b818c |
| BLAKE2b-256 | cd6856f5a186269d5b3d6fb7ee967e031bcdef833d404a016e924662c0110f1e |
Provenance
The following attestation bundles were made for ragscore-0.5.1-py3-none-any.whl:
Publisher: ci.yml on HZYAI/RagScore
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ragscore-0.5.1-py3-none-any.whl
- Subject digest: cd19c93e4f5c714e6fe7d5702248740cb8883eedc4f0690897019bf6f51433ab
- Sigstore transparency entry: 842967859
- Permalink: HZYAI/RagScore@af8e8467fc01bc98320638f79c99bbe901570bbd
- Branch / Tag: refs/tags/v0.5.1
- Owner: https://github.com/HZYAI
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: ci.yml@af8e8467fc01bc98320638f79c99bbe901570bbd
- Trigger Event: push