RAG evaluation system using Ragas with Phoenix/Langfuse tracing
EvalVault
EvalVault is a CLI + Web UI platform that ties the Eval → Analysis → Tracing → Improvement loop for RAG (Retrieval-Augmented Generation) systems into a single workflow.
English version? See README.en.md.
Quickstart (CLI)
uv sync --extra dev
cp .env.example .env
uv run evalvault run --mode simple tests/fixtures/e2e/insurance_qa_korean.json \
  --metrics faithfulness,answer_relevancy \
  --profile dev \
  --auto-analyze
Tip: The default storage backend is Postgres+pgvector. To use SQLite, pass --db, or set DB_BACKEND=sqlite together with EVALVAULT_DB_PATH.
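Key features
- End-to-end evaluation loop: run Eval → Analysis → Tracing → Improvement as a single flow
- Dataset-centered operation: pass criteria (thresholds) are kept in the dataset
- Artifacts-first: stores structured raw results per module, not just reports
- Optional observability: Phoenix/Langfuse/MLflow are enabled only when needed
- CLI + Web UI: history, comparison, and reports are unified under the same run_id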
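To illustrate the dataset-centered idea, here is a minimal sketch in which pass thresholds travel with the dataset and are applied to metric scores. The dictionary layout, the `thresholds` key, the threshold values, and the scores are all hypothetical illustrations, not EvalVault's actual dataset schema or real results; only the metric names are taken from the Quickstart command.

```python
# Hypothetical dataset: thresholds stored alongside the data (not EvalVault's real schema).
dataset = {
    "name": "example-dataset",
    "thresholds": {"faithfulness": 0.8, "answer_relevancy": 0.7},
}

# Made-up scores standing in for the output of an evaluation run.
scores = {"faithfulness": 0.85, "answer_relevancy": 0.65}


def check_thresholds(scores: dict[str, float], thresholds: dict[str, float]) -> dict[str, bool]:
    """Return, per metric, whether the score meets the dataset's threshold."""
    # A metric with no score is treated as 0.0, so it fails any positive threshold.
    return {metric: scores.get(metric, 0.0) >= limit for metric, limit in thresholds.items()}


results = check_thresholds(scores, dataset["thresholds"])
print(results)  # {'faithfulness': True, 'answer_relevancy': False}
```

Keeping the thresholds next to the data means the same pass criteria apply no matter which run or interface produced the scores.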
Documentation hub
- Documentation index: docs/INDEX.md
- Handbook (textbook-style): docs/handbook/INDEX.md
- External summary: docs/handbook/EXTERNAL.md
- Operations guide (local/Docker/observability/runbook): docs/handbook/CHAPTERS/04_operations.md
- Workflows (run/analyze/compare/regression): docs/handbook/CHAPTERS/03_workflows.md
- Quality/testing/CI: docs/handbook/CHAPTERS/06_quality_and_testing.md
- Architecture: docs/handbook/CHAPTERS/01_architecture.md
- Offline/air-gapped (Docker/model cache): docs/guides/OFFLINE_DOCKER.md, docs/guides/OFFLINE_MODELS.md
Note (compatibility): Some documents, such as docs/guides/USER_GUIDE.md and docs/guides/DEV_GUIDE.md, are deprecated stubs kept so that old links still work; the handbook holds the current content.
Web UI
# API
uv run evalvault serve-api --reload
# Frontend
cd frontend
npm install
npm run dev
Open http://localhost:5173 in a browser, then use Evaluation Studio to check runs, history, and reports.
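Offline / air-gapped environments
- Docker image bundle: docs/guides/OFFLINE_DOCKER.md
- NLP model cache bundle: docs/guides/OFFLINE_MODELS.md
LLM models are managed by the infrastructure inside the air-gapped network; the EvalVault bundle includes only the NLP model cache used for analysis.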
Contributing
uv run ruff check src/ tests/
uv run ruff format src/ tests/
uv run pytest tests -v
- Contribution guide: CONTRIBUTING.md
- Development/testing routine: AGENTS.md, docs/handbook/CHAPTERS/06_quality_and_testing.md
License
EvalVault is licensed under the Apache 2.0 license.
File details
Details for the file evalvault-1.77.0.tar.gz.
File metadata
- Download URL: evalvault-1.77.0.tar.gz
- Upload date:
- Size: 2.1 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2fceef3b5a1b8c12c75d316440443a66107d5c7ea1fd4939c1fcf9f99822e41a |
| MD5 | 2372f2530639b02d4ca5c8c088d10916 |
| BLAKE2b-256 | 681990646d31e58f5559c72f2c55f06a4ac0ddeff1bc480ab8d4e5773fb12a0d |
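A downloaded archive can be checked against the published SHA256 digest with Python's standard `hashlib`. The sketch below is one way to do it, not part of EvalVault; the helper names `sha256_of` and `verify` are hypothetical, and the local file path is whatever location the archive was saved to.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed for evalvault-1.77.0.tar.gz.
EXPECTED_SHA256 = "2fceef3b5a1b8c12c75d316440443a66107d5c7ea1fd4939c1fcf9f99822e41a"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read fixed-size chunks until read() returns empty bytes at end of file.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Return True when the file's digest matches the expected hex string."""
    return sha256_of(path) == expected.lower()
```

For example, `verify(Path("evalvault-1.77.0.tar.gz"))` returns `True` only if the local file matches the listed digest; pass the wheel's SHA256 as `expected` to check the built distribution instead.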
Provenance
The following attestation bundles were made for evalvault-1.77.0.tar.gz:
Publisher: release.yml on ntts9990/EvalVault
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: evalvault-1.77.0.tar.gz
  - Subject digest: 2fceef3b5a1b8c12c75d316440443a66107d5c7ea1fd4939c1fcf9f99822e41a
- Sigstore transparency entry: 913797641
- Sigstore integration time:
- Permalink: ntts9990/EvalVault@c9074a7e79019dd5ae7804234a2ec3a301c5bfe0
- Branch / Tag: refs/heads/main
- Owner: https://github.com/ntts9990
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@c9074a7e79019dd5ae7804234a2ec3a301c5bfe0
- Trigger Event: push
File details
Details for the file evalvault-1.77.0-py3-none-any.whl.
File metadata
- Download URL: evalvault-1.77.0-py3-none-any.whl
- Upload date:
- Size: 893.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 928b733fbc91a67d680b6903d44bd7f4a0893a15a3d49bb3435bc460bbbbdfd7 |
| MD5 | 7085197eb34f4868a87566b00acf6c09 |
| BLAKE2b-256 | 155673b55d6e4919878e3fa08399167e68d2cd73de93601c0dc061993c09eeda |
Provenance
The following attestation bundles were made for evalvault-1.77.0-py3-none-any.whl:
Publisher: release.yml on ntts9990/EvalVault
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: evalvault-1.77.0-py3-none-any.whl
  - Subject digest: 928b733fbc91a67d680b6903d44bd7f4a0893a15a3d49bb3435bc460bbbbdfd7
- Sigstore transparency entry: 913797701
- Sigstore integration time:
- Permalink: ntts9990/EvalVault@c9074a7e79019dd5ae7804234a2ec3a301c5bfe0
- Branch / Tag: refs/heads/main
- Owner: https://github.com/ntts9990
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@c9074a7e79019dd5ae7804234a2ec3a301c5bfe0
- Trigger Event: push