Krag

A Python package for RAG performance evaluation.
Krag is a Python package designed for evaluating retrieval-augmented generation (RAG) systems. It provides tools to calculate evaluation metrics such as hit rate, Recall@k, Precision@k, MRR (Mean Reciprocal Rank), MAP (Mean Average Precision), NDCG (Normalized Discounted Cumulative Gain), and more.
Installation
You can install Krag using pip:
pip install krag
Usage
Here is a simple example of how to use the KragDocument and OfflineRetrievalEvaluators classes provided by this package.
from krag.document import KragDocument
from krag.evaluators import OfflineRetrievalEvaluators

# Ground-truth (relevant) documents for each query
actual_docs = [
    # Query 1
    [
        KragDocument(metadata={'id': 1}, page_content='1'),
        KragDocument(metadata={'id': 2}, page_content='2'),
        KragDocument(metadata={'id': 3}, page_content='3'),
    ],
    # Query 2
    [
        KragDocument(metadata={'id': 4}, page_content='4'),
        KragDocument(metadata={'id': 5}, page_content='5'),
        KragDocument(metadata={'id': 6}, page_content='6'),
    ],
    # Query 3
    [
        KragDocument(metadata={'id': 7}, page_content='7'),
        KragDocument(metadata={'id': 8}, page_content='8'),
        KragDocument(metadata={'id': 9}, page_content='9'),
    ],
]

# Retrieved (predicted) documents for each query
predicted_docs = [
    # Query 1
    [
        KragDocument(metadata={'id': 1}, page_content='1'),
        KragDocument(metadata={'id': 4}, page_content='4'),
        KragDocument(metadata={'id': 7}, page_content='7'),
        KragDocument(metadata={'id': 2}, page_content='2'),
        KragDocument(metadata={'id': 5}, page_content='5'),
        KragDocument(metadata={'id': 8}, page_content='8'),
        KragDocument(metadata={'id': 3}, page_content='3'),
        KragDocument(metadata={'id': 6}, page_content='6'),
        KragDocument(metadata={'id': 9}, page_content='9'),
    ],
    # Query 2
    [
        KragDocument(metadata={'id': 4}, page_content='4'),
        KragDocument(metadata={'id': 1}, page_content='1'),
        KragDocument(metadata={'id': 7}, page_content='7'),
        KragDocument(metadata={'id': 5}, page_content='5'),
        KragDocument(metadata={'id': 2}, page_content='2'),
        KragDocument(metadata={'id': 8}, page_content='8'),
        KragDocument(metadata={'id': 6}, page_content='6'),
        KragDocument(metadata={'id': 3}, page_content='3'),
        KragDocument(metadata={'id': 9}, page_content='9'),
    ],
    # Query 3
    [
        KragDocument(metadata={'id': 7}, page_content='7'),
        KragDocument(metadata={'id': 2}, page_content='2'),
        KragDocument(metadata={'id': 4}, page_content='4'),
        KragDocument(metadata={'id': 8}, page_content='8'),
        KragDocument(metadata={'id': 5}, page_content='5'),
        KragDocument(metadata={'id': 3}, page_content='3'),
        KragDocument(metadata={'id': 9}, page_content='9'),
        KragDocument(metadata={'id': 6}, page_content='6'),
        KragDocument(metadata={'id': 1}, page_content='1'),
    ],
]

# Initialize the evaluator
evaluator = OfflineRetrievalEvaluators(actual_docs, predicted_docs, match_method="text")

# Calculate evaluation metrics
hit_rate = evaluator.calculate_hit_rate()
mrr = evaluator.calculate_mrr()
recall_at_3 = evaluator.calculate_recall_k(k=3)
precision_at_5 = evaluator.calculate_precision_k(k=5)
map_at_5 = evaluator.calculate_map_k(k=5)
ndcg_at_5 = evaluator.ndcg_at_k(k=5)

# Print results
print(f"Hit Rate: {hit_rate}")
print(f"MRR: {mrr}")
print(f"Recall@3: {recall_at_3}")
print(f"Precision@5: {precision_at_5}")
print(f"MAP@5: {map_at_5}")
print(f"NDCG@5: {ndcg_at_5}")
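With match_method="text", a predicted document presumably counts as a hit when its page_content exactly equals the page_content of some ground-truth document for the same query. The following is a minimal sketch of that matching rule (assumed behavior for illustration, not Krag's internal code):

```python
# Assumed behavior (illustration only): exact text matching compares
# page_content strings between ground-truth and predicted documents.
actual_texts = {"doc about cats", "doc about dogs"}
predicted_texts = ["doc about cats", "doc about birds"]

# A predicted document is a hit when its text appears in the ground-truth set.
hits = [t for t in predicted_texts if t in actual_texts]
print(hits)  # ['doc about cats']
```

Under this rule, a query contributes to the hit rate as soon as at least one of its predicted documents matches a ground-truth document.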
Key Features

- Document Matching: The evaluator provides multiple methods to match actual and predicted documents, including exact text matching and ROUGE-based matching (rouge1, rouge2, rougeL).
- Evaluation Metrics:
  - Hit Rate: the proportion of actual documents correctly identified in the predicted set.
  - Recall@k: how many of the relevant documents appear in the top-k predictions.
  - Precision@k: the precision of the top-k predictions.
  - MRR (Mean Reciprocal Rank): the average of the reciprocal rank of the first relevant document.
  - MAP@k (Mean Average Precision at k): the average of the precision values at the top-k ranks where relevant documents appear.
  - NDCG@k (Normalized Discounted Cumulative Gain at k): ranking quality that accounts for the order of documents based on relevance scores, with softmax normalization applied when ROUGE scores are used.
- ROUGE Score Matching: The RougeOfflineRetrievalEvaluators class extends the basic evaluator to use ROUGE scores (rouge1, rouge2, rougeL) for matching and for evaluating retrieval quality.
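To make the metric definitions above concrete, here are hand-rolled re-implementations of Recall@k, Precision@k, and reciprocal rank on toy data. These are illustrative only and are not Krag's internal code; doc IDs like "d1" are made up for the example:

```python
# Illustrative re-implementations of the rank metrics described above.
def recall_at_k(actual, predicted, k):
    """Fraction of relevant documents that appear in the top-k predictions."""
    top_k = set(predicted[:k])
    return len(top_k & set(actual)) / len(actual)

def precision_at_k(actual, predicted, k):
    """Fraction of the top-k predictions that are relevant."""
    relevant = set(actual)
    return sum(1 for doc in predicted[:k] if doc in relevant) / k

def reciprocal_rank(actual, predicted):
    """1 / rank of the first relevant document, or 0 if none is retrieved."""
    relevant = set(actual)
    for rank, doc in enumerate(predicted, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0

actual = ["d1", "d2", "d3"]
predicted = ["d1", "d4", "d2", "d5", "d3"]
print(recall_at_k(actual, predicted, 3))     # 2 of 3 relevant docs in top 3
print(precision_at_k(actual, predicted, 5))  # 3 of the top 5 are relevant
print(reciprocal_rank(actual, predicted))    # first hit at rank 1 -> 1.0
```

MRR and MAP@k are then just these per-query values averaged over all queries.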
Example with ROUGE Matching
from krag.document import KragDocument
from krag.evaluators import RougeOfflineRetrievalEvaluators
# Initialize the evaluator with ROUGE matching
evaluator = RougeOfflineRetrievalEvaluators(actual_docs, predicted_docs, match_method="rouge1", threshold=0.8)
# Calculate evaluation metrics / 평가지표 계산
hit_rate = evaluator.calculate_hit_rate()
mrr = evaluator.calculate_mrr()
recall_at_3 = evaluator.calculate_recall_k(k=3)
precision_at_5 = evaluator.calculate_precision_k(k=5)
map_at_5 = evaluator.calculate_map_k(k=5)
ndcg_at_5 = evaluator.ndcg_at_k(k=5)
# Print results / 결과 출력
print(f"Hit Rate: {hit_rate}")
print(f"MRR: {mrr}")
print(f"Recall@3: {recall_at_3}")
print(f"Precision@5: {precision_at_5}")
print(f"MAP@5: {map_at_5}")
print(f"NDCG@5: {ndcg_at_5}")
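The idea behind match_method="rouge1" with a threshold can be sketched as follows. This assumes, purely for illustration, that matching means unigram-overlap F1 at or above the threshold; Krag's actual scorer may use a dedicated ROUGE library and differ in details such as stemming and tokenization:

```python
# Hand-rolled ROUGE-1-style F1 for illustration (not Krag's implementation).
from collections import Counter

def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate text."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

def is_match(reference: str, candidate: str, threshold: float = 0.8) -> bool:
    """A document pair matches when its ROUGE-1 F1 reaches the threshold."""
    return rouge1_f1(reference, candidate) >= threshold

print(rouge1_f1("the quick brown fox", "the quick brown fox jumps"))  # ~0.889
print(is_match("the quick brown fox", "the quick brown fox jumps"))   # True
```

Once each actual/predicted pair has a boolean match decision like this, the same rank metrics (hit rate, Recall@k, MRR, and so on) apply unchanged.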
License
This project is licensed under the MIT License. See the [MIT License](https://opensource.org/licenses/MIT) for details.
Contact
If you have any questions, feel free to reach out via email.