
A Python package for RAG performance evaluation


Krag

Krag is a Python package designed for evaluating retrieval-augmented generation (RAG) systems. It provides tools to calculate various evaluation metrics such as hit rate, recall@k, precision@k, MRR (Mean Reciprocal Rank), MAP (Mean Average Precision), NDCG (Normalized Discounted Cumulative Gain), and more.


Installation

You can install Krag using pip:

pip install krag


Usage

Here is a simple example of how to use the KragDocument and OfflineRetrievalEvaluators classes provided by this package.


from krag.document import KragDocument
from krag.evaluators import OfflineRetrievalEvaluators

# Ground-truth (relevant) documents for each query
actual_docs = [
    #  Query 1
    [
        KragDocument(metadata={'id': 1}, page_content='1'),
        KragDocument(metadata={'id': 2}, page_content='2'),
        KragDocument(metadata={'id': 3}, page_content='3'),
    ],
    #  Query 2
    [
        KragDocument(metadata={'id': 4}, page_content='4'),
        KragDocument(metadata={'id': 5}, page_content='5'),
        KragDocument(metadata={'id': 6}, page_content='6'),
    ],
    #  Query 3
    [
        KragDocument(metadata={'id': 7}, page_content='7'),
        KragDocument(metadata={'id': 8}, page_content='8'),
        KragDocument(metadata={'id': 9}, page_content='9'),
    ],
]


# Retrieved (predicted) documents for each query
predicted_docs = [
    #  Query 1
    [
        KragDocument(metadata={'id': 1}, page_content='1'),
        KragDocument(metadata={'id': 4}, page_content='4'),
        KragDocument(metadata={'id': 7}, page_content='7'),
        KragDocument(metadata={'id': 2}, page_content='2'),
        KragDocument(metadata={'id': 5}, page_content='5'),
        KragDocument(metadata={'id': 8}, page_content='8'),
        KragDocument(metadata={'id': 3}, page_content='3'),
        KragDocument(metadata={'id': 6}, page_content='6'),
        KragDocument(metadata={'id': 9}, page_content='9')
    ],

    #  Query 2
    [
        KragDocument(metadata={'id': 4}, page_content='4'),
        KragDocument(metadata={'id': 1}, page_content='1'),
        KragDocument(metadata={'id': 7}, page_content='7'),
        KragDocument(metadata={'id': 5}, page_content='5'),
        KragDocument(metadata={'id': 2}, page_content='2'),
        KragDocument(metadata={'id': 8}, page_content='8'),
        KragDocument(metadata={'id': 6}, page_content='6'),
        KragDocument(metadata={'id': 3}, page_content='3'),
        KragDocument(metadata={'id': 9}, page_content='9')
    ],
    
    #  Query 3
    [
        KragDocument(metadata={'id': 7}, page_content='7'),
        KragDocument(metadata={'id': 2}, page_content='2'),
        KragDocument(metadata={'id': 4}, page_content='4'),
        KragDocument(metadata={'id': 8}, page_content='8'),
        KragDocument(metadata={'id': 5}, page_content='5'),
        KragDocument(metadata={'id': 3}, page_content='3'),
        KragDocument(metadata={'id': 9}, page_content='9'),
        KragDocument(metadata={'id': 6}, page_content='6'),
        KragDocument(metadata={'id': 1}, page_content='1')
    ]
]


# Initialize the evaluator
evaluator = OfflineRetrievalEvaluators(actual_docs, predicted_docs, match_method="rouge1", threshold=0.8)

# Calculate evaluation metrics
hit_rate = evaluator.calculate_hit_rate()
mrr = evaluator.calculate_mrr()
recall_at_3 = evaluator.calculate_recall_k(k=3)
precision_at_5 = evaluator.calculate_precision_k(k=5)
map_at_5 = evaluator.calculate_map_k(k=5)
ndcg_at_5 = evaluator.ndcg_at_k(k=5)

# Print results
print(f"Hit Rate: {hit_rate}")
print(f"MRR: {mrr}")
print(f"Recall@3: {recall_at_3}")
print(f"Precision@5: {precision_at_5}")
print(f"MAP@5: {map_at_5}")
print(f"NDCG@5: {ndcg_at_5}")

Key Features

  1. Document Matching (see the matching sketch after this list):

    • The evaluator provides multiple methods for matching actual and predicted documents, including exact text matching and ROUGE-based matching (rouge1, rouge2, rougeL).
  2. Evaluation Metrics (hand-computed examples follow this list):

    • Hit Rate: Measures the proportion of actual documents correctly identified in the predicted set.
    • Recall@k: Measures how many of the relevant documents appear in the top-k predictions.
    • Precision@k: Measures the proportion of the top-k predictions that are relevant.
    • MRR (Mean Reciprocal Rank): Averages the reciprocal rank of the first relevant document across queries.
    • MAP@k (Mean Average Precision at k): Averages the precision at each rank within the top k at which a relevant document appears.
    • NDCG@k (Normalized Discounted Cumulative Gain at k): Evaluates ranking quality based on relevance scores and document order, with softmax normalization applied when ROUGE scores are used.
  3. BM25 Integration:

    • Integrates with BM25 retrievers for scoring and evaluating document retrieval performance.
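
As a reference for how ROUGE-based matching works, the snippet below sketches a thresholded ROUGE-1 match between an actual and a predicted document using the rouge-score package. This is an illustration of the idea only and is not necessarily the backend Krag uses internally.

from rouge_score import rouge_scorer

# Score predicted text against the ground-truth text with ROUGE-1
scorer = rouge_scorer.RougeScorer(["rouge1"], use_stemmer=False)

def is_match(actual_text: str, predicted_text: str, threshold: float = 0.8) -> bool:
    # A pair counts as a match when the ROUGE-1 F1 score reaches the threshold
    f1 = scorer.score(actual_text, predicted_text)["rouge1"].fmeasure
    return f1 >= threshold

# 4 of 5 predicted unigrams overlap with the reference: F1 ≈ 0.89 >= 0.8
print(is_match("the quick brown fox", "the quick brown fox jumps"))  # True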
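
The metrics themselves are straightforward to compute by hand. The sketch below recomputes Recall@k, Precision@k, and MRR with exact ID matching on the toy data from the usage example above; it illustrates the metric definitions rather than Krag's internal implementation, so the numbers can differ from the evaluator's output when ROUGE matching is used.

# Illustrative only: the toy queries from above, reduced to document IDs
actual_ids = [{1, 2, 3}, {4, 5, 6}, {7, 8, 9}]
predicted_ids = [
    [1, 4, 7, 2, 5, 8, 3, 6, 9],  # Query 1
    [4, 1, 7, 5, 2, 8, 6, 3, 9],  # Query 2
    [7, 2, 4, 8, 5, 3, 9, 6, 1],  # Query 3
]

def recall_at_k(actual, predicted, k):
    # Fraction of relevant documents found in the top-k, averaged over queries
    return sum(len(a & set(p[:k])) / len(a) for a, p in zip(actual, predicted)) / len(actual)

def precision_at_k(actual, predicted, k):
    # Fraction of the top-k predictions that are relevant, averaged over queries
    return sum(len(a & set(p[:k])) / k for a, p in zip(actual, predicted)) / len(actual)

def mrr(actual, predicted):
    # Reciprocal rank of the first relevant document, averaged over queries
    total = 0.0
    for a, p in zip(actual, predicted):
        rank = next((i + 1 for i, doc_id in enumerate(p) if doc_id in a), None)
        total += 1 / rank if rank else 0.0
    return total / len(actual)

print(recall_at_k(actual_ids, predicted_ids, k=3))     # 0.333... (one relevant doc in each top-3)
print(precision_at_k(actual_ids, predicted_ids, k=5))  # 0.4 (two relevant docs in each top-5)
print(mrr(actual_ids, predicted_ids))                  # 1.0 (a relevant doc is ranked first for every query)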

Using a Custom Tokenizer with the BM25 Retriever

You can also use the KiWiBM25RetrieverWithScore class with a custom Kiwi tokenizer for advanced text retrieval. Here's an example:


from krag.tokenizers import KiwiTokenizer
from krag.retrievers import KiWiBM25RetrieverWithScore

# Create a KiwiTokenizer with specific options
kiwi_tokenizer = KiwiTokenizer(model_type='knlm', typos='basic')

# Create the retriever instance with the custom tokenizer
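# Note: `langchain_docs` is assumed to be a list of LangChain Document objects
# prepared beforehand (e.g., loaded and split elsewhere); it is not defined in this snippet.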
retriever = KiWiBM25RetrieverWithScore(
    documents=langchain_docs, 
    kiwi_tokenizer=kiwi_tokenizer, 
    k=5, 
    threshold=0.0,
)

# Use the retriever
query = "K-RAG 패키지는 어떤 평가지표를 제공하나요?"

retrieved_docs = retriever.invoke(query, 2)

# Print retrieved documents with their BM25 scores
for doc in retrieved_docs:
    print(doc.metadata.get('doc_id'))
    print(doc.metadata["bm25_score"])
    print(doc.page_content)
    print("------------------------------")

License

This project is licensed under the MIT License.

Contact

If you have any questions, feel free to reach out via email.
