
Korean-optimized RAG evaluation toolkit based on ranx with Kiwi tokenizer and Korean language support


ranx-k: Korean-optimized ranx IR Evaluation Toolkit 🇰🇷


ranx-k is an information retrieval (IR) evaluation toolkit optimized for Korean. It extends the ranx library with support for the Kiwi tokenizer and Korean embeddings, so you can accurately evaluate the performance of RAG (Retrieval-Augmented Generation) systems.

🚀 Key Features

  • Korean-specific: accurate tokenization with the Kiwi morphological analyzer
  • ranx-based: supports proven IR evaluation metrics (Hit@K, NDCG@K, MRR, and others)
  • LangChain compatible: supports the standard LangChain retriever interface
  • Multiple evaluation methods: ROUGE, embedding similarity, and semantic similarity-based evaluation
  • Practical design: staged evaluation from prototype to production
  • High performance: 30-80% higher Korean evaluation accuracy than existing methods
  • Bilingual output: side-by-side English and Korean output for international accessibility

📦 Installation

pip install ranx-k

Or install the development version:

pip install "ranx-k[dev]"

🔗 Retriever Compatibility

ranx-k supports the LangChain retriever interface:

from typing import List

# Newer LangChain releases also expose this class as langchain_core.documents.Document
from langchain.schema import Document

# The retriever must implement an invoke() method
class YourRetriever:
    def invoke(self, query: str) -> List[Document]:
        # Return a list of Document objects (each needs a page_content attribute)
        raise NotImplementedError

# Example of creating a LangChain Document
doc = Document(page_content="text content")

Note: LangChain is distributed under the MIT License. See the documentation for details.
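
As an illustration of the interface described above, here is a minimal sketch of a retriever that satisfies the `invoke()` contract. `KeywordRetriever` and the `Document` dataclass are hypothetical stand-ins written for this example (the dataclass only models `page_content`, so the sketch runs without LangChain installed); the word-overlap ranking is a toy, not something ranx-k provides.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Document:
    """Stand-in for LangChain's Document; only page_content is modeled."""
    page_content: str


class KeywordRetriever:
    """Toy retriever: ranks stored texts by how many query words they contain."""

    def __init__(self, texts: List[str], k: int = 5):
        self.texts = texts
        self.k = k

    def invoke(self, query: str) -> List[Document]:
        words = set(query.split())
        # Score each text by the number of distinct query words it shares.
        scored = [(len(words & set(text.split())), text) for text in self.texts]
        # Highest overlap first; the sort is stable, so ties keep input order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        # Keep the top k, dropping texts with no overlap at all.
        return [Document(page_content=t) for score, t in scored[: self.k] if score > 0]


retriever = KeywordRetriever(
    [
        "Kiwi is a Korean morphological analyzer",
        "ranx computes IR metrics",
        "ROUGE measures n-gram overlap",
    ],
    k=2,
)
docs = retriever.invoke("Korean analyzer")
print([d.page_content for d in docs])  # ['Kiwi is a Korean morphological analyzer']
```

Any object with this shape could presumably be passed as the `retriever` argument in the examples that follow, as long as the returned items expose `page_content`.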

🔧 Quick Start

Basic Usage

from ranx_k.evaluation import simple_kiwi_rouge_evaluation

# Simple Kiwi ROUGE evaluation
results = simple_kiwi_rouge_evaluation(
    retriever=your_retriever,
    questions=your_questions,
    reference_contexts=your_reference_contexts,
    k=5
)

print(f"ROUGE-1: {results['kiwi_rouge1@5']:.3f}")
print(f"ROUGE-2: {results['kiwi_rouge2@5']:.3f}")
print(f"ROUGE-L: {results['kiwi_rougeL@5']:.3f}")
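
To make the ROUGE scores above more concrete, here is a minimal sketch of token-level ROUGE-N and of why the tokenizer matters for Korean. It is not ranx-k's implementation: `rouge_n` is a hypothetical helper, and the morpheme lists are hand-written approximations of what a morphological analyzer such as Kiwi might produce.

```python
from collections import Counter
from typing import Dict, List


def rouge_n(reference: List[str], candidate: List[str], n: int = 1) -> Dict[str, float]:
    """Precision, recall, and F1 over clipped n-gram overlap."""

    def ngrams(tokens: List[str]) -> Counter:
        return Counter(tuple(tokens[i : i + n]) for i in range(len(tokens) - n + 1))

    ref, cand = ngrams(reference), ngrams(candidate)
    # Counter intersection keeps the minimum count per n-gram (clipping).
    overlap = sum((ref & cand).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Reference "검색 성능을 평가한다" vs. candidate "검색 성능 평가".
# Whitespace tokens: particles and endings stay glued to the words.
whitespace = rouge_n(["검색", "성능을", "평가한다"], ["검색", "성능", "평가"])

# Hand-written morpheme tokens (assumed analyzer output, for illustration only).
morphemes = rouge_n(["검색", "성능", "을", "평가", "하", "ㄴ다"], ["검색", "성능", "평가"])

print(f"{whitespace['f1']:.3f}")  # 0.333
print(f"{morphemes['f1']:.3f}")   # 0.667
```

With whitespace tokens only one of three words matches, while morpheme tokens let all three content words match, which is the kind of gap a Korean-aware tokenizer is meant to close.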

Enhanced Evaluation (Rouge Score + Kiwi)

from ranx_k.evaluation import rouge_kiwi_enhanced_evaluation

# Proven rouge_score library + Kiwi tokenizer
results = rouge_kiwi_enhanced_evaluation(
    retriever=your_retriever,
    questions=your_questions,
    reference_contexts=your_reference_contexts,
    k=5,
    tokenize_method='morphs',  # 'morphs' or 'nouns'
    use_stopwords=True
)
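
The `tokenize_method` and `use_stopwords` options above suggest two token-selection strategies. The sketch below shows one plausible reading of them on hand-written (form, tag) pairs; `select_tokens`, the tagged sentence, and the stopword set are all hypothetical, and how ranx-k actually applies these options is an assumption here.

```python
from typing import FrozenSet, List, Tuple

# Hand-written (form, tag) pairs for the sentence "한국어 평가는 어렵다".
# Tags follow the Sejong-style convention in which noun tags start with "NN";
# a real pipeline would obtain these pairs from a morphological analyzer.
TAGGED: List[Tuple[str, str]] = [
    ("한국어", "NNP"),
    ("평가", "NNG"),
    ("는", "JX"),
    ("어렵", "VA"),
    ("다", "EF"),
]

# Hypothetical stopword set: a particle and an ending that carry little content.
STOPWORDS: FrozenSet[str] = frozenset({"는", "다"})


def select_tokens(
    tagged: List[Tuple[str, str]],
    method: str = "morphs",
    stopwords: FrozenSet[str] = frozenset(),
) -> List[str]:
    """Pick token forms by method, then drop any listed stopwords."""
    if method == "morphs":
        forms = [form for form, _ in tagged]  # every morpheme
    elif method == "nouns":
        forms = [form for form, tag in tagged if tag.startswith("NN")]  # nouns only
    else:
        raise ValueError(f"unknown method: {method!r}")
    return [form for form in forms if form not in stopwords]


print(select_tokens(TAGGED, "morphs"))             # ['한국어', '평가', '는', '어렵', '다']
print(select_tokens(TAGGED, "nouns"))              # ['한국어', '평가']
print(select_tokens(TAGGED, "morphs", STOPWORDS))  # ['한국어', '평가', '어렵']
```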

Semantic Similarity-based ranx Evaluation

from ranx_k.evaluation import evaluate_with_ranx_similarity

# Convert semantic similarity into ranx format
results = evaluate_with_ranx_similarity(
    retriever=your_retriever,
    questions=your_questions,
    reference_contexts=your_reference_contexts,
    k=5,
    method='kiwi_rouge',  # 'embedding' or 'kiwi_rouge'
    similarity_threshold=0.6
)

print(f"Hit@5: {results['hit_rate@5']:.3f}")
print(f"NDCG@5: {results['ndcg@5']:.3f}")
print(f"MRR: {results['mrr']:.3f}")
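
The idea behind `similarity_threshold` can be sketched for a single query: similarities at or above the threshold are treated as relevant, and the ranking metrics are computed from the resulting 0/1 list. This is an illustration under assumptions, not ranx-k's internals: the helper names are hypothetical, the `>=` comparison is a guess, and the ideal ranking for NDCG is built only from the retrieved documents.

```python
import math
from typing import List


def binarize(similarities: List[float], threshold: float) -> List[int]:
    """Mark each ranked document relevant (1) if its similarity meets the threshold."""
    return [1 if s >= threshold else 0 for s in similarities]


def hit_at_k(rels: List[int], k: int) -> float:
    """1.0 if any of the top-k documents is relevant, else 0.0."""
    return 1.0 if any(rels[:k]) else 0.0


def reciprocal_rank(rels: List[int]) -> float:
    """1 / rank of the first relevant document; MRR is its mean over queries."""
    for rank, rel in enumerate(rels, start=1):
        if rel:
            return 1.0 / rank
    return 0.0


def dcg(rels: List[int]) -> float:
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(rels, start=1))


def ndcg_at_k(rels: List[int], k: int) -> float:
    """DCG of the top k divided by the DCG of the best possible ordering."""
    best = dcg(sorted(rels, reverse=True)[:k])
    return dcg(rels[:k]) / best if best > 0 else 0.0


# Similarity of each retrieved document (in rank order) to the reference context.
similarities = [0.42, 0.71, 0.55, 0.80, 0.10]
rels = binarize(similarities, threshold=0.6)

print(rels)                          # [0, 1, 0, 1, 0]
print(hit_at_k(rels, 5))             # 1.0
print(reciprocal_rank(rels))         # 0.5
print(round(ndcg_at_k(rels, 5), 3))  # 0.651
```

Raising the threshold makes fewer documents count as relevant, so all three metrics can only stay the same or drop for a given ranking.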

Comprehensive Evaluation

from ranx_k.evaluation import comprehensive_evaluation_comparison

# Compare all evaluation methods
comparison = comprehensive_evaluation_comparison(
    retriever=your_retriever,
    questions=your_questions,
    reference_contexts=your_reference_contexts,
    k=5
)

📊 Evaluation Methods

1. Kiwi ROUGE Evaluation

  • Advantages: fast, intuitive to interpret
  • Use cases: prototyping, quick feedback

2. Enhanced ROUGE (Rouge Score + Kiwi)

  • Advantages: proven library, stability
  • Use cases: production environments, evaluations where reliability matters

3. Semantic Similarity-based ranx

  • Advantages: traditional IR metrics, semantic similarity
  • Use cases: research, benchmarking, detailed analysis

🎯 Performance Improvement Example

# Existing method (English tokenizer)
basic_rouge1 = 0.234

# ranx-k (Kiwi tokenizer)
ranxk_rouge1 = 0.421  # +79.9% improvement
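
The +79.9% figure is the relative gain of the second score over the first. A short check of the arithmetic, using only the two values given above:

```python
basic_rouge1 = 0.234
ranxk_rouge1 = 0.421

# Relative gain = (new - old) / old, expressed as a percentage.
relative_gain = (ranxk_rouge1 - basic_rouge1) / basic_rouge1 * 100

print(f"+{relative_gain:.1f}%")  # +79.9%
```

The absolute difference is 0.187 ROUGE-1 points, so the percentage describes the change in score, not a change in the share of correct answers.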

📈 Score Interpretation Guide

Score range     Rating          Recommended action
0.7 or higher   🟢 Very good    Keep current settings
0.5 to 0.7      🟡 Good         Consider fine-tuning
0.3 to 0.5      🟠 Fair         Improvement needed
Below 0.3       🔴 Low          Re-examine the system
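
The guide above can be applied programmatically. The helper below is a hypothetical sketch, not part of ranx-k; it assumes each band includes its lower bound (so exactly 0.5 counts as "Good") and that scores lie in the range 0 to 1.

```python
from typing import List, Tuple

# (lower bound, rating, recommended action), checked from highest to lowest.
BANDS: List[Tuple[float, str, str]] = [
    (0.7, "Very good", "Keep current settings"),
    (0.5, "Good", "Consider fine-tuning"),
    (0.3, "Fair", "Improvement needed"),
]


def interpret_score(score: float) -> Tuple[str, str]:
    """Map a score in [0, 1] to its rating and recommended action."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    for lower, rating, action in BANDS:
        if score >= lower:
            return rating, action
    return "Low", "Re-examine the system"


print(interpret_score(0.421))  # ('Fair', 'Improvement needed')
```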

📚 Documentation

See the GitHub documentation for detailed usage and examples.

🤝 Contributing

ranx-k is an open-source project. Contributions are welcome!

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

This project is distributed under the MIT License. See the LICENSE file for details.

License and Copyright

This project builds on the following open-source libraries:

  • rouge_score: Copyright (c) 2022 The rouge_score Authors (Apache License 2.0)
  • ranx: Copyright (c) 2021 Elias Bassani (MIT License)
  • kiwipiepy: Copyright (c) 2021 bab2min (LGPL v3.0)
  • Modifications and extensions: Copyright (c) 2025 Pandas Studio (MIT License)

🙏 Acknowledgments

  • ranx: thanks to Elias Bassani for providing an excellent IR evaluation library
  • Kiwi: thanks to bab2min for providing an excellent Korean morphological analyzer
  • rouge_score: the ROUGE implementation by the Google Research team

📞 Support


Experience more accurate Korean IR evaluation with ranx-k! 🚀🇰🇷
