了解 — unified semantic MT evaluation: MEANT, XMEANT, YiSi, WOLVESAAR, and SimAlign-style word alignment over modern multilingual embeddings.

ryokai 了解

Ryokai (了解, "understood / got it") is a unified Python library for semantic machine-translation evaluation. It combines MEANT 2.0, XMEANT, YiSi-1/2, WOLVESAAR, and SimAlign behind a single API on top of modern multilingual embeddings.

Pure PyTorch + HuggingFace transformers: no Stanza, no spaCy, no external parsers. Two HF models cover all 13 supported languages (en, de, fr, es, cs, fi, hi, lv, pl, ro, ru, tr, zh) in a single install.

Both are one-line swappable for any modern multilingual encoder (Qwen3-Embedding, Jina v3, BGE-M3, Nemotron-8B…) — see Embedding backbones in DOCUMENTATION.md.

Install

pip install ryokai

Quickstart

from ryokai import Ryokai

scorer = Ryokai()
src_lang, tgt_lang = "en", "ja"

# Illustrative inputs; replace with your own source text and MT output
src = "The cat sat on the mat."
hyp = "猫はマットの上に座った。"

# Most common: reference-free, word alignment + embedding
# (XMEANT-lite / YiSi-2 / Doc-embedding adequacy, cross-lingual)
scorer.score(source=src, hypothesis=hyp,
             source_lang=src_lang, target_lang=tgt_lang)
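
For readers unfamiliar with "word alignment + embedding", the following is a minimal sketch of the mutual-argmax idea used by SimAlign-style aligners: a source token and a target token are linked when each is the other's nearest neighbour by cosine similarity. It is not ryokai's implementation. The function names (`cosine`, `mutual_argmax_align`) are hypothetical, and the 2-D vectors are hand-made stand-ins for real encoder outputs.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def mutual_argmax_align(src_vecs, tgt_vecs):
    """Return (i, j) pairs where j is i's best target and i is j's best source."""
    sim = [[cosine(s, t) for t in tgt_vecs] for s in src_vecs]
    n_src, n_tgt = len(src_vecs), len(tgt_vecs)
    # Best target index for every source token (row-wise argmax).
    best_tgt = [max(range(n_tgt), key=lambda j: row[j]) for row in sim]
    # Best source index for every target token (column-wise argmax).
    best_src = [max(range(n_src), key=lambda i: sim[i][j]) for j in range(n_tgt)]
    # Keep only the links that both directions agree on.
    return [(i, j) for i, j in enumerate(best_tgt) if best_src[j] == i]

# Toy data: two source tokens, three target tokens (the third matches neither well).
src_vecs = [[1.0, 0.1], [0.1, 1.0]]
tgt_vecs = [[0.9, 0.2], [0.2, 0.8], [0.7, 0.7]]
alignment = mutual_argmax_align(src_vecs, tgt_vecs)
print(alignment)
```

On this toy input the third target token stays unaligned, because neither source token picks it as its best match. Metrics in this family then score a sentence pair from the similarities of the aligned tokens.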

Variants

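A single .score() call covers four modes, selected by the arguments you pass: source= gives reference-free scoring, reference= gives reference-based scoring, and srl=True switches from word alignment + embedding to frame-based scoring. srl=False is the default; ryokai is no longer MEANT-first.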

from ryokai import Ryokai
scorer = Ryokai()
src_lang, tgt_lang = "en", "ja"

# Illustrative inputs; replace with your own source, MT output, and reference
src = "The cat sat on the mat."
hyp = "猫はマットの上に座った。"
ref = "猫がマットの上に座っていた。"

# Reference-free, word alignment + embedding (default, most common)
# E.g. Doc-embedding adequacy / YiSi-2 / XMEANT-lite
scorer.score(source=src, hypothesis=hyp,
             source_lang=src_lang, target_lang=tgt_lang)

# Reference-based, word alignment + embedding
# E.g. Doc-embedding adequacy / WOLVESAAR / YiSi-1 / SimAlign style
scorer.score(reference=ref, hypothesis=hyp, target_lang=tgt_lang)

# Reference-free, frame-based: XMEANT proper
scorer.score(source=src, hypothesis=hyp,
             source_lang=src_lang, target_lang=tgt_lang, srl=True)

# Reference-based, frame-based: MEANT 2.0
scorer.score(reference=ref, hypothesis=hyp, target_lang=tgt_lang, srl=True)

See DOCUMENTATION.md for flags, aligner choices, embedding-backbone swaps, AER evaluation harness, CLI, architecture, and custom role weights.
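
The argument-based dispatch described under Variants can be pictured with a small stand-alone sketch. `pick_mode` is a hypothetical helper, not part of ryokai. It assumes exactly one of `source` or `reference` is supplied, which is an assumption of this sketch; how ryokai handles both or neither is not stated here.

```python
from typing import Optional

def pick_mode(source: Optional[str] = None,
              reference: Optional[str] = None,
              srl: bool = False) -> str:
    """Hypothetical helper: name the scoring mode implied by the arguments."""
    # Assumption of this sketch: exactly one of source/reference must be given.
    if (source is None) == (reference is None):
        raise ValueError("pass exactly one of source= or reference=")
    basis = "reference-free" if source is not None else "reference-based"
    method = "frame-based" if srl else "word alignment + embedding"
    return f"{basis}, {method}"

# The four combinations shown in the Variants examples.
modes = [
    pick_mode(source="src text"),
    pick_mode(reference="ref text"),
    pick_mode(source="src text", srl=True),
    pick_mode(reference="ref text", srl=True),
]
for mode in modes:
    print(mode)
```

Read against the Variants examples, the four printed labels correspond to the XMEANT-lite / YiSi-2 style, the WOLVESAAR / YiSi-1 style, XMEANT proper, and MEANT 2.0, in that order.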

References

Ryokai is glue around several published techniques — credit belongs to their authors.

| Technique | Year | Citation | Category |
| --- | --- | --- | --- |
| MEANT | 2011 | Lo & Wu. MEANT: An inexpensive, high-accuracy, semi-automatic metric for evaluating translation utility based on semantic roles. ACL 2011. | Semantic-frame MT evaluation |
| XMEANT | 2014 | Lo, Beloucif, Saers & Wu. XMEANT: Better semantic MT evaluation without reference translations. ACL 2014 (Short Papers). | Semantic-frame MT evaluation |
| MEANT 2.0 | 2017 | Lo. MEANT 2.0: Accurate semantic MT evaluation for any output language. WMT 2017. | Semantic-frame MT evaluation |
| Doc-embedding adequacy | 2015 | Vela & Tan. Predicting Machine Translation Adequacy with Document Embeddings. WMT 2015. | Embedding-based MT evaluation |
| WOLVESAAR | 2016 | Bechara, Gupta, Tan, Orăsan, Mitkov & van Genabith. WOLVESAAR at SemEval-2016 Task 1: Replicating the Success of Monolingual Word Alignment and Neural Embeddings for Semantic Textual Similarity. SemEval-2016. | Embedding-based MT evaluation |
| YiSi | 2019 | Lo. YiSi - a Unified Semantic MT Quality Evaluation and Estimation Metric for Languages with Different Levels of Available Resources. WMT 2019. | Embedding-based MT evaluation |
| Monolingual aligner | 2014 | Sultan, Bethard & Sumner. Back to Basics for Monolingual Alignment: Exploiting Word Similarity and Contextual Evidence. TACL 2014. | Word alignment |
| SimAlign | 2020 | Jalili Sabet, Dufter, Yvon & Schütze. SimAlign: High Quality Word Alignments without Parallel Training Data using Static and Contextualized Embeddings. Findings of EMNLP 2020. | Word alignment |

License

MIT — see LICENSE.
