rhoknp: Yet another Python binding for Juman++/KNP/KWJA
rhoknp is a Python binding for Juman++, KNP, and KWJA.
import rhoknp

# Perform language analysis by Juman++
jumanpp = rhoknp.Jumanpp()
sentence = jumanpp.apply_to_sentence("電気抵抗率は電気の通しにくさを表す物性値である。")

# Save language analysis by Juman++
with open("result.jumanpp", "wt") as f:
    f.write(sentence.to_jumanpp())

# Load language analysis by Juman++
with open("result.jumanpp", "rt") as f:
    sentence = rhoknp.Sentence.from_jumanpp(f.read())

# Perform language analysis by KNP
knp = rhoknp.KNP()
sentence = knp.apply_to_sentence(sentence)  # or knp.apply_to_sentence("電気抵抗率は...")

# Save language analysis by KNP
with open("result.knp", "wt") as f:
    f.write(sentence.to_knp())

# Load language analysis by KNP
with open("result.knp", "rt") as f:
    sentence = rhoknp.Sentence.from_knp(f.read())

# Perform language analysis by KWJA
kwja = rhoknp.KWJA()
sentence = kwja.apply_to_sentence(sentence)  # or kwja.apply_to_sentence("電気抵抗率は...")
Requirements
- Python 3.8+
Optional requirements for language analysis
Installation
pip install rhoknp
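Before running any analysis, it can be useful to confirm that the prerequisites listed above are in place. The following is a minimal sketch that uses only the standard library. `check_environment` is a hypothetical helper, not part of rhoknp, and it assumes the analyzers are installed under the executable names `jumanpp`, `knp`, and `kwja`.

```python
import importlib.util
import shutil
import sys


def check_environment(executables=("jumanpp", "knp", "kwja")):
    """Report which prerequisites appear to be available.

    Hypothetical helper, not part of rhoknp. The executable names are an
    assumption about how the analyzers are installed on this machine.
    """
    report = {
        # The Requirements section asks for Python 3.8 or later.
        "python>=3.8": sys.version_info >= (3, 8),
        # find_spec returns None when the package cannot be found.
        "rhoknp": importlib.util.find_spec("rhoknp") is not None,
    }
    for name in executables:
        # shutil.which returns None when the executable is not on PATH.
        report[name] = shutil.which(name) is not None
    return report


if __name__ == "__main__":
    for item, available in check_environment().items():
        print(f"{item}: {'ok' if available else 'missing'}")
```

Only the analyzers you actually plan to call need to be present, so a "missing" entry for an unused analyzer is not necessarily a problem.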
Documentation
https://rhoknp.readthedocs.io/en/latest/
Quick tour
rhoknp provides APIs to perform language analysis by Juman++ and KNP.
# Perform language analysis by Juman++
jumanpp = rhoknp.Jumanpp()
sentence = jumanpp.apply_to_sentence("電気抵抗率は電気の通しにくさを表す物性値である。")
# Perform language analysis by KNP
knp = rhoknp.KNP()
sentence = knp.apply_to_sentence(sentence) # or knp.apply_to_sentence("電気抵抗率は...")
Sentence objects can be saved in the Juman/KNP format
# Save language analysis by Juman++
with open("result.jumanpp", "wt") as f:
    f.write(sentence.to_jumanpp())

# Save language analysis by KNP
with open("result.knp", "wt") as f:
    f.write(sentence.to_knp())
and recovered from Juman/KNP-format text.
# Load language analysis by Juman++
with open("result.jumanpp", "rt") as f:
    sentence = rhoknp.Sentence.from_jumanpp(f.read())

# Load language analysis by KNP
with open("result.knp", "rt") as f:
    sentence = rhoknp.Sentence.from_knp(f.read())
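The `open(...)` calls above rely on the platform's default text encoding, which is not UTF-8 on every system. Because the analysis results contain Japanese text, passing the encoding explicitly is a safer choice. The sketch below shows one way to do that with `pathlib`. `save_text` and `load_text` are hypothetical helpers, not part of rhoknp, and the string used here is only a placeholder for the output of `sentence.to_jumanpp()` or `sentence.to_knp()`.

```python
import tempfile
from pathlib import Path


def save_text(path, text):
    """Write analysis output as UTF-8, regardless of the platform default."""
    Path(path).write_text(text, encoding="utf-8")


def load_text(path):
    """Read analysis output back as UTF-8."""
    return Path(path).read_text(encoding="utf-8")


if __name__ == "__main__":
    # Placeholder standing in for the string returned by sentence.to_jumanpp().
    serialized = "電気抵抗率は電気の通しにくさを表す物性値である。\n"
    with tempfile.TemporaryDirectory() as directory:
        path = Path(directory) / "result.jumanpp"
        save_text(path, serialized)
        # The text read back should match what was written.
        assert load_text(path) == serialized
```

With rhoknp, the same helpers would be used as `save_text("result.knp", sentence.to_knp())` and `rhoknp.Sentence.from_knp(load_text("result.knp"))`.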
It is easy to access the linguistic units that make up a sentence.
for clause in sentence.clauses:
    ...
for phrase in sentence.phrases:  # a.k.a. bunsetsu
    ...
for base_phrase in sentence.base_phrases:  # a.k.a. kihon-ku
    ...
for morpheme in sentence.morphemes:
    ...
rhoknp also provides APIs for document-level language analysis.
document = rhoknp.Document.from_raw_text(
"電気抵抗率は電気の通しにくさを表す物性値である。単に抵抗率とも呼ばれる。"
)
# If you know sentence boundaries, you can use `Document.from_sentences` instead.
document = rhoknp.Document.from_sentences(
[
"電気抵抗率は電気の通しにくさを表す物性値である。",
"単に抵抗率とも呼ばれる。",
]
)
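If the sentence boundaries are not known in advance but the text is simple, a naive splitter can produce the list expected by `Document.from_sentences`. The sketch below splits after runs of `。`, `！`, and `？` using only the standard library. `split_sentences` is a hypothetical helper, not part of rhoknp, and it will split incorrectly inside quotations such as `「...。」`, so treat it as an illustration rather than a replacement for proper sentence segmentation.

```python
import re

# A sentence is a run of non-terminator characters followed by any
# consecutive terminators (so "！？" stays attached to its sentence).
_SENTENCE = re.compile(r"[^。！？]+[。！？]*")


def split_sentences(text):
    """Naively split Japanese text into sentences at 。, ！, and ？."""
    candidates = (match.strip() for match in _SENTENCE.findall(text))
    # Drop fragments that were only whitespace.
    return [sentence for sentence in candidates if sentence]


if __name__ == "__main__":
    text = "電気抵抗率は電気の通しにくさを表す物性値である。単に抵抗率とも呼ばれる。"
    for sentence in split_sentences(text):
        print(sentence)
```

The resulting list can then be passed to `rhoknp.Document.from_sentences` as shown above.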
Document objects can be handled in almost the same way as Sentence objects.
# Perform language analysis by Juman++/KNP
document = jumanpp.apply_to_document(document)
document = knp.apply_to_document(document)
# Save language analysis by Juman++/KNP
with open("result.jumanpp", "wt") as f:
    f.write(document.to_jumanpp())
with open("result.knp", "wt") as f:
    f.write(document.to_knp())

# Load language analysis by Juman++/KNP
with open("result.jumanpp", "rt") as f:
    document = rhoknp.Document.from_jumanpp(f.read())
with open("result.knp", "rt") as f:
    document = rhoknp.Document.from_knp(f.read())

# Access language units in the document
for sentence in document.sentences:
    ...
for clause in document.clauses:
    ...
for phrase in document.phrases:
    ...
for base_phrase in document.base_phrases:
    ...
for morpheme in document.morphemes:
    ...
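As a small illustration of working with these units, the sketch below counts how many units of each kind an object exposes. `count_units` is a hypothetical helper, not part of rhoknp. It relies only on the attribute names used in the loops above, and it assumes the document has been analyzed far enough that all five attributes are available. A `SimpleNamespace` stands in for a real `Document` so that the example runs without any analyzer installed.

```python
from types import SimpleNamespace

# Attribute names taken from the loops above, from coarsest to finest unit.
UNIT_ATTRIBUTES = ("sentences", "clauses", "phrases", "base_phrases", "morphemes")


def count_units(document):
    """Return the number of units of each kind exposed by `document`."""
    return {name: len(list(getattr(document, name))) for name in UNIT_ATTRIBUTES}


if __name__ == "__main__":
    # Stand-in object with made-up contents, used only to exercise the helper.
    stand_in = SimpleNamespace(
        sentences=list(range(2)),
        clauses=list(range(3)),
        phrases=list(range(5)),
        base_phrases=list(range(6)),
        morphemes=list(range(12)),
    )
    for name, count in count_units(stand_in).items():
        print(f"{name}: {count}")
```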
For more information, explore the examples and documentation.
Main differences from pyknp
- Support for document-level language analysis: rhoknp can load and instantiate the result of document-level language analysis (i.e., cohesion analysis and discourse relation analysis).
- Strictly type-aware: rhoknp is thoroughly annotated with type annotations.
- Extensive test suite: rhoknp is tested with an extensive test suite. See the code coverage at Codecov.
License
MIT
Contributing
We welcome contributions to rhoknp. You can get started by reading the contribution guide.