Simple tool to translate from Roma-ji into Hiragana.
pyokaka
A simple tool for converting Roma-ji sentences into Hiragana.
Origin of the package name
An homage to pykakasi, which provides rich functionality for converting Kana and Kanji into Roma-ji. Okaka (おかか) is a simple Japanese word that means bonito flakes.
Demo
As a command-line tool
Run the module from a terminal to use it as a REPL. To quit, send EOF.
$ python -m pyokaka.okaka
Roman >>> ohayougozaimasu
JKana ... おはようございます
Roman >>> kon'nichiwa
JKana ... こんにちわ
Roman >>> oyasuminasai
JKana ... おやすみなさい
You can also pass the name of a file you want to convert.
$ cat sample.txt
Ima wa mukashi, taketori no okina to iu mono ari keri.
$ python -m pyokaka.okaka sample.txt
いま わ むかし, たけとり の おきな と いう もの あり けり.
To apply additional rules, load a UTF-8 encoded JSON file (sample.json in this example).
{
"ら": ["la"], "り": ["li"], "る": ["lu"], "れ": ["le"], "ろ": ["lo"],
"ふぁ": ["pha", "hua"], "ふぃ": ["phi"]
}
$ cat sample.txt
elephant
lalallalalla
$ python -m pyokaka.okaka sample.txt
えlえpはんt
lあlあllあlあllあ
$ python -m pyokaka.okaka sample.txt --load sample.json
load for sample.json...
えれふぁんt
ららっららっら
For more information, run python -m pyokaka.okaka --help.
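The rule file shown above maps each Hiragana string to a list of Roma-ji spellings. As a minimal sketch of what that shape implies, the snippet below parses the same JSON and inverts it into a spelling-to-Hiragana lookup. The helper name invert_rules is hypothetical, and how pyokaka stores its rules internally is not stated here, so treat this only as an illustration of the file format.

```python
import json

# Same content as the sample.json shown above.
RULES_JSON = """
{
  "ら": ["la"], "り": ["li"], "る": ["lu"], "れ": ["le"], "ろ": ["lo"],
  "ふぁ": ["pha", "hua"], "ふぃ": ["phi"]
}
"""


def invert_rules(rules):
    """Turn {hiragana: [spelling, ...]} into {spelling: hiragana}.

    Hypothetical helper for illustration; not part of pyokaka.
    """
    lookup = {}
    for kana, spellings in rules.items():
        for spelling in spellings:
            lookup[spelling] = kana
    return lookup


rules = json.loads(RULES_JSON)
lookup = invert_rules(rules)

for spelling in sorted(lookup):
    print(spelling, "->", lookup[spelling])
```

One Hiragana string may list several spellings, which is why "pha" and "hua" both end up pointing at ふぁ.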
As a library
>>> from pyokaka import okaka
>>> okaka.convert('katsuobushi')
'かつおぶし'
You can add more vocabulary as shown below.
>>> okaka.convert('philipps')
'pひlいpps'
>>>
>>> okaka.update_convert_dct({
... 'p': 'ぷ', 's': 'す'
... })
>>>
>>> okaka.convert('philips')
'ぷひlいぷす'
>>>
>>> import json
>>> with open('sample.json', encoding='utf-8') as fin:
...     table = json.load(fin)
...
>>> okaka.update_transtable(table)
>>> okaka.convert('philips')
'ふぃりぷす'
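The session above reads sample.json with an explicit UTF-8 encoding. The following sketch shows one way such a file could be produced from Python using only the standard library. The helper names write_rules and read_rules and the reduced rule set are assumptions made for illustration; they are not part of pyokaka.

```python
import json
import os
import tempfile

# Hypothetical extra rules: Hiragana -> list of Roma-ji spellings.
extra_rules = {"ら": ["la"], "ふぁ": ["pha", "hua"]}


def write_rules(path, rules):
    """Write the rules as UTF-8 JSON, keeping the kana human-readable."""
    with open(path, "w", encoding="utf-8") as fout:
        # ensure_ascii=False stores 'ら' itself rather than a \uXXXX escape.
        json.dump(rules, fout, ensure_ascii=False, indent=2)


def read_rules(path):
    """Read the rules back the same way the session above does."""
    with open(path, encoding="utf-8") as fin:
        return json.load(fin)


with tempfile.TemporaryDirectory() as tmpdir:
    rules_path = os.path.join(tmpdir, "sample.json")
    write_rules(rules_path, extra_rules)
    with open(rules_path, encoding="utf-8") as fin:
        raw_text = fin.read()
    loaded = read_rules(rules_path)

print(raw_text)
```

The dictionary returned by read_rules has the same shape as the table passed to okaka.update_transtable in the session above.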
Notes
- You cannot reset the conversion table without restarting the program.
- The converter leaves any letter that cannot be interpreted as part of Roma-ji unchanged, but the remaining letters are always converted.
  $ python -m pyokaka.okaka
  Roman >>> Oh dear, this is English!
  JKana ... おh であr, tひs いs えんglいsh!
- A hyphen is always replaced with a Cho'onpu (ー, the long-vowel mark).
  $ python -m pyokaka.okaka
  Roman >>> Roma-ji
  JKana ... ろまーじ
- The converter never analyzes sentence structure, so it cannot recognize 'wa', 'o' and 'e' as postpositional particles.
  $ python -m pyokaka.okaka
  Roman >>> Watashi wa depa-to e enpitsu o kai ni ikimashita.
  JKana ... わたし わ でぱーと え えんぴつ お かい に いきました.
- Conversion is based on a greedy algorithm. A single quote can be used as a separator if needed.
  Roman >>> honya
  JKana ... ほにゃ
  Roman >>> honnya
  JKana ... ほっにゃ
  Roman >>> honnnya
  JKana ... ほんにゃ
  Roman >>> hon'ya
  JKana ... ほんや
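The behaviour described in these notes can be sketched with a small longest-match-first converter. This is not pyokaka's actual implementation: the tiny TABLE, the function name greedy_convert, and the assumption that "greedy" means trying the longest candidate first are all illustrative choices, picked so that the sketch reproduces the examples above.

```python
# Hypothetical mini table (Roma-ji -> Hiragana); not pyokaka's real table.
TABLE = {
    "ho": "ほ", "ya": "や", "nya": "にゃ",
    "n": "ん", "nn": "ん", "n'": "ん",  # single quote closes a syllabic n
    "nnya": "っにゃ",                    # doubled consonant -> small tsu
    "ro": "ろ", "ma": "ま", "ji": "じ",
    "-": "ー",                           # hyphen -> Cho'onpu
}
MAX_LEN = max(len(key) for key in TABLE)


def greedy_convert(text):
    """Convert text by always taking the longest spelling found in TABLE."""
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then progressively shorter ones.
        for size in range(min(MAX_LEN, len(text) - i), 0, -1):
            chunk = text[i:i + size].lower()
            if chunk in TABLE:
                out.append(TABLE[chunk])
                i += size
                break
        else:
            # No rule matched here: keep the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


for word in ["honya", "honnya", "honnnya", "hon'ya", "Roma-ji"]:
    print(word, "->", greedy_convert(word))
```

Because the longest candidate wins, "nya" is consumed as one unit in honya, while the single quote in hon'ya ends the ん early so that "ya" is converted separately.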
Install
This module is registered on PyPI as pyokaka.
$ pip install pyokaka
License
Author