pyokaka

A simple tool for converting Roma-ji sentences into Hiragana.

Origin of the package name

An homage to pykakasi, which provides sophisticated conversion from Kana and Kanji into Roma-ji. Okaka (おかか) is a simple Japanese word that means bonito flakes.

Demo

As a command-line tool
Run the module from a terminal to start a REPL. To quit, send EOF.

$ python -m pyokaka.okaka

Roman >>> ohayougozaimasu
JKana ... おはようございます
Roman >>> kon'nichiwa
JKana ... こんにちわ
Roman >>> oyasuminasai
JKana ... おやすみなさい

You can also specify a file to convert.

$ cat sample.txt
Ima wa mukashi, taketori no okina to iu mono ari keri.

$ python -m pyokaka.okaka sample.txt
いま わ むかし, たけとり の おきな と いう もの あり けり.

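To apply additional rules, load a UTF-8 encoded JSON file. Each key is a kana string and each value is a list of Roma-ji spellings that should map to it.

$ cat sample.json
{
    "ら": ["la"], "り": ["li"], "る": ["lu"], "れ": ["le"], "ろ": ["lo"],
    "ふぁ": ["pha", "hua"], "ふぃ": ["phi"]
}
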
$ cat sample.txt
elephant
lalallalalla

$ python -m pyokaka.okaka sample.txt
えlえpはんt
lあlあllあlあllあ

$ python -m pyokaka.okaka sample.txt --load sample.json
load for sample.json...
えれふぁんt
ららっららっら

For more information, run python -m pyokaka.okaka --help.
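
The rule file above maps each kana to a list of spellings, while conversion needs the opposite direction. The following is a minimal sketch of how such a file could be turned into a Roma-ji to kana lookup. It is not pyokaka's actual loader: `invert_rules` is a hypothetical helper, and the doubled-consonant handling is only an assumption inferred from the sample output above, where `lla` becomes っら once the rules are loaded.

```python
import json

# Same content as sample.json above, inlined so the sketch is self-contained.
RULES_JSON = """
{
    "ら": ["la"], "り": ["li"], "る": ["lu"], "れ": ["le"], "ろ": ["lo"],
    "ふぁ": ["pha", "hua"], "ふぃ": ["phi"]
}
"""

VOWELS = set("aiueo")


def invert_rules(rules):
    """Turn {kana: [spelling, ...]} into {spelling: kana} (hypothetical helper)."""
    lookup = {}
    for kana, spellings in rules.items():
        for romaji in spellings:
            romaji = romaji.lower()
            lookup[romaji] = kana
            # Assumption: a doubled leading consonant yields a small tsu,
            # e.g. "lla" -> "っら", as the sample output suggests.
            if romaji[0] not in VOWELS:
                lookup[romaji[0] + romaji] = "っ" + kana
    return lookup


lookup = invert_rules(json.loads(RULES_JSON))
print(lookup["pha"], lookup["hua"], lookup["lla"])
```

Because several spellings may point at the same kana, the list-valued format keeps the file compact, and inverting it once at load time makes each lookup during conversion a plain dictionary access.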

As a library

>>> from pyokaka import okaka
>>> okaka.convert('katsuobushi')
'かつおぶし'

You can add more vocabulary as shown below.

>>> okaka.convert('philipps')
'pひlいpps'
>>>
>>> okaka.update_convert_dct({
...     'p': 'ぷ', 's': 'す'
... })
>>>
>>> okaka.convert('philips')
'ぷひlいぷす'
>>>
>>> import json
>>> with open('sample.json', encoding='utf-8') as fin:
...     table = json.load(fin)
...
>>> okaka.update_transtable(table)
>>> okaka.convert('philips')
'ふぃりぷす'

Notes

  • You cannot reset the conversion table without restarting.

  • The converter leaves any letter that cannot be interpreted as part of Roma-ji unchanged, but all remaining letters are always converted.

    $ python -m pyokaka.okaka
    Roman >>> Oh dear, this is English!
    JKana ... おh であr, tひs いs えんglいsh!
    
  • A hyphen is always replaced with a Cho'onpu (ー, the long vowel mark).

    $ python -m pyokaka.okaka
    Roman >>> Roma-ji
    JKana ... ろまーじ
    
  • The converter never analyzes sentence structure, so it cannot recognize 'wa', 'o' and 'e' as postpositional particles (which are written は, を and へ).

    $ python -m pyokaka.okaka
    Roman >>> Watashi wa depa-to e enpitsu o kai ni ikimashita.
    JKana ... わたし わ でぱーと え えんぴつ お かい に いきました.
    
  • Conversion is based on a greedy algorithm. A single quote can be used as a separator if needed.

    Roman >>> honya
    JKana ... ほにゃ
    Roman >>> honnya
    JKana ... ほっにゃ
    Roman >>> honnnya
    JKana ... ほんにゃ
    
    Roman >>> hon'ya
    JKana ... ほんや
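
The behaviour described in these notes can be illustrated with a small longest-match converter. This is only a sketch under stated assumptions, not pyokaka's implementation: `TABLE` is a tiny hypothetical table covering just the examples above, and `greedy_convert` is an illustrative name. At each position it tries the longest candidate first, drops single quotes as separators, and passes unmatched characters through unchanged.

```python
# Tiny illustrative Roma-ji -> kana table; a real table would be far larger.
TABLE = {
    "a": "あ", "i": "い", "e": "え", "o": "お",
    "ha": "は", "hi": "ひ", "ho": "ほ",
    "ma": "ま", "ya": "や", "ro": "ろ", "ji": "じ",
    "nya": "にゃ", "nnya": "っにゃ",
    "n": "ん", "nn": "ん",
    "-": "ー",
}
MAX_KEY = max(map(len, TABLE))


def greedy_convert(text):
    """Convert text by always consuming the longest matching table key."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "'":
            # A single quote only separates syllables; it emits nothing.
            i += 1
            continue
        # Try the longest possible chunk first, then shorter ones.
        for size in range(min(MAX_KEY, len(text) - i), 0, -1):
            chunk = text[i:i + size].lower()
            if chunk in TABLE:
                out.append(TABLE[chunk])
                i += size
                break
        else:
            # Nothing matched: keep the character as it is.
            out.append(text[i])
            i += 1
    return "".join(out)


for word in ["honya", "honnya", "honnnya", "hon'ya", "Roma-ji", "Oh"]:
    print(word, "->", greedy_convert(word))
```

With this table, `honya` matches `ho` and then `nya`, whereas `hon'ya` matches `ho`, falls back to the single `n`, skips the quote, and then matches `ya`. That is why the quote changes the result.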
    

Install

This module is registered on PyPI as pyokaka.

$ pip install pyokaka

License

MIT

Author

LouiS0616
