yakinori is a tool for converting Kanji to hiragana, katakana, and roma-ji.
yakinori
The Japanese README is here. TODO: paste the URL of the published README_jp.md
A Japanese converter from Kanji to hiragana, katakana, and the Latin alphabet.
You can get the reading and pronunciation of Japanese sentences based on mecab-unidic-NEologd.
Test Environments
Ubuntu 18.04
python==3.8.16
Install
There are two options to install.
- Install MeCab and mecab-unidic-NEologd in your own environment
- Use Docker
Your Own Environment
For Ubuntu
Install MeCab
$ sudo apt update
$ sudo apt install mecab libmecab-dev mecab-ipadic-utf8
Install mecab-unidic-NEologd
$ git clone --depth 1 https://github.com/neologd/mecab-unidic-neologd.git
$ cd mecab-unidic-neologd
$ sudo ./bin/install-mecab-unidic-neologd -n -y
# show installed mecab-unidic-NEologd dictionary path
$ echo `mecab-config --dicdir`"/mecab-unidic-neologd"
> /usr/local/lib/mecab/dic/mecab-unidic-neologd
# To make mecab-unidic-NEologd the default dictionary, run the commands below.
$ echo "dicdir = `mecab-config --dicdir`/mecab-unidic-neologd" | sudo tee /etc/mecabrc
$ sudo cp /etc/mecabrc /usr/local/etc
Install yakinori
$ pip install git+ssh://git@github.com/morikatron/yakinori.git
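# $ pip install yakinori # TODO: switch to this command once the package is published on PyPI
To update mecab-unidic-NEologd to the latest version, run the following from the cloned mecab-unidic-neologd directory: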
$ sudo ./bin/install-mecab-unidic-neologd -n -y
$ echo "dicdir = `mecab-config --dicdir`/mecab-unidic-neologd" | sudo tee /etc/mecabrc
$ sudo cp /etc/mecabrc /usr/local/etc
Use Docker
$ docker image build --network host -t yakinori .
$ docker run -it --name yakinori yakinori /bin/bash
# TODO: switch to the published image once it is available on Docker Hub
How to use
Import
>>> from yakinori import Yakinori
Create Instance
Installed in Your Own Environment
- If you made mecab-unidic-NEologd the default dictionary, you don't need to pass dic_path.
>>> yakinori = Yakinori()
- If you did not make mecab-unidic-NEologd the default dictionary, pass dic_path.
>>> yakinori = Yakinori(dic_path='path/to/mecab-unidic-NEologd')
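If you are not sure whether the dictionary was made the default, the path can be looked up at run time with the same `mecab-config --dicdir` command used during installation. The following is only a sketch: `neologd_dic_path`, `find_neologd_dic_path`, and `create_yakinori` are hypothetical helper names, not part of yakinori, and the sketch assumes the dictionary lives in a `mecab-unidic-neologd` directory under MeCab's dictionary directory, as in the install steps.

```python
import os
import shutil
import subprocess


def neologd_dic_path(dicdir):
    """Append the NEologd directory name to MeCab's dictionary directory."""
    return dicdir.rstrip("/") + "/mecab-unidic-neologd"


def find_neologd_dic_path():
    """Return the NEologd dictionary path, or None if it cannot be located."""
    if shutil.which("mecab-config") is None:
        return None  # mecab-config is not on PATH
    result = subprocess.run(
        ["mecab-config", "--dicdir"],
        capture_output=True,
        text=True,
        check=True,
    )
    path = neologd_dic_path(result.stdout.strip())
    return path if os.path.isdir(path) else None


def create_yakinori():
    """Create a Yakinori instance, passing dic_path only when one was found."""
    # Imported lazily so the helpers above can be used without yakinori installed.
    from yakinori import Yakinori

    dic_path = find_neologd_dic_path()
    return Yakinori(dic_path=dic_path) if dic_path else Yakinori()
```

Calling `create_yakinori()` falls back to `Yakinori()` when no dictionary directory is found, which matches the default-dictionary case above.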
Using Docker
If you use Docker, you don't need to add dic_path.
>>> yakinori = Yakinori()
Parse Sentence
>>> sentence = "幽☆遊☆白書は最高の漫画です"
>>> parsed_list = yakinori.get_parsed_list(sentence)
Get Reading
# convert to hiragana
>>> hiragana_sentence = yakinori.get_hiragana_sentence(parsed_list)
>>> print(hiragana_sentence)
ゆうゆうはくしょはさいこうのまんがです
# convert to katakana
>>> katakana_sentence = yakinori.get_katakana_sentence(parsed_list)
>>> print(katakana_sentence)
ユウユウハクショハサイコウノマンガデス
# convert to Latin alphabet
>>> roma_sentence = yakinori.get_roma_sentence(parsed_list)
>>> print(roma_sentence)
yuuyuuhakushohasaikounomangadesu
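The hiragana and katakana results above differ only in script. As an illustration of that relationship (not a description of how yakinori works internally), the sketch below converts katakana to hiragana with the standard library alone, using the fixed offset of 0x60 between the two Unicode blocks. The function name `katakana_to_hiragana` is hypothetical.

```python
def katakana_to_hiragana(text):
    """Shift katakana in U+30A1..U+30F6 down by 0x60; leave other characters as-is."""
    return "".join(
        chr(ord(ch) - 0x60) if "\u30a1" <= ch <= "\u30f6" else ch
        for ch in text
    )


print(katakana_to_hiragana("ユウユウハクショハサイコウノマンガデス"))
# ゆうゆうはくしょはさいこうのまんがです
```

Characters outside that range, such as the long-vowel mark ー, pass through unchanged.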
Get Pronunciation
# convert to hiragana
>>> hiragana_sentence = yakinori.get_hiragana_sentence(parsed_list, is_hatsuon=True)
>>> print(hiragana_sentence)
ゆーゆーはくしょわさいこーのまんがです
# convert to katakana
>>> katakana_sentence = yakinori.get_katakana_sentence(parsed_list, is_hatsuon=True)
>>> print(katakana_sentence)
ユーユーハクショワサイコーノマンガデス
# convert to Latin alphabet
>>> roma_sentence = yakinori.get_roma_sentence(parsed_list, is_hatsuon=True)
>>> print(roma_sentence)
yuーyuーhakushowasaikoーnomangadesu
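When all three scripts are needed, the calls shown above can be grouped so that the sentence is parsed only once. This is a minimal sketch: `convert_all` is a hypothetical helper, not part of yakinori, and it uses only the methods and the `is_hatsuon` keyword shown in this README.

```python
def convert_all(yakinori, sentence, is_hatsuon=False):
    """Parse once and return hiragana, katakana, and Latin-alphabet forms as a dict."""
    parsed_list = yakinori.get_parsed_list(sentence)
    # Pass is_hatsuon only when pronunciation is requested, mirroring the calls above.
    kwargs = {"is_hatsuon": True} if is_hatsuon else {}
    return {
        "hiragana": yakinori.get_hiragana_sentence(parsed_list, **kwargs),
        "katakana": yakinori.get_katakana_sentence(parsed_list, **kwargs),
        "roma": yakinori.get_roma_sentence(parsed_list, **kwargs),
    }
```

With an instance created as described earlier, `convert_all(yakinori, sentence)` returns the readings and `convert_all(yakinori, sentence, is_hatsuon=True)` returns the pronunciations.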