Japanese morphological analysis engine.
Project description
Janome is a Japanese morphological analysis engine written in pure Python.
General documentation:
https://mocobeta.github.io/janome/en/ (English)
https://mocobeta.github.io/janome/ (Japanese)
Requirements
Python 3.7+ is required.
Install
[Note] Building Janome consumes about 500 MB of memory.
(venv) $ python setup.py install
Run
(venv) $ python
>>> from janome.tokenizer import Tokenizer
>>> t = Tokenizer()
>>> for token in t.tokenize('すもももももももものうち'):
...     print(token)
...
すもも 名詞,一般,*,*,*,*,すもも,スモモ,スモモ
も 助詞,係助詞,*,*,*,*,も,モ,モ
もも 名詞,一般,*,*,*,*,もも,モモ,モモ
も 助詞,係助詞,*,*,*,*,も,モ,モ
もも 名詞,一般,*,*,*,*,もも,モモ,モモ
の 助詞,連体化,*,*,*,*,の,ノ,ノ
うち 名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
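Each printed line above consists of the surface form followed by nine comma-separated feature columns. The following is a minimal sketch, not part of Janome itself, that splits such a line into named fields. The names `ParsedToken` and `parse_token_line` are hypothetical, and the mapping of columns to field names assumes the MeCab-IPADIC feature layout (four part-of-speech levels, inflection type, inflection form, base form, reading, phonetic).

```python
from typing import NamedTuple


class ParsedToken(NamedTuple):
    """Fields of one printed token line (hypothetical helper type)."""
    surface: str
    part_of_speech: str  # the four part-of-speech levels, comma-joined
    infl_type: str
    infl_form: str
    base_form: str
    reading: str
    phonetic: str


def parse_token_line(line: str) -> ParsedToken:
    """Split a line such as 'すもも 名詞,一般,*,*,*,*,すもも,スモモ,スモモ'."""
    # The surface form and the feature string are separated by whitespace.
    surface, features = line.split(None, 1)
    cols = features.strip().split(",")
    # Assumption: exactly nine feature columns, as in the output above.
    if len(cols) != 9:
        raise ValueError(f"expected 9 feature columns, got {len(cols)}")
    return ParsedToken(
        surface=surface,
        part_of_speech=",".join(cols[:4]),
        infl_type=cols[4],
        infl_form=cols[5],
        base_form=cols[6],
        reading=cols[7],
        phonetic=cols[8],
    )


lines = [
    "すもも 名詞,一般,*,*,*,*,すもも,スモモ,スモモ",
    "うち 名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ",
]
tokens = [parse_token_line(line) for line in lines]
for tok in tokens:
    print(tok.surface, tok.part_of_speech, tok.reading)
```

When working inside Python, reading the attributes of the token objects directly is likely simpler than parsing printed text. This sketch is mainly useful for output that has already been saved to a file or a log.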
License
Janome is licensed under the Apache License 2.0 and uses the MeCab-IPADIC dictionary/statistical model.
See LICENSE.txt and NOTICE.txt for license details.
Acknowledgement
Special thanks to @ikawaha, @takuyaa, @nakagami and @janome_oekaki.
Copyright
Copyright(C) 2022, Tomoko Uchida. All rights reserved.
Download files
Source Distribution
Janome-0.4.2.tar.gz (18.8 MB)
Built Distribution
Janome-0.4.2-py2.py3-none-any.whl (19.7 MB)
Hashes for Janome-0.4.2-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | d292e458aa8a72b4aae51ae1e0cd91bf463beb95ad1ec118be7aab87eea1a454
MD5 | 45ed0009af565704c4cd07e2a581d6c3
BLAKE2b-256 | 2b0999a267382d699766366419773c2ef09e2dc4a033a4c48cc31b26bd509b04
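A downloaded wheel can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the standard library. The helper names `sha256_hex` and `verify_file` are hypothetical, and the local file path passed to `verify_file` is whatever location the wheel was saved to.

```python
import hashlib
import io
from typing import BinaryIO

# SHA256 digest of Janome-0.4.2-py2.py3-none-any.whl, copied from the table above.
EXPECTED_SHA256 = "d292e458aa8a72b4aae51ae1e0cd91bf463beb95ad1ec118be7aab87eea1a454"


def sha256_hex(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_file(path: str, expected_hex: str = EXPECTED_SHA256) -> bool:
    """Return True if the file at `path` hashes to `expected_hex`."""
    with open(path, "rb") as fh:
        return sha256_hex(fh) == expected_hex.strip().lower()


# Self-check on in-memory bytes, so the sketch runs without a download.
sample = io.BytesIO(b"janome")
self_check_ok = sha256_hex(sample) == hashlib.sha256(b"janome").hexdigest()
print("self-check passed:", self_check_ok)
```

Reading in chunks keeps memory use flat for a file of this size (about 20 MB), at the cost of a few extra lines compared with hashing the whole file at once.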