Japanese morphological analysis engine.
Project description
Janome is a Japanese morphological analysis engine written in pure Python.
General documentation: http://mocobeta.github.io/janome/ (in Japanese)
Requirements
Python 2.7.x or 3.4+ is required.
Install
[Note] Building consumes about 500 MB of memory.
(venv) $ python setup.py install
Run
(venv) $ python
>>> from janome.tokenizer import Tokenizer
>>> t = Tokenizer()
>>> for token in t.tokenize(u'すもももももももものうち'):
...     print(token)
...
すもも 名詞,一般,*,*,*,*,すもも,スモモ,スモモ
も 助詞,係助詞,*,*,*,*,も,モ,モ
もも 名詞,一般,*,*,*,*,もも,モモ,モモ
も 助詞,係助詞,*,*,*,*,も,モ,モ
もも 名詞,一般,*,*,*,*,もも,モモ,モモ
の 助詞,連体化,*,*,*,*,の,ノ,ノ
うち 名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
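Each printed line above is a surface form followed by nine comma-separated features. The following is a minimal sketch, not part of janome itself, that splits such a line into named fields using only the standard library. The helper `parse_token_line`, the `ParsedToken` type, and the field names are hypothetical labels chosen for illustration; the field order is assumed to follow the MeCab-IPADIC layout (four part-of-speech levels, inflection type, inflection form, base form, reading, pronunciation).

```python
# -*- coding: utf-8 -*-
from collections import namedtuple

# Hypothetical container; the field names are illustrative labels, not janome's API.
ParsedToken = namedtuple(
    "ParsedToken",
    ["surface", "part_of_speech", "infl_type", "infl_form",
     "base_form", "reading", "phonetic"],
)


def parse_token_line(line):
    """Split one printed token line into its surface form and features.

    Assumes the surface form and the feature list are separated by
    whitespace (a tab or a space) and that there are exactly nine features.
    """
    surface, features = line.split(None, 1)
    fields = features.strip().split(",")
    if len(fields) != 9:
        raise ValueError("expected 9 features, got %d" % len(fields))
    return ParsedToken(
        surface=surface,
        part_of_speech=",".join(fields[:4]),  # four part-of-speech levels
        infl_type=fields[4],
        infl_form=fields[5],
        base_form=fields[6],
        reading=fields[7],
        phonetic=fields[8],
    )


parsed = parse_token_line(u"うち 名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ")
print(parsed.base_form)  # prints うち
```

Parsing the printed text is only useful when the output has already been saved somewhere; when working with the tokenizer directly, reading the values from the token objects is likely the simpler route.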
License
Janome is licensed under the Apache License 2.0 and uses the MeCab-IPADIC dictionary/statistical model.
See LICENSE.txt and NOTICE.txt for license details.
Acknowledgement
Special thanks to @ikawaha and @takuya_a.
Copyright
Copyright(C) 2015, moco_beta. All rights reserved.
Download files
Source Distribution
Janome-0.2.7.tar.gz (13.6 MB)
File details
Details for the file Janome-0.2.7.tar.gz.
File metadata
- Download URL: Janome-0.2.7.tar.gz
- Upload date:
- Size: 13.6 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c1d62004f503698801c01069f7c2d2c890eb632f2da0aea2cbf90b056abffdc3 |
| MD5 | 9554dea7850682675d4358016e64006f |
| BLAKE2b-256 | 1ef3ab90fea3333a8ab4f062b8315a24c0159fca9ce5791ce4a0f38e527f018e |
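The SHA256 digest listed above can be used to check a downloaded archive before installing it. The following is a minimal sketch using Python's standard `hashlib` module; the function names `sha256_of_file` and `verify_download` are hypothetical, and the sketch assumes the archive has already been saved to a local path of your choosing.

```python
import hashlib

# SHA256 digest published for Janome-0.2.7.tar.gz in the table above.
EXPECTED_SHA256 = (
    "c1d62004f503698801c01069f7c2d2c890eb632f2da0aea2cbf90b056abffdc3"
)


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so a large archive is never held in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_SHA256):
    """Return True if the file at `path` matches the expected digest."""
    return sha256_of_file(path) == expected
```

For example, `verify_download("Janome-0.2.7.tar.gz")` would return `True` only if the local file matches the published digest; the path here is an assumption about where the file was saved.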