
UniDic packaged for Python


unidic-py

This is a version of UniDic packaged for use with pip.

This is based on UniDic 2.1.2, which is roughly 55 MB zipped or 300 MB unzipped. More recent versions of UniDic exist, but they are significantly larger, which makes packaging difficult.

This package distributes only the files necessary for using UniDic with MeCab. The large files are gzipped for distribution and unzipped the first time the library is imported. It would be better for MeCab to unzip on the fly when reading from disk, but it doesn't support that.
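The unzip-on-first-import step can be pictured roughly as follows. This is only a sketch, not the package's actual code: the function name `ensure_unzipped` and the example file name `sys.dic.gz` are assumptions made for illustration.

```python
import gzip
import shutil
from pathlib import Path


def ensure_unzipped(dicdir):
    """Decompress every *.gz file in dicdir once; return the paths written.

    Hypothetical helper: the real package may lay out and unpack its
    files differently.
    """
    written = []
    for gz_path in sorted(Path(dicdir).glob("*.gz")):
        # Drop only the final suffix, e.g. "sys.dic.gz" -> "sys.dic".
        target = gz_path.with_suffix("")
        if target.exists():
            continue  # already unpacked on an earlier call
        with gzip.open(gz_path, "rb") as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)  # stream, so large files stay off the heap
        written.append(target)
    return written
```

Calling a helper like this at import time means the cost is paid once. Later imports find the unpacked files and skip the work.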

Example use with fugashi, though mecab-python3 works the same way:

import fugashi
import unidic

# unidic.DICDIR is the path to the unpacked dictionary; MeCab's -d option selects it
tagger = fugashi.Tagger('-d{}'.format(unidic.DICDIR))
# that's it!
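For mecab-python3 the same pattern might look like the sketch below. The helper names `dicdir_option` and `make_mecab_tagger` are hypothetical and exist only to keep the example self-contained. `MeCab.Tagger` and `unidic.DICDIR` are the only library names used.

```python
def dicdir_option(dicdir):
    """Build the MeCab option string that selects a dictionary directory."""
    return "-d{}".format(dicdir)


def make_mecab_tagger():
    """Create a mecab-python3 tagger backed by the packaged UniDic.

    Imports are kept inside the function so this sketch can be loaded
    even where mecab-python3 and unidic are not installed.
    """
    import MeCab
    import unidic

    return MeCab.Tagger(dicdir_option(unidic.DICDIR))
```

With both packages installed, `make_mecab_tagger().parse(text)` should return MeCab's analysis of `text` as a string.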

License

The modern Japanese UniDic is available under the GPL, LGPL, or BSD license; see the included BSD license for the full text. UniDic is developed by NINJAL, the National Institute for Japanese Language and Linguistics.

The code in this repository is not written or maintained by NINJAL. The code is available under the MIT or WTFPL License, as you prefer.

Download files


Source Distribution

unidic-0.0.1.tar.gz (56.1 MB)
