UniDic packaged for Python

unidic-py

This is a version of UniDic packaged for use with pip.

It currently supports UniDic 2.3.0, the latest version. Note that the dictionary takes up 1 GB on disk after installation and can take a long time to download. If you want a smaller package, try unidic-lite.

After installing via pip, you need to download the dictionary using the following command:

python -m unidic download
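
Because the download is large and can be interrupted, it may be worth checking that the dictionary directory is complete before using it. The following is a minimal sketch, not part of unidic-py itself: `missing_dictionary_files` and `REQUIRED_FILES` are hypothetical names, and the file list is an assumption based on the usual layout of a compiled MeCab dictionary.

```python
import os

# Assumption: a compiled MeCab dictionary directory normally holds these files.
REQUIRED_FILES = ("dicrc", "sys.dic", "unk.dic", "matrix.bin", "char.bin")


def missing_dictionary_files(dicdir):
    """Return the names in REQUIRED_FILES that are absent from dicdir."""
    return [
        name
        for name in REQUIRED_FILES
        if not os.path.isfile(os.path.join(dicdir, name))
    ]


if __name__ == "__main__":
    try:
        import unidic
    except ImportError:
        print("unidic is not installed")
    else:
        missing = missing_dictionary_files(unidic.DICDIR)
        if missing:
            print("missing files: " + ", ".join(missing))
        else:
            print("dictionary looks complete")
```

If any files are reported missing, rerunning the download command above is the likely fix.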

Example use with fugashi, though mecab-python3 works the same way:

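import fugashi
import unidic
# unidic.DICDIR is the directory holding the downloaded dictionary;
# -d tells MeCab which dictionary directory to load
tagger = fugashi.Tagger('-d{}'.format(unidic.DICDIR))
# that's it!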
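
Once the tagger exists, a short sketch of using it might look like the following. `surface_and_lemma` is a hypothetical helper, not part of fugashi or unidic-py. It assumes each token exposes `surface` and `feature.lemma`, as fugashi nodes do with a UniDic dictionary. The demo at the bottom only runs if both packages are installed and the dictionary has been downloaded.

```python
import os


def surface_and_lemma(tagger, text):
    """Return (surface, lemma) pairs for each token the tagger yields."""
    return [(word.surface, word.feature.lemma) for word in tagger(text)]


if __name__ == "__main__":
    try:
        import fugashi
        import unidic
    except ImportError:
        fugashi = unidic = None

    # Only build a tagger when the dictionary directory is actually present.
    if fugashi is not None and os.path.isdir(unidic.DICDIR):
        tagger = fugashi.Tagger('-d{}'.format(unidic.DICDIR))
        text = "麩菓子は、麩を主材料とした日本の菓子。"
        for surface, lemma in surface_and_lemma(tagger, text):
            print(surface, lemma, sep="\t")
    else:
        print("fugashi or the UniDic dictionary is not available")
```

Keeping the tagger as a parameter means the same helper should work with any tagger object that is callable on a string and yields tokens with those attributes.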

Differences from the Official UniDic Release

This package makes a few changes to the official UniDic release to make it easier to use:

  • entries for 令和 have been added
  • single-character numeric and alphabetic words have been deleted
  • unk.def has been modified so unknown punctuation won't be marked as a noun

License

The modern Japanese UniDic is available under the GPL, LGPL, or BSD license; see here. UniDic is developed by NINJAL, the National Institute for Japanese Language and Linguistics. It is copyrighted by the UniDic Consortium and is distributed here under the terms of the BSD License.

The code in this repository is not written or maintained by NINJAL. The code is available under the MIT or WTFPL License, as you prefer.


Source Distribution

unidic-1.0.2.tar.gz (5.0 kB view details)

Uploaded Source

File details

Details for the file unidic-1.0.2.tar.gz.

File metadata

  • Download URL: unidic-1.0.2.tar.gz
  • Upload date:
  • Size: 5.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/47.1.1 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.8.5

File hashes

Hashes for unidic-1.0.2.tar.gz
Algorithm     Hash digest
SHA256        7f67f3e749eeaf2eb83c73fcafd7b6479d2180a9c9e1b740e2ac076da3497f36
MD5           ec18c7eddd275b9f2623cf2eb98e05c5
BLAKE2b-256   73f0f01bbde6d1ced17a53b6021ee3bdf5779c49ba7d13cc39e1c03cecbe262a
