Python wrapper for the MeCab morphological analyzer for Japanese

This is a Python wrapper for the MeCab morphological analyzer for Japanese text. It currently works with Python 3.8 and greater.

Note: If you are using macOS Big Sur, you'll need to upgrade pip to version 20.3 or higher to use wheels, due to a pip issue.

You do not need to write issues in English.

Note that Windows wheels require a Microsoft Visual C++ Redistributable, so be sure to install that.

Basic usage

>>> import MeCab
>>> wakati = MeCab.Tagger("-Owakati")
>>> wakati.parse("pythonが大好きです").split()
['python', 'が', '大好き', 'です']

>>> tagger = MeCab.Tagger()
>>> print(tagger.parse("pythonが大好きです"))
python  python  python  python  名詞-普通名詞-一般
が      ガ      ガ      が      助詞-格助詞
大好き  ダイスキ        ダイスキ        大好き  形状詞-一般
です    デス    デス    です    助動詞  助動詞-デス     終止形-一般
EOS

The API for mecab-python3 closely follows the API for MeCab itself, even when this makes it not very “Pythonic.” Please consult the official MeCab documentation for more information.
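Because parse() returns plain text, it can be convenient to turn that text into Python data. The following is a minimal sketch, not part of mecab-python3: parse_tagger_output is a hypothetical helper, and it assumes the fields on each line are separated by tabs, as in the unidic-lite output shown above. Other dictionaries may format their output differently.

```python
def parse_tagger_output(raw):
    """Split MeCab's text output into (surface, fields) pairs.

    Assumes one token per line with tab-separated fields, and stops at
    the "EOS" line that marks the end of a sentence.
    """
    tokens = []
    for line in raw.splitlines():
        if line == "EOS":
            break
        if not line.strip():
            continue
        surface, *fields = line.split("\t")
        tokens.append((surface, fields))
    return tokens


# Hard-coded sample in the same shape as the output above, so this
# sketch runs even without MeCab or a dictionary installed.
sample = "\n".join([
    "\t".join(["python", "python", "python", "python", "名詞-普通名詞-一般"]),
    "\t".join(["です", "デス", "デス", "です", "助動詞", "助動詞-デス", "終止形-一般"]),
    "EOS",
])
tokens = parse_tagger_output(sample)
for surface, fields in tokens:
    print(surface, fields)
```

With MeCab and a dictionary installed, the same helper could be applied to the result of tagger.parse(...) instead of the hard-coded sample.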

Installation

Binary wheels are available for macOS, Linux, and Windows (64-bit), and are installed by default when you use pip:

pip install mecab-python3

These wheels include a copy of the MeCab library, but not a dictionary. In order to use MeCab you'll need to install a dictionary. unidic-lite is a good one to start with:

pip install unidic-lite

To build from source using pip:

pip install --no-binary :all: mecab-python3
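After installing, a quick way to see whether both the wrapper and a dictionary package are visible to Python is to look the modules up without importing them. This is only a sketch: missing_modules is a hypothetical helper, MeCab is the module name used in the examples above, and unidic_lite is assumed to be the import name of the unidic-lite package.

```python
from importlib.util import find_spec


def missing_modules(names=("MeCab", "unidic_lite")):
    """Return the module names from `names` that Python cannot locate."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("MeCab wrapper and dictionary package were both found.")
```

An empty result only shows that the packages are installed; it does not prove that MeCab can load the dictionary.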

Dictionaries

In order to use MeCab, you must install a dictionary. There are many different dictionaries available for MeCab. These UniDic packages, which include slight modifications for ease of use, are recommended:

  • unidic: The latest full UniDic.
  • unidic-lite: A slightly modified UniDic 2.1.2, chosen for its small size.

The dictionaries below are not recommended due to being unmaintained for many years, but they are available for use with legacy applications.

For more details on the differences between dictionaries see here.

Common Issues

If you get a RuntimeError when you try to run MeCab, here are some things to check:

Windows Redistributable

You must install the Microsoft Visual C++ Redistributable to use this package on Windows.

Installing a Dictionary

Run pip install unidic-lite and check whether MeCab then works. If that fixes your problem, then either you had no dictionary installed, or you need to specify the path to your own dictionary like this:

tagger = MeCab.Tagger('-r /dev/null -d /usr/local/lib/mecab/dic/mydic')

Note: on Windows, use nul instead of /dev/null. Alternately, if you have a mecabrc you can use the path after -r.
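To avoid hard-coding /dev/null or nul, the argument string can be built with os.devnull, which Python sets to the right name for the current platform. This is a minimal sketch: tagger_args is a hypothetical helper, the dictionary path is the placeholder from the example above, and paths containing spaces are not handled.

```python
import os


def tagger_args(dicdir, mecabrc=None):
    """Build an argument string for MeCab.Tagger.

    If no mecabrc is given, fall back to the platform's null device
    ("/dev/null" on POSIX systems, "nul" on Windows).
    """
    rc = mecabrc if mecabrc is not None else os.devnull
    return f"-r {rc} -d {dicdir}"


args = tagger_args("/usr/local/lib/mecab/dic/mydic")
print(args)
```

The resulting string would then be passed as MeCab.Tagger(args).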

Specifying a mecabrc

If you get this error:

error message: [ifs] no such file or directory: /usr/local/etc/mecabrc

You need to specify a mecabrc file. It's OK to specify an empty file; it just has to exist. You can specify a mecabrc with -r. This may be necessary on Debian or Ubuntu, where the mecabrc is in /etc/mecabrc.

You can specify an empty mecabrc like this:

tagger = MeCab.Tagger('-r/dev/null -d/home/hoge/mydic')
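One way to handle both cases is to look for a mecabrc in the locations mentioned above and fall back to an empty one only if none is found. This is a sketch under assumptions: find_mecabrc is a hypothetical helper, and the two candidate paths are simply the ones named in this section, not an exhaustive list.

```python
import os


def find_mecabrc(candidates=("/usr/local/etc/mecabrc", "/etc/mecabrc")):
    """Return the first existing mecabrc, or the null device if none exists."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    # The null device stands in for an empty mecabrc.
    return os.devnull


rc = find_mecabrc()
print("Using mecabrc:", rc)
```

The chosen path can then be passed to the tagger after -r, together with -d and your dictionary directory.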

Using Unsupported Output Modes like -Ochasen

Chasen output is not a built-in feature of MeCab; you must define it in your dicrc or mecabrc. Notably, UniDic does not include the Chasen output format. Please see the MeCab documentation.

Alternatives

  • fugashi is a Cython wrapper for MeCab with a Pythonic interface, by the current maintainer of this library
  • SudachiPy is a modern tokenizer with an actively maintained dictionary
  • pymecab-ko is a wrapper of the Korean MeCab fork mecab-ko based on mecab-python3
  • KoNLPy is a library for Korean NLP that includes a MeCab wrapper

Licensing

Like MeCab itself, mecab-python3 is copyrighted free software by Taku Kudo taku@chasen.org and Nippon Telegraph and Telephone Corporation, and is distributed under a 3-clause BSD license (see the file BSD). Alternatively, it may be redistributed under the terms of the GNU General Public License, version 2 (see the file GPL) or the GNU Lesser General Public License, version 2.1 (see the file LGPL).
