
A Cython MeCab wrapper for fast, pythonic Japanese tokenization.


fugashi

Fugashi by Irasutoya

Fugashi is a Cython wrapper for MeCab, a Japanese tokenizer and morphological analysis tool. Wheels are provided for Linux, OSX, and Win64, and UniDic is easy to install.

Issues do not need to be written in English.

See the blog post for background on why Fugashi exists and some of the design decisions.

If you are on an unsupported platform (like PowerPC), you'll need to install MeCab first. It's recommended you install from source.

Usage

from fugashi import Tagger

tagger = Tagger('-Owakati')
text = "麩菓子は、麩を主材料とした日本の菓子。"
tagger.parse(text)
# => '麩 菓子 は 、 麩 を 主材 料 と し た 日本 の 菓子 。'
for word in tagger(text):
    print(word, word.feature.lemma, word.pos, sep='\t')
    # "feature" is the UniDic feature data as a named tuple
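
The loop above can be wrapped in a small helper when you want the token data as a list instead of printed output. This is only a sketch: `token_rows` is a hypothetical name, not part of fugashi, and it assumes nothing beyond a tagger that is callable and yields nodes with `surface`, `feature.lemma`, and `pos`, as in the example.

```python
from typing import Callable, Iterable, List, Tuple


def token_rows(tagger: Callable[[str], Iterable], text: str) -> List[Tuple[str, str, str]]:
    """Collect (surface, lemma, part of speech) for each token.

    `tagger` is any callable that yields node objects exposing `surface`,
    `feature.lemma`, and `pos`, as in the Usage example.
    """
    return [(word.surface, word.feature.lemma, word.pos) for word in tagger(text)]


# Typical use, assuming fugashi and a UniDic dictionary are installed:
#
#     from fugashi import Tagger
#     rows = token_rows(Tagger(), "麩菓子は、麩を主材料とした日本の菓子。")
#     for surface, lemma, pos in rows:
#         print(surface, lemma, pos, sep="\t")
```

Passing the tagger in as an argument keeps the helper independent of which dictionary is installed.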

Installing a Dictionary

Fugashi requires a dictionary. UniDic is recommended, and two easy-to-install versions are provided.

  • unidic-lite, a 2013 version of UniDic that's relatively small
  • unidic, the latest UniDic 2.3.0, which is 1 GB on disk and requires a separate download step

If you just want to make sure things work you can start with unidic-lite, but for more serious processing unidic is recommended. For production use you'll generally want to generate your own dictionary too; for details see the MeCab documentation.

To get either of these dictionaries, you can install the package directly with pip, or install it as a fugashi extra:

pip install fugashi[unidic-lite]

# The full version of UniDic requires a separate download step
pip install fugashi[unidic]
python -m unidic download
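
If you are unsure which dictionary package ended up in your environment, a quick check along these lines may help. It is a minimal sketch using only the standard library: `installed_dictionaries` is a hypothetical helper, and it only tests whether the `unidic` or `unidic_lite` module can be imported. It does not confirm that the separate download step for the full UniDic has been run.

```python
from importlib.util import find_spec
from typing import List

# Import names of the two dictionary packages; note the underscore in
# unidic_lite (the pip name is unidic-lite).
DICTIONARY_MODULES = ("unidic", "unidic_lite")


def installed_dictionaries() -> List[str]:
    """Return the dictionary modules that can be imported in this environment."""
    return [name for name in DICTIONARY_MODULES if find_spec(name) is not None]


if __name__ == "__main__":
    found = installed_dictionaries()
    if found:
        print("Found dictionary packages:", ", ".join(found))
    else:
        print("No UniDic package found; try: pip install fugashi[unidic-lite]")
```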

Dictionary Use

Fugashi is written with the assumption you'll use UniDic to process Japanese, but it supports arbitrary dictionaries.

If you're using a dictionary besides UniDic you can use the GenericTagger like this:

from fugashi import GenericTagger
tagger = GenericTagger()
text = "麩菓子は、麩を主材料とした日本の菓子。"

# parse can be used as normal
tagger.parse('something')
# features from the dictionary can be accessed by field numbers
for word in tagger(text):
    print(word.surface, word.feature[0])

You can also create a dictionary wrapper to get feature information as a named tuple.

from fugashi import GenericTagger, create_feature_wrapper
CustomFeatures = create_feature_wrapper('CustomFeatures', 'alpha beta gamma')
tagger = GenericTagger(wrapper=CustomFeatures)
# the same sample sentence as in the Usage section
text = "麩菓子は、麩を主材料とした日本の菓子。"
for word in tagger.parseToNodeList(text):
    print(word.surface, word.feature.alpha)
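
To illustrate what a named-tuple wrapper buys you, here is a standard-library sketch that turns one comma-separated feature string into a named tuple. It is not fugashi's implementation: `parse_features` and the sample feature string are made up for the example, and the three field names simply mirror the ones used above.

```python
import csv
from collections import namedtuple

# Hypothetical three-column feature layout, matching the field names above.
CustomFeatures = namedtuple("CustomFeatures", "alpha beta gamma")


def parse_features(raw: str) -> CustomFeatures:
    """Split one comma-separated feature string into a named tuple."""
    # csv handles quoted fields that themselves contain commas.
    fields = next(csv.reader([raw]))
    return CustomFeatures(*fields)


feature = parse_features('noun,"a,b",x')
print(feature.alpha)   # noun
print(feature[1])      # a,b
```

Named access (`feature.alpha`) and positional access (`feature[1]`) refer to the same data, so code written against field numbers keeps working after a wrapper is added.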

Alternatives

If you have a problem with Fugashi, feel free to open an issue. However, in some cases a different library may be a better fit.

  • If you don't want to deal with installing MeCab at all, try SudachiPy.
  • If you need to work with Korean, try KoNLPy.

License and Copyright Notice

Fugashi is released under the terms of the MIT license. Please copy it far and wide.

Fugashi is a wrapper for MeCab, and Fugashi wheels include MeCab binaries. MeCab is copyrighted free software by Taku Kudo <taku@chasen.org> and Nippon Telegraph and Telephone Corporation, and is redistributed under the BSD License.



