A Cython MeCab wrapper for fast, pythonic Japanese tokenization.

fugashi

Fugashi is a Cython wrapper for MeCab, a Japanese tokenizer and morphological analysis tool. Wheels are provided for Linux, macOS, and 64-bit Windows, and UniDic is easy to install.

issueを英語で書く必要はありません。 (Issues do not need to be written in English.)

See the blog post for background on why Fugashi exists and some of the design decisions.

If you are on an unsupported platform (like PowerPC), you'll need to install MeCab first. It's recommended you install from source.

Usage

from fugashi import Tagger

tagger = Tagger('-Owakati')
text = "麩菓子は、麩を主材料とした日本の菓子。"
tagger.parse(text)
# => '麩 菓子 は 、 麩 を 主材 料 と し た 日本 の 菓子 。'
for word in tagger(text):
    print(word, word.feature.lemma, word.pos, sep='\t')
    # "feature" is the UniDic feature data as a named tuple
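
The wakati output shown above is a plain space-separated string, so it can be post-processed with ordinary Python. Here is a minimal sketch, assuming the output string from the comment above (pasted as a literal so the snippet runs without MeCab or a dictionary); the variable names are only illustrative.

```python
from collections import Counter

# Output of tagger.parse(text) from the example above, pasted as a literal
# so this sketch runs without fugashi or a dictionary installed.
wakati = "麩 菓子 は 、 麩 を 主材 料 と し た 日本 の 菓子 。"

# -Owakati separates tokens with spaces, so str.split() recovers them.
tokens = wakati.split()

# Count how often each token appears.
counts = Counter(tokens)

# Joining the tokens without spaces gives back the original sentence.
restored = "".join(tokens)

print(len(tokens))
print(counts.most_common(2))
print(restored)
```

For anything beyond surface forms (lemmas, parts of speech), iterate over `tagger(text)` as in the example above instead of splitting the string.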

Installing a Dictionary

Fugashi requires a dictionary. UniDic is recommended, and two easy-to-install versions are provided.

  • unidic-lite, a 2013 version of UniDic that's relatively small
  • unidic, the latest UniDic 2.3.0, which is 1 GB on disk and requires a separate download step

If you just want to make sure things work you can start with unidic-lite, but for more serious processing unidic is recommended. For production use you'll generally want to generate your own dictionary too; for details see the MeCab documentation.

To get either dictionary, install it directly with pip or use the package extras shown below:

pip install fugashi[unidic-lite]

# The full version of UniDic requires a separate download step
pip install fugashi[unidic]
python -m unidic download
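
After installing, you may want to confirm from Python which dictionary packages are visible to the interpreter. The following is a rough sketch using only the standard library. It assumes the import names are `unidic_lite` and `unidic`, and the helper name `installed_dictionaries` is made up for this example.

```python
import importlib.util

# Assumed import names for the two PyPI packages (note the underscore).
DICTIONARY_MODULES = {"unidic-lite": "unidic_lite", "unidic": "unidic"}


def installed_dictionaries():
    """Map each dictionary package name to whether its module can be found."""
    return {
        package: importlib.util.find_spec(module) is not None
        for package, module in DICTIONARY_MODULES.items()
    }


status = installed_dictionaries()
for package, found in status.items():
    print(f"{package}: {'installed' if found else 'not installed'}")
```

Note that this only checks that the package is importable. For the full unidic package, the dictionary data itself still has to be fetched with the separate download step.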

Dictionary Use

Fugashi is written with the assumption that you'll use UniDic to process Japanese, but it supports arbitrary dictionaries.

If you're using a dictionary other than UniDic, you can use the GenericTagger like this:

from fugashi import GenericTagger

tagger = GenericTagger()
# same sample sentence as in the Usage section
text = "麩菓子は、麩を主材料とした日本の菓子。"

# parse can be used as normal
tagger.parse('something')
# features from the dictionary can be accessed by field numbers
for word in tagger(text):
    print(word.surface, word.feature[0])

You can also create a dictionary wrapper to get feature information as a named tuple.

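from fugashi import GenericTagger, create_feature_wrapper

# 'alpha beta gamma' are placeholder field names; use your dictionary's fields
CustomFeatures = create_feature_wrapper('CustomFeatures', 'alpha beta gamma')
tagger = GenericTagger(wrapper=CustomFeatures)
# `text` is the sample sentence defined in the Usage section
for word in tagger.parseToNodeList(text):
    print(word.surface, word.feature.alpha)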
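
To make the idea of a feature wrapper more concrete, here is a standard-library sketch that maps a comma-separated feature string onto named fields with `collections.namedtuple`. It is an illustration of the concept only, not fugashi's implementation. The helper name `to_features` and the sample feature string are hypothetical.

```python
import csv
from collections import namedtuple

# Same field names as the example above; missing fields default to None.
CustomFeatures = namedtuple(
    "CustomFeatures", "alpha beta gamma", defaults=(None, None, None)
)


def to_features(feature_string):
    """Split a MeCab-style CSV feature string into a CustomFeatures tuple."""
    # csv handles quoted fields that themselves contain commas.
    fields = next(csv.reader([feature_string]))
    # Ignore any extra fields beyond those the wrapper names.
    return CustomFeatures(*fields[: len(CustomFeatures._fields)])


# Hypothetical feature string, not taken from a real dictionary.
features = to_features('noun,"common,general"')
print(features.alpha, features.beta, features.gamma)
```

Because the result is a named tuple, positional access such as `features[0]` keeps working alongside `features.alpha`, which mirrors the two access styles shown above.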

Alternatives

If you have a problem with Fugashi, feel free to open an issue. However, in some cases a different library may be a better fit.

  • If you don't want to deal with installing MeCab at all, try SudachiPy.
  • If you need to work with Korean, try KoNLPy.

License and Copyright Notice

Fugashi is released under the terms of the MIT license. Please copy it far and wide.

Fugashi is a wrapper for MeCab, and Fugashi wheels include MeCab binaries. MeCab is copyrighted free software by Taku Kudo <taku@chasen.org> and Nippon Telegraph and Telephone Corporation, and is redistributed under the BSD License.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

fugashi-1.0.4.tar.gz (334.9 kB)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

fugashi-1.0.4-cp38-cp38-win_amd64.whl (500.0 kB)

Uploaded CPython 3.8, Windows x86-64

fugashi-1.0.4-cp38-cp38-manylinux1_x86_64.whl (487.9 kB)

Uploaded CPython 3.8

fugashi-1.0.4-cp38-cp38-macosx_10_14_x86_64.whl (283.0 kB)

Uploaded CPython 3.8, macOS 10.14+ x86-64

fugashi-1.0.4-cp37-cp37m-win_amd64.whl (499.0 kB)

Uploaded CPython 3.7m, Windows x86-64

fugashi-1.0.4-cp37-cp37m-manylinux1_x86_64.whl (477.2 kB)

Uploaded CPython 3.7m

fugashi-1.0.4-cp37-cp37m-macosx_10_14_x86_64.whl (282.4 kB)

Uploaded CPython 3.7m, macOS 10.14+ x86-64

fugashi-1.0.4-cp36-cp36m-win_amd64.whl (498.9 kB)

Uploaded CPython 3.6m, Windows x86-64

fugashi-1.0.4-cp36-cp36m-manylinux1_x86_64.whl (476.7 kB)

Uploaded CPython 3.6m

fugashi-1.0.4-cp36-cp36m-macosx_10_14_x86_64.whl (283.2 kB)

Uploaded CPython 3.6m, macOS 10.14+ x86-64

fugashi-1.0.4-cp35-cp35m-win_amd64.whl (497.4 kB)

Uploaded CPython 3.5m, Windows x86-64

fugashi-1.0.4-cp35-cp35m-manylinux1_x86_64.whl (473.1 kB)

Uploaded CPython 3.5m

fugashi-1.0.4-cp35-cp35m-macosx_10_14_x86_64.whl (280.7 kB)

Uploaded CPython 3.5m, macOS 10.14+ x86-64

File details

Details for the file fugashi-1.0.4.tar.gz.

File metadata

  • Download URL: fugashi-1.0.4.tar.gz
  • Upload date:
  • Size: 334.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.8.5

File hashes

Hashes for fugashi-1.0.4.tar.gz
Algorithm Hash digest
SHA256 f60c4ee1937dfdfac98a5fd99602adca4f130cd9b4b9d0120b9f4e14376f91d5
MD5 bede8e9c91d049b1471e3c85e3336751
BLAKE2b-256 a50fd257833e9103b23240e1cc4573c09451a8333182cb98d76871a5978a0d52

See more details on using hashes here.
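
As a rough illustration of how the published digests can be used, the sketch below computes the SHA256 of a downloaded file with the standard library and compares it against a listed value. The function names are made up for this example, and it assumes the file has already been downloaded to a local path.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published value, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()


# SHA256 listed above for fugashi-1.0.4.tar.gz
EXPECTED = "f60c4ee1937dfdfac98a5fd99602adca4f130cd9b4b9d0120b9f4e14376f91d5"
```

With the source distribution saved locally, `matches_published_hash("fugashi-1.0.4.tar.gz", EXPECTED)` should return True only if the file is byte-for-byte the one that was uploaded.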

File details

Details for the file fugashi-1.0.4-cp38-cp38-win_amd64.whl.

File metadata

  • Download URL: fugashi-1.0.4-cp38-cp38-win_amd64.whl
  • Upload date:
  • Size: 500.0 kB
  • Tags: CPython 3.8, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.3.1 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.8.5

File hashes

Hashes for fugashi-1.0.4-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 6be2748d4796d8f62c7efe4d4c7c8ee44ec2968c92b50edcac08c557be0cab50
MD5 e2af4b14e256c664d27338326ce51949
BLAKE2b-256 a9fbb66e2522c7c535941c5fee0eb25b26d53324ac9e906b8601d0aaf342abfa

File details

Details for the file fugashi-1.0.4-cp38-cp38-manylinux1_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.4-cp38-cp38-manylinux1_x86_64.whl
  • Upload date:
  • Size: 487.9 kB
  • Tags: CPython 3.8
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.8.5

File hashes

Hashes for fugashi-1.0.4-cp38-cp38-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 7a6606cec6e6a7a76dbf6342d7dfb25da57c8faff72ac0f379eaa2c56343cd39
MD5 7c64040d855c13a53751a6238e36630a
BLAKE2b-256 ed4fe4b578a144eacfa8c1871f78ac25dcb84b198e603116047b37f3022c5fcd

File details

Details for the file fugashi-1.0.4-cp38-cp38-macosx_10_14_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.4-cp38-cp38-macosx_10_14_x86_64.whl
  • Upload date:
  • Size: 283.0 kB
  • Tags: CPython 3.8, macOS 10.14+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.3.1 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.8.5

File hashes

Hashes for fugashi-1.0.4-cp38-cp38-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 39578bb62ffa3ddddfc6927d36ee23733319e416c3237111802bbd9f6169984b
MD5 7fed3a625b702339b6455e39d38160b8
BLAKE2b-256 a7597e846eea6819f9c44fca16e261157ef78b62670bc9ebcc00991b35ffdeaf

File details

Details for the file fugashi-1.0.4-cp37-cp37m-win_amd64.whl.

File metadata

  • Download URL: fugashi-1.0.4-cp37-cp37m-win_amd64.whl
  • Upload date:
  • Size: 499.0 kB
  • Tags: CPython 3.7m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.3.1 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.7.8

File hashes

Hashes for fugashi-1.0.4-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 74f8315f804c33d52bb5844c1808b50a15f8ff67bb85f9a0ac714826b5dc7f0f
MD5 fff9e527861d2e7df56ee38e3690dfb2
BLAKE2b-256 64b75a38baa64c251802b49a81f468d160a3c3ac81847e3f3cb8caec35e67f80

File details

Details for the file fugashi-1.0.4-cp37-cp37m-manylinux1_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.4-cp37-cp37m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 477.2 kB
  • Tags: CPython 3.7m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.8.5

File hashes

Hashes for fugashi-1.0.4-cp37-cp37m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 dc3682974ba0920e58122712fec33d30dd58e4d3ea2e4bd384c598f4d071c7e7
MD5 68517d55092e2801a055cfd3b9a88446
BLAKE2b-256 69feed3258de1eb49bcd85d2fbf3ca70c86a6c80493b16886bf7e2546b413f7b

File details

Details for the file fugashi-1.0.4-cp37-cp37m-macosx_10_14_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.4-cp37-cp37m-macosx_10_14_x86_64.whl
  • Upload date:
  • Size: 282.4 kB
  • Tags: CPython 3.7m, macOS 10.14+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.3.1 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.7.8

File hashes

Hashes for fugashi-1.0.4-cp37-cp37m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 e1e6b33d625c4109244de60cb77ced8e9b5f20c3b9daf28ed57d48dc5a29373d
MD5 1952b089a562ee71bdfd4816bebb1d97
BLAKE2b-256 700d2934c77750135922d82e344444abddab41eaf8cc7321ce3ca9905f00cca5

File details

Details for the file fugashi-1.0.4-cp36-cp36m-win_amd64.whl.

File metadata

  • Download URL: fugashi-1.0.4-cp36-cp36m-win_amd64.whl
  • Upload date:
  • Size: 498.9 kB
  • Tags: CPython 3.6m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.3.1 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.6.8

File hashes

Hashes for fugashi-1.0.4-cp36-cp36m-win_amd64.whl
Algorithm Hash digest
SHA256 9f79269d752d069e3772f40b7147e54ca8e50ed1e5352e4a4dd9888792e6bb65
MD5 40125d9b6ab12a5ce7d4f3b45296f0f9
BLAKE2b-256 1e71d1a3e23715419838fccf0ee2028667cebf0c913115a95ac72fc15b5c7ec7

File details

Details for the file fugashi-1.0.4-cp36-cp36m-manylinux1_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.4-cp36-cp36m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 476.7 kB
  • Tags: CPython 3.6m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.8.5

File hashes

Hashes for fugashi-1.0.4-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 4daf6dacc0c65b0c262b7570ca70e890602d1fc186383b1b888bf97404b57cd2
MD5 4b34f459b52f3ccc1bf6b220a67cec98
BLAKE2b-256 320cd0bf73e1a90aeb3e696c7741a812d4b86adc31a8a9783cc92c535ae29016

File details

Details for the file fugashi-1.0.4-cp36-cp36m-macosx_10_14_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.4-cp36-cp36m-macosx_10_14_x86_64.whl
  • Upload date:
  • Size: 283.2 kB
  • Tags: CPython 3.6m, macOS 10.14+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.3.1 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.6.11

File hashes

Hashes for fugashi-1.0.4-cp36-cp36m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 d486dca2dd7d6723eb3a68593c9f0faba7a7cead6bb3a37afca3c9edbf8c9913
MD5 87c4ec668dc27ca89fe1504305effdd2
BLAKE2b-256 a6eb7f564c117253ab70a69273b769addea640192e9163def12792dccb072bc1

File details

Details for the file fugashi-1.0.4-cp35-cp35m-win_amd64.whl.

File metadata

  • Download URL: fugashi-1.0.4-cp35-cp35m-win_amd64.whl
  • Upload date:
  • Size: 497.4 kB
  • Tags: CPython 3.5m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.15.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.3.1 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.5.4

File hashes

Hashes for fugashi-1.0.4-cp35-cp35m-win_amd64.whl
Algorithm Hash digest
SHA256 e951446b994c0fc55e65e748c29c1203867b2fcc9f132ef29f32fc3330cb6c26
MD5 c68a0f31689019dbb24a8330f3be82a4
BLAKE2b-256 56f7c1a466a7105c865f699f9cc9bee3ebaf8a1da04d66d91dbd58a0d8dc0532

File details

Details for the file fugashi-1.0.4-cp35-cp35m-manylinux1_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.4-cp35-cp35m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 473.1 kB
  • Tags: CPython 3.5m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.8.5

File hashes

Hashes for fugashi-1.0.4-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 1e187305f4b23d8f596591f60cfb81815ce6ff0aaaa167b132bd01c7a1699308
MD5 a1db2cc4fc64ed0a70947a94e84db472
BLAKE2b-256 e96c7581b1aecc19f1e67bbea8cd18ecd48a857e4711564a16e3170637b2ffdf

File details

Details for the file fugashi-1.0.4-cp35-cp35m-macosx_10_14_x86_64.whl.

File metadata

  • Download URL: fugashi-1.0.4-cp35-cp35m-macosx_10_14_x86_64.whl
  • Upload date:
  • Size: 280.7 kB
  • Tags: CPython 3.5m, macOS 10.14+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.15.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.3.1 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.5.9

File hashes

Hashes for fugashi-1.0.4-cp35-cp35m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 36c3e22267609705fd929364674f270515a34ed961570d2f4c3d8e9f7fb36d96
MD5 2d21eb1b6d3cd23d392e3bd8a878d685
BLAKE2b-256 9ac0e8caf566b28b471f610dd04ab8e5ea1836a4eef4fe2677316c3660165789
