A Cython MeCab wrapper for fast, pythonic Japanese tokenization.

fugashi

Fugashi by Irasutoya

Fugashi is a Cython wrapper for MeCab, a Japanese tokenizer and morphological analysis tool. Wheels are provided for Linux, macOS, and 64-bit Windows, and UniDic is easy to install.

You don't need to write issues in English (issueを英語で書く必要はありません).

See the blog post for background on why Fugashi exists and some of the design decisions.

If you are on an unsupported platform (like PowerPC), you'll need to install MeCab first. It's recommended you install from source.

Usage

from fugashi import Tagger

tagger = Tagger('-Owakati')
text = "麩菓子は、麩を主材料とした日本の菓子。"
tagger.parse(text)
# => '麩 菓子 は 、 麩 を 主材 料 と し た 日本 の 菓子 。'
for word in tagger(text):
    print(word, word.feature.lemma, word.pos, sep='\t')
    # "feature" is the UniDic feature data as a named tuple
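
As a small sketch of how the node attributes above might be used downstream, the helper below counts lemmas over any iterable of objects that expose `feature.lemma`, as the nodes returned by `tagger(text)` do. The name `lemma_counts` is hypothetical and not part of fugashi, and skipping missing lemmas is a defensive assumption rather than documented behavior.

```python
from collections import Counter


def lemma_counts(words):
    """Count lemmas over node-like objects that expose `.feature.lemma`.

    Nodes whose lemma is missing or None are skipped.
    """
    counts = Counter()
    for word in words:
        lemma = getattr(word.feature, "lemma", None)
        if lemma is not None:
            counts[lemma] += 1
    return counts
```

With fugashi and a UniDic dictionary installed, this could be called as `lemma_counts(Tagger()(text))`.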

Installing a Dictionary

Fugashi requires a dictionary. UniDic is recommended, and two easy-to-install versions are provided.

  • unidic-lite, a 2013 version of UniDic that's relatively small
  • unidic, the latest UniDic 2.3.0, which is 1GB on disk and requires a separate download step

If you just want to make sure things work you can start with unidic-lite, but for more serious processing unidic is recommended. For production use you'll generally want to generate your own dictionary too; for details see the MeCab documentation.

To get either of these dictionaries, you can install the package directly with pip, or install it as a fugashi extra as shown below:

pip install fugashi[unidic-lite]

# The full version of UniDic requires a separate download step
pip install fugashi[unidic]
python -m unidic download
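
After installing, a quick way to check which of the two dictionary packages is importable is sketched below. It relies only on the standard library and on the import names `unidic_lite` and `unidic`; the function name `installed_unidic_packages` is made up for illustration. Note that finding the `unidic` package does not by itself confirm that the separate download step has been run.

```python
from importlib.util import find_spec


def installed_unidic_packages():
    """Return the UniDic package import names found in this environment."""
    candidates = ("unidic_lite", "unidic")
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in candidates if find_spec(name) is not None]


print(installed_unidic_packages())
```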

Dictionary Use

Fugashi is written with the assumption you'll use UniDic to process Japanese, but it supports arbitrary dictionaries.

If you're using a dictionary other than UniDic, you can use the GenericTagger like this:

from fugashi import GenericTagger
tagger = GenericTagger()
text = "麩菓子は、麩を主材料とした日本の菓子。"

# parse can be used as normal
tagger.parse('something')
# features from the dictionary can be accessed by field numbers
for word in tagger(text):
    print(word.surface, word.feature[0])

You can also create a dictionary wrapper to get feature information as a named tuple.

from fugashi import GenericTagger, create_feature_wrapper
# 'alpha beta gamma' are placeholder field names; use your dictionary's fields
CustomFeatures = create_feature_wrapper('CustomFeatures', 'alpha beta gamma')
tagger = GenericTagger(wrapper=CustomFeatures)
text = "麩菓子は、麩を主材料とした日本の菓子。"
for word in tagger.parseToNodeList(text):
    print(word.surface, word.feature.alpha)
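
To make the idea of a feature wrapper more concrete, the sketch below uses only the standard library: it splits a comma-separated feature string and exposes the fields through a named tuple, so the same value can be read by field number or by name. This illustrates the concept and is not fugashi's actual implementation; the helper `parse_features`, the field names, and the sample feature string are all made up.

```python
import csv
from collections import namedtuple

# Hypothetical field names for an imaginary three-field dictionary.
CustomFeatures = namedtuple("CustomFeatures", "alpha beta gamma")


def parse_features(feature_string):
    """Split a CSV feature string and wrap the fields in a named tuple."""
    # csv.reader handles quoted fields that themselves contain commas.
    fields = next(csv.reader([feature_string]))
    return CustomFeatures(*fields)


features = parse_features('noun,common,"a,b"')
# Access by position and by name refer to the same field.
print(features[0], features.alpha, features.gamma)
```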

Alternatives

If you have a problem with Fugashi, feel free to open an issue. However, there are some cases where it might be better to use a different library.

  • If you don't want to deal with installing MeCab at all, try SudachiPy.
  • If you need to work with Korean, try KoNLPy.

License and Copyright Notice

Fugashi is released under the terms of the MIT license. Please copy it far and wide.

Fugashi is a wrapper for MeCab, and Fugashi wheels include MeCab binaries. MeCab is copyrighted free software by Taku Kudo <taku@chasen.org> and Nippon Telegraph and Telephone Corporation, and is redistributed under the BSD License.
