A wrapper for `mecab` that provides a `janome`-like interface.

Project description

wakame

wakame is a wrapper for mecab that provides a janome-like interface.

Usage

import MeCab
from wakame.tokenizer import Tokenizer
from wakame.analyzer import Analyzer
from wakame.charfilter import *
from wakame.tokenfilter import *

text = '和布ちゃんこんにちは'

# Basic usage
tokenizer = Tokenizer()
tokens = tokenizer.tokenize(text)
for token in tokens:
    print(token)

# Wakati-gaki (word segmentation only)
tokens = tokenizer.tokenize(text, wakati=True)
print(tokens)

# Use NEologd as the dictionary
tokenizer = Tokenizer(use_neologd=True)
tokens = tokenizer.tokenize(text)
for token in tokens:
    print(token)

# Use character filters and token filters
char_filters = [RegexReplaceCharFilter('和布', 'wakame')]
token_filters = [POSKeepFilter('名詞'), POSStopFilter(['名詞,接尾'])]
analyzer = Analyzer(tokenizer, char_filters=char_filters, token_filters=token_filters)
tokens = analyzer.analyze(text)
for token in tokens:
    print(token)

# Get the token information as a DataFrame
tokenizer = Tokenizer()
analyzer = Analyzer(tokenizer)
df = analyzer.analyze_with_dataframe(text)
print(df)
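
To make the token and filter steps above more concrete, here is a minimal sketch of how MeCab's text output could be turned into janome-style tokens and then filtered by part of speech. It is not wakame's actual implementation: `SketchToken`, `parse_mecab_output`, `pos_keep`, and `pos_stop` are hypothetical names introduced only for illustration, and the prefix-matching behavior is an assumption modeled on janome's `POSKeepFilter` and `POSStopFilter`. The sample input is hard-coded in MeCab's default ipadic output format so that the sketch runs without MeCab installed.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Sample text in MeCab's default ipadic output format: the surface form,
# a tab, then comma-separated features. Hard-coded so no MeCab is needed.
SAMPLE_MECAB_OUTPUT = (
    "すもも\t名詞,一般,*,*,*,*,すもも,スモモ,スモモ\n"
    "も\t助詞,係助詞,*,*,*,*,も,モ,モ\n"
    "もも\t名詞,一般,*,*,*,*,もも,モモ,モモ\n"
    "EOS\n"
)


@dataclass
class SketchToken:
    """Hypothetical janome-style token (not wakame's real class)."""
    surface: str
    part_of_speech: str  # first four feature fields, comma-joined
    base_form: str


def parse_mecab_output(raw: str) -> List[SketchToken]:
    """Convert MeCab's line-oriented output into token objects."""
    tokens = []
    for line in raw.splitlines():
        if not line or line == "EOS":  # skip blanks and the end marker
            continue
        surface, features = line.split("\t", 1)
        fields = features.split(",")
        # Pad short feature lists (e.g. unknown words) to at least 7 fields.
        fields += ["*"] * (7 - len(fields))
        tokens.append(SketchToken(surface, ",".join(fields[:4]), fields[6]))
    return tokens


def pos_keep(tokens: Iterable[SketchToken], prefixes: List[str]) -> List[SketchToken]:
    """Keep tokens whose part of speech starts with any given prefix."""
    return [t for t in tokens
            if any(t.part_of_speech.startswith(p) for p in prefixes)]


def pos_stop(tokens: Iterable[SketchToken], prefixes: List[str]) -> List[SketchToken]:
    """Drop tokens whose part of speech starts with any given prefix."""
    return [t for t in tokens
            if not any(t.part_of_speech.startswith(p) for p in prefixes)]


tokens = parse_mecab_output(SAMPLE_MECAB_OUTPUT)
nouns = pos_keep(tokens, ["名詞"])
print([t.surface for t in nouns])  # ['すもも', 'もも']
```

Chaining `pos_keep` and `pos_stop` in order mirrors how the `token_filters` list is applied one filter after another in the usage example.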

Installation

Install MeCab (required)

brew install mecab
brew install mecab-ipadic

Install mecab-ipadic-NEologd (optional)

brew install git curl xz
git clone --depth 1 git@github.com:neologd/mecab-ipadic-neologd.git
cd mecab-ipadic-neologd
./bin/install-mecab-ipadic-neologd -n

For details, see the mecab-ipadic-NEologd documentation.

Install mecab-python3 (required)

brew install swig
pip install mecab-python3

Install wakame (required)

pip install wakame
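
After the steps above, a quick check like the following can confirm that each piece is visible from Python. This is only a sketch using the standard library: `check_environment` is a hypothetical helper, and it assumes the `mecab` command is on `PATH` and that mecab-python3 and wakame are importable as the `MeCab` and `wakame` modules, as in the usage example. It does not verify that a dictionary such as mecab-ipadic or NEologd is installed.

```python
import importlib.util
import shutil


def check_environment() -> dict:
    """Report which required components appear to be installed."""
    return {
        # The mecab command-line tool (e.g. from `brew install mecab`)
        "mecab": shutil.which("mecab") is not None,
        # The Python binding provided by the mecab-python3 package
        "MeCab": importlib.util.find_spec("MeCab") is not None,
        # The wakame package itself
        "wakame": importlib.util.find_spec("wakame") is not None,
    }


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name}: {'found' if found else 'missing'}")
```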

Download files

Files for wakame, version 0.3.0
| Filename, size | File type | Python version |
| --- | --- | --- |
| wakame-0.3.0-py3-none-any.whl (7.4 kB) | Wheel | py3 |
| wakame-0.3.0.tar.gz (5.1 kB) | Source | None |
