Feature annotator based on KNP rule files

Desuwa

A pure-Python feature annotator for morphemes and phrases, based on KNP rule files

Quick Start

Desuwa takes Juman++ output as its input.

$ pip install desuwa
$ echo '歌うのは楽しいですわ' | jumanpp | desuwa
+	["&表層:付与", "連体修飾", "用言:動"]
歌う うたう 歌う 動詞 2 * 0 子音動詞ワ行 12 基本形 2 "代表表記:歌う/うたう ドメイン:文化・芸術;レクリエーション"	["タグ単位始", "形態素連結-数詞", "固有修飾", "活用語", "文頭", "文節始", "T連体修飾", "ドメイン:文化・芸術;レクリエーション", "T固有付属", "内容語", "T固有末尾", "自立"]
+	["受けNONE", "外の関係", "形副名詞", "助詞", "T連用", "ハ", "タグ単位受:-1"]
の の の 名詞 6 形式名詞 8 * 0 * 0 NIL	["タグ単位始", "T動連用名詞化前文脈", "形態素連結-数詞", "固有修飾", "形副名詞", "特殊非見出語", "名詞相当語", "T固有付属", "付属", "内容語", "T固有末尾"]
は は は 助詞 9 副助詞 2 * 0 * 0 NIL	["形態素連結-数詞", "固有修飾", "T固有付属", "付属", "T固有末尾"]
+	["&表層:付与", "用言:形", "連体修飾", "助詞"]
楽しい たのしい 楽しい 形容詞 3 * 0 イ形容詞イ段 19 基本形 2 "代表表記:楽しい/たのしい"	["タグ単位始", "形態素連結-数詞", "固有修飾", "活用語", "文節始", "T連体修飾", "T固有付属", "内容語", "T固有末尾", "自立"]
です です です 助動詞 5 * 0 無活用型 26 基本形 2 NIL	["形態素連結-数詞", "固有修飾", "活用語", "T連体修飾", "T固有付属", "付属", "T固有末尾"]
わ わ わ 助詞 9 終助詞 4 * 0 * 0 NIL	["形態素連結-数詞", "固有修飾", "文末", "表現文末", "T固有付属", "付属", "T固有末尾"]
EOS

$ echo '歌うのは楽しいですわ' | jumanpp | desuwa | desuwa --predicate
歌う	歌う/うたう	1	動
楽しいですわ	楽しい/たのしい	1	形

$ echo '歌うのは楽しいですわ' | jumanpp | desuwa --segment
歌う│のは│楽しいですわ
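
Judging from the first example above, the default output puts a tab between each input line and a bracketed feature list: a line starting with + carries the features of a phrase, the lines after it carry the morphemes of that phrase, and EOS ends the sentence. The following is a minimal sketch for reading that output back into Python. It assumes the feature lists are valid JSON arrays, as they appear to be in the example. parse_desuwa_output is a hypothetical helper written for this illustration and is not part of Desuwa.

```python
import json
from typing import Any, Dict, List

# Three annotated lines copied from the example output above, followed by EOS.
SAMPLE = "\n".join([
    '+\t["受けNONE", "外の関係", "形副名詞", "助詞", "T連用", "ハ", "タグ単位受:-1"]',
    'の の の 名詞 6 形式名詞 8 * 0 * 0 NIL\t["タグ単位始", "T動連用名詞化前文脈", '
    '"形態素連結-数詞", "固有修飾", "形副名詞", "特殊非見出語", "名詞相当語", '
    '"T固有付属", "付属", "内容語", "T固有末尾"]',
    'は は は 助詞 9 副助詞 2 * 0 * 0 NIL\t["形態素連結-数詞", "固有修飾", '
    '"T固有付属", "付属", "T固有末尾"]',
    "EOS",
])


def parse_desuwa_output(text: str) -> List[Dict[str, Any]]:
    """Group annotated lines into phrases (hypothetical helper, not a Desuwa API)."""
    phrases: List[Dict[str, Any]] = []
    for line in text.splitlines():
        if not line or line == "EOS":
            continue
        # Assumption: the feature list is the last tab-separated field.
        body, sep, raw_features = line.rpartition("\t")
        if not sep:
            continue  # no feature list on this line
        features = json.loads(raw_features)
        if body == "+":
            # A "+" line opens a new phrase and carries its features.
            phrases.append({"features": features, "morphemes": []})
        else:
            if not phrases:
                # Morpheme seen before any "+" line: keep it in an unlabeled phrase.
                phrases.append({"features": [], "morphemes": []})
            surface = body.split(" ", 1)[0]  # first Juman++ field is the surface form
            phrases[-1]["morphemes"].append({"surface": surface, "features": features})
    return phrases


for phrase in parse_desuwa_output(SAMPLE):
    print("".join(m["surface"] for m in phrase["morphemes"]), phrase["features"])
```

In practice the text would come from the pipeline itself, for example by reading sys.stdin after jumanpp | desuwa.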

Note

Desuwa is currently confirmed to work with the following rule files.

  • mrph_filter.rule
  • mrph_basic.rule
  • bnst_basic.rule

License

Apache License 2.0, except for the rule files in desuwa/knp_rules, which are imported from KNP.


