Tokenizes Japanese documents to enable CRUD operations.

Jadoc: Tokenizes Japanese Documents to Enable CRUD Operations

Installation

Install MeCab

Jadoc requires MeCab. If MeCab is not already installed, install it before installing Jadoc.
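
As a quick check before installing Jadoc, the sketch below looks for the MeCab command-line program on `PATH`. It assumes the executable is named `mecab`, which is the usual name but is not stated in this README. The helper `find_mecab` is a hypothetical name, not part of Jadoc. It only detects the command-line program, so a library-only MeCab install would not be found this way.

```python
import shutil
from typing import Optional


def find_mecab(executable: str = "mecab") -> Optional[str]:
    """Return the full path of the MeCab command, or None if it is not on PATH.

    The default name "mecab" is an assumption about how MeCab was installed.
    """
    # shutil.which searches the directories listed in PATH, as a shell would.
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_mecab()
    if path is None:
        print("MeCab was not found on PATH; install it before installing Jadoc.")
    else:
        print(f"MeCab found at {path}")
```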

Install Jadoc

$ pip install jadoc

Examples

from jadoc.doc import Doc


doc = Doc("本を書きました。")

# print surface forms of the tokens.
surfaces = [word.surface for word in doc.words]
print("/".join(surfaces))  # 本/を/書き/まし/た/。

# print plain text
print(doc.get_text())  # 本を書きました。

# Delete the word at index 3 (まし).
doc.delete(3)  # Neighboring words are re-conjugated as needed.
print(doc.get_text())  # 本を書いた。

# Replace the word at index 2 with 読む.
word = doc.conjugation.tokenize("読む")
# The new word is conjugated, and the surrounding words are adjusted as needed.
doc.update(2, word)
print(doc.get_text())  # 本を読んだ。
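
Because `delete` and `update` take a position, it can help to list each token with its index before editing. The sketch below relies only on the `doc.words` and `word.surface` attributes used in the example above. The helper `indexed_surfaces` is a hypothetical name, not part of Jadoc, and the zero-based indexing is inferred from the example, where `doc.delete(3)` removes まし. A stand-in object replaces a real `Doc` so the snippet runs without MeCab installed.

```python
from types import SimpleNamespace
from typing import Any, List, Tuple


def indexed_surfaces(doc: Any) -> List[Tuple[int, str]]:
    """Pair each word's surface form with its position in doc.words."""
    return [(index, word.surface) for index, word in enumerate(doc.words)]


# Stand-in for Doc("本を書きました。"): it mimics only the `words` and
# `surface` attributes, using the tokenization shown in the README example.
fake_doc = SimpleNamespace(
    words=[SimpleNamespace(surface=s) for s in ["本", "を", "書き", "まし", "た", "。"]]
)

for index, surface in indexed_surfaces(fake_doc):
    print(index, surface)
```

With a real document, passing the `Doc` instance to `indexed_surfaces` in place of `fake_doc` should give the same kind of listing.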
