🎤 vibrato: VIterbi-Based acceleRAted TOkenizer

🐍 python-vibrato 🎤

Vibrato is a fast implementation of tokenization (or morphological analysis) based on the Viterbi algorithm. This is a Python wrapper for Vibrato.

Installation

Install pre-built package from PyPI

Run the following command:

$ pip install vibrato

Build from source

Building from source requires the Rust compiler; install it beforehand by following the Rust documentation. vibrato uses pyproject.toml, so you also need pip version 19 or later. Upgrade pip if necessary:

$ pip install --upgrade pip

After setting up the environment, you can install vibrato as follows:

$ pip install git+https://github.com/daac-tools/python-vibrato

Example Usage

python-vibrato does not include model files. Before tokenizing, follow the Vibrato documentation to download a distributed model or train your own.

Models must be compatible with the Vibrato version that this wrapper is built against, which you can check as follows:

>>> import vibrato
>>> vibrato.VIBRATO_VERSION
'0.5.2'
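
If the model version is known, the comparison can be scripted. The following is a minimal sketch, not part of python-vibrato: `parse_version` and `same_release_series` are hypothetical helpers, and the rule that matching major and minor numbers means "compatible" is an assumption. Consult the Vibrato release notes for the actual compatibility policy.

```python
def parse_version(version):
    """Turn a dotted version string such as '0.5.2' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def same_release_series(runtime_version, model_version):
    """Return True when both versions share the same major and minor number.

    ASSUMPTION: this rule is illustrative only; it is not taken from the
    Vibrato documentation.
    """
    return parse_version(runtime_version)[:2] == parse_version(model_version)[:2]
```

In practice the first argument would be `vibrato.VIBRATO_VERSION`, and the second would be the version stated for the model you downloaded.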

Examples:

>>> import vibrato

>>> with open('tests/data/system.dic', 'rb') as fp:
...     tokenizer = vibrato.Vibrato(fp.read())

>>> tokens = tokenizer.tokenize('社長は火星猫だ')

>>> len(tokens)
5

>>> tokens[0]
Token { surface: "社長", feature: "名詞,普通名詞,一般,*" }

>>> tokens[0].surface()
'社長'

>>> tokens[0].feature()
'名詞,普通名詞,一般,*'

>>> tokens[0].start()
0

>>> tokens[0].end()
2
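
To work with a whole sentence rather than a single token, the accessors shown above can be collected into plain Python data. This is a minimal sketch: `tokens_to_rows` is a hypothetical helper, it relies only on `len()`, indexing, `surface()`, `feature()`, `start()`, and `end()`, and it assumes (as the output above suggests) that `start()` and `end()` are character offsets into the input string.

```python
def tokens_to_rows(tokens, text):
    """Collect surface, split features, and span for every token.

    `tokens` is the result of tokenizer.tokenize(text); any indexable
    sequence of objects with the same four accessor methods also works.
    """
    rows = []
    for i in range(len(tokens)):
        token = tokens[i]
        start, end = token.start(), token.end()
        rows.append({
            "surface": token.surface(),
            # Naive split: the field layout depends on the dictionary, and
            # this does not handle features that contain quoted commas.
            "features": token.feature().split(","),
            "span": (start, end),
            # Sanity check of the character-offset assumption.
            "matches_text": text[start:end] == token.surface(),
        })
    return rows
```

For example, `tokens_to_rows(tokenizer.tokenize(text), text)` yields one dictionary per token, which is convenient for filtering by part of speech or for writing results to JSON.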

Note for distributed models

The distributed models are compressed in zstd format. The API does not decompress them for you, so decompress the data yourself and pass the resulting bytes to vibrato.Vibrato:

>>> import vibrato
>>> import zstandard  # zstandard package in PyPI

>>> dctx = zstandard.ZstdDecompressor()
>>> with open('tests/data/system.dic.zst', 'rb') as fp:
...     with dctx.stream_reader(fp) as dict_reader:
...         tokenizer = vibrato.Vibrato(dict_reader.read())
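
If the same code has to accept both compressed and uncompressed dictionaries, the file can be inspected before it is read. The sketch below is an illustration, not part of python-vibrato: `read_dictionary_bytes` is a hypothetical helper that checks for the zstd frame magic number (the bytes 28 B5 2F FD) and only imports the zstandard package when decompression is needed.

```python
import pathlib

# First four bytes of a zstd frame, as they appear on disk.
ZSTD_MAGIC = b"\x28\xb5\x2f\xfd"


def read_dictionary_bytes(path):
    """Return the raw dictionary bytes, decompressing zstd data if present."""
    with pathlib.Path(path).open("rb") as fp:
        magic = fp.read(4)
        fp.seek(0)
        if magic != ZSTD_MAGIC:
            # Not a zstd frame: treat the file as an uncompressed dictionary.
            return fp.read()
        import zstandard  # zstandard package in PyPI, imported lazily

        dctx = zstandard.ZstdDecompressor()
        with dctx.stream_reader(fp) as dict_reader:
            return dict_reader.read()
```

The returned bytes can then be passed to the constructor, for example `vibrato.Vibrato(read_dictionary_bytes('tests/data/system.dic.zst'))`.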

License

Licensed under either of

at your option.
