ailia Tokenizer

ailia Tokenizer Python API

!! CAUTION !! “ailia” IS NOT OPEN SOURCE SOFTWARE (OSS). As long as the user complies with the conditions stated in the License Document, the Software may be used free of charge, but it is fundamentally paid software.

About ailia Tokenizer

ailia Tokenizer is an NLP tokenizer that can be used from Unity or C++. It provides an API for converting text into tokens (sequences of symbols that an AI model can process) and for converting tokens back into text.
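To make the text-to-tokens round trip concrete, here is a toy sketch in plain Python. It is not the ailia Tokenizer API: the vocabulary, the whitespace splitting rule, and the function names `encode` and `decode` are all hypothetical and chosen only for illustration. Real tokenizers use subword schemes and much larger vocabularies.

```python
# Toy illustration only: this is NOT the ailia Tokenizer API.
# The vocabulary and the function names below are hypothetical.

VOCAB = {"[UNK]": 0, "hello": 1, "world": 2, "tokenizer": 3}
ID_TO_TOKEN = {token_id: token for token, token_id in VOCAB.items()}


def encode(text: str) -> list[int]:
    """Lowercase the text, split on whitespace, and map each word to its ID."""
    return [VOCAB.get(word, VOCAB["[UNK]"]) for word in text.lower().split()]


def decode(ids: list[int]) -> str:
    """Map IDs back to tokens and join them with single spaces."""
    return " ".join(ID_TO_TOKEN[token_id] for token_id in ids)


ids = encode("Hello tokenizer world")
print(ids)          # [1, 3, 2]
print(decode(ids))  # hello tokenizer world
```

Words missing from the vocabulary map to the `[UNK]` ID, so decoding does not always reproduce the original text exactly.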

Traditionally, tokenization has been performed using Hugging Face's Transformers library, typically alongside PyTorch. However, since Transformers runs only in Python, it has not been possible to tokenize from applications on Android or iOS.

ailia Tokenizer solves this problem by performing NLP tokenization directly, without using Transformers. This makes it possible to tokenize on Android and iOS as well.

Because ailia Tokenizer includes MeCab and SentencePiece, it can perform complex tokenization on the device, such as that required for BERT Japanese or Sentence Transformers.

Install from pip

You can install the ailia Tokenizer free evaluation package with the following command.

pip3 install ailia_tokenizer
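After installation, one way to confirm that pip registered the package is to query its metadata with the standard library. This is a minimal sketch using only `importlib.metadata`; the helper name `installed_version` is hypothetical, and the distribution name is assumed to match the one used in the pip command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Distribution name assumed to match the pip command above.
version = installed_version("ailia_tokenizer")
print(version if version else "ailia_tokenizer is not installed")
```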

Install from package

To install ailia Tokenizer from a downloaded package, run the following commands in the package directory.

python3 bootstrap.py
pip3 install .

API specification

https://github.com/ailia-ai/ailia-sdk
