A simple sentencepiece encoder and decoder with no dependencies.
Project description
simple-sentencepiece
A simple sentencepiece encoder and decoder.
Note: This is not a new sentencepiece toolkit. It takes a model produced by Google's sentencepiece as input and either encodes strings to ids or pieces, or decodes ids back to strings. Its advantage is that it has no dependencies (no protobuf), which makes it easier to integrate into a C++ project.
Installation
pip install simple-sentencepiece
Usage
Usage is very similar to sentencepiece: it also provides encode and decode interfaces.
from ssentencepiece import Ssentencepiece
# bpe.vocab comes from a trained BPE model; see Google's sentencepiece for details
ssp = Ssentencepiece("/path/to/bpe.vocab")
# output ids
ids = ssp.encode(["HELLO WORLD", "LOVE AND PIECE"])
# output string pieces
pieces = ssp.encode(["HELLO WORLD", "LOVE AND PIECE"], out_type=str)
# decode
res = ssp.decode(ids)
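To make the relationship between ids, pieces, and the decoded string concrete, here is a minimal sketch that does not use this package at all. The vocabulary, the helper names `pieces_to_ids` and `ids_to_text`, and the hand-written segmentation are all hypothetical; a real vocabulary comes from a trained sentencepiece model, and choosing the pieces for a string is the tokenizer's job. The only convention assumed is sentencepiece's use of "▁" (U+2581) to mark a word boundary.

```python
# Toy vocabulary (hypothetical). A real one is read from a trained model.
PIECES = ["<unk>", "▁HE", "LLO", "▁WOR", "LD"]
PIECE_TO_ID = {piece: idx for idx, piece in enumerate(PIECES)}


def pieces_to_ids(pieces):
    """Map each piece to its id; unknown pieces fall back to id 0 (<unk>)."""
    return [PIECE_TO_ID.get(piece, 0) for piece in pieces]


def ids_to_text(ids):
    """Join the pieces for the given ids and turn '▁' markers back into spaces."""
    joined = "".join(PIECES[i] for i in ids)
    return joined.replace("▁", " ").strip()


# Segmentation is written by hand here; a tokenizer would compute it.
pieces = ["▁HE", "LLO", "▁WOR", "LD"]
ids = pieces_to_ids(pieces)
text = ids_to_text(ids)
```

With this toy vocabulary, `ids` is `[1, 2, 3, 4]` and `text` is `"HELLO WORLD"`, which mirrors what the two `encode` output types and `decode` represent.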
Download files
File details
Details for the file simple-sentencepiece-0.2.tar.gz.
File metadata
- Download URL: simple-sentencepiece-0.2.tar.gz
- Upload date:
- Size: 351.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fc641e23fd404dc8f4587f6f420cc3446d8cc28780f2f506564da36a39ce06fe |
| MD5 | d5afb57fb4bf06ed2d6607b0de9153f4 |
| BLAKE2b-256 | e244af624aecaa44f080b1e45049d0e497cd93abc1612214078953f960509d3e |
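The published digests can be used to check a downloaded file before installing it. The following is a minimal sketch using only Python's standard `hashlib`; the helper names `sha256_of_file` and `matches_published_digest` are illustrative and are not part of this package or of pip.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns empty bytes at end of file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_digest("simple-sentencepiece-0.2.tar.gz", "fc641e23fd404dc8f4587f6f420cc3446d8cc28780f2f506564da36a39ce06fe")` should return `True` for an intact copy of the source distribution, assuming the file sits in the current directory.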
File details
Details for the file simple_sentencepiece-0.2-cp312-cp312-win_amd64.whl.
File metadata
- Download URL: simple_sentencepiece-0.2-cp312-cp312-win_amd64.whl
- Upload date:
- Size: 237.7 kB
- Tags: CPython 3.12, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.9.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4a223f50fee07a47e873a7c0c8d8ffd7fe61871fffeca6ad8db3a75bf6af76d6 |
| MD5 | 1bd34f05cc8012c0f8bc2ad203600c19 |
| BLAKE2b-256 | 4c7d72db48ed6b3fe3e677f3e01303392464b3ed4ca0a0e08636699994b39d32 |
File details
Details for the file simple_sentencepiece-0.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: simple_sentencepiece-0.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 165.1 kB
- Tags: CPython 3.12, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 37c98ca83081f90e5a5d9846cffbad1e2c345c8ab161dcc4c2d4b2f9dcf742ed |
| MD5 | 0dd694dc4c62a32fbeff082820320530 |
| BLAKE2b-256 | 67cfa8c48169d23ff1f8df0b919580b2b1682bc024dd63e04099826f0d7d0b89 |
File details
Details for the file simple_sentencepiece-0.2-cp312-cp312-macosx_10_9_x86_64.whl.
File metadata
- Download URL: simple_sentencepiece-0.2-cp312-cp312-macosx_10_9_x86_64.whl
- Upload date:
- Size: 120.8 kB
- Tags: CPython 3.12, macOS 10.9+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c20fc8cb97db80e0c8088cdaec8cb224b9b83c333dc3fd70a0e7a624c312dd23 |
| MD5 | f92aa5413b5ddaae2d6a71b7d55a5730 |
| BLAKE2b-256 | 12d751c4b302396b978f3eb1f505b07b63325e54511ce12e26096c1e21f4eae8 |
File details
Details for the file simple_sentencepiece-0.2-cp311-cp311-win_amd64.whl.
File metadata
- Download URL: simple_sentencepiece-0.2-cp311-cp311-win_amd64.whl
- Upload date:
- Size: 237.4 kB
- Tags: CPython 3.11, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.9.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f3633afa559030ddabfe2f056aa3a58381004c5f8101bd3f426c732fa1228333 |
| MD5 | 594971eb2465a18d4c26791622095b30 |
| BLAKE2b-256 | 0df69929d26d8c82867841433b181d78e31c9e1d4c72accee6dab0449c835495 |
File details
Details for the file simple_sentencepiece-0.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: simple_sentencepiece-0.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 165.3 kB
- Tags: CPython 3.11, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c40131778a12734e34ed7790902250296659127b67876c741e56e91056c529d7 |
| MD5 | 65816dfe7376dd29435698cfd77e6d8f |
| BLAKE2b-256 | e256ab691fa17dca7a487ea2707552c3c60703119fc4a9c03f6aee7b2207928d |
File details
Details for the file simple_sentencepiece-0.2-cp311-cp311-macosx_10_9_x86_64.whl.
File metadata
- Download URL: simple_sentencepiece-0.2-cp311-cp311-macosx_10_9_x86_64.whl
- Upload date:
- Size: 119.3 kB
- Tags: CPython 3.11, macOS 10.9+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 46e1ca997d7638a4659abc3ca0d38683b02fb6710e1e1989710e36fc1e3630a2 |
| MD5 | 896d214ca00c836ca101826ef041f67f |
| BLAKE2b-256 | de62aeb080bf26e1edf503e58ada378d9c229a3248e48a5635cb88d14b948ba7 |
File details
Details for the file simple_sentencepiece-0.2-cp310-cp310-win_amd64.whl.
File metadata
- Download URL: simple_sentencepiece-0.2-cp310-cp310-win_amd64.whl
- Upload date:
- Size: 237.4 kB
- Tags: CPython 3.10, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.9.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2f88a1d9e19f351338b34fff96a6632a768bd5739ba28d1728cf028e8521eeb1 |
| MD5 | 51279eecd9bf095da5a46911086068e2 |
| BLAKE2b-256 | 763f7c029b952f0926ab99b748d3510da0a26c1d139dc64d80417aec778fc671 |
File details
Details for the file simple_sentencepiece-0.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: simple_sentencepiece-0.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 165.4 kB
- Tags: CPython 3.10, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2c627f734fa3870d6c95eb509fbe0e255dcbf25b04d05ee63178cec6b55a8220 |
| MD5 | 9789ad29f592174bd23e5128b749adbc |
| BLAKE2b-256 | b4f39a03d9878994c34c40bd3278662c4d9bbe00326fcf66b76d2c60a784801f |
File details
Details for the file simple_sentencepiece-0.2-cp310-cp310-macosx_10_9_x86_64.whl.
File metadata
- Download URL: simple_sentencepiece-0.2-cp310-cp310-macosx_10_9_x86_64.whl
- Upload date:
- Size: 119.3 kB
- Tags: CPython 3.10, macOS 10.9+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 66948e6e9760606e3615ce83e30bb693ef115a99bc131019aa6f4354eda34378 |
| MD5 | 7355ffb7ce5c0d2a6c4c076a54b679c4 |
| BLAKE2b-256 | 36246d915766d0c7a97b17ac51345a42b3e8b1b51011ca447b875523819f7e94 |
File details
Details for the file simple_sentencepiece-0.2-cp39-cp39-win_amd64.whl.
File metadata
- Download URL: simple_sentencepiece-0.2-cp39-cp39-win_amd64.whl
- Upload date:
- Size: 237.4 kB
- Tags: CPython 3.9, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.9.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a925ae1b272f3ad3b1b9dc570b7b134e74bf31156f3ddc7d2547a34f77d18d81 |
| MD5 | 066a62e2cd7ff2b6146fe4ffef5f70a4 |
| BLAKE2b-256 | a37e92d58aa6c94463583d5792bc38fccff16f6338916691abe9e321c04658cf |
File details
Details for the file simple_sentencepiece-0.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: simple_sentencepiece-0.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 165.6 kB
- Tags: CPython 3.9, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5720943cc66ab0977f33e28dd9ef28e0b0ed99141aafb1c4ea75bbe04def5cc1 |
| MD5 | 1adcdea51a7a1470b8c0111d89b04d8c |
| BLAKE2b-256 | 4e92d144e2bbdb58a026edb7f0b9b2c4fde2f61c03d3f507eea5dade6682e205 |
File details
Details for the file simple_sentencepiece-0.2-cp39-cp39-macosx_10_9_x86_64.whl.
File metadata
- Download URL: simple_sentencepiece-0.2-cp39-cp39-macosx_10_9_x86_64.whl
- Upload date:
- Size: 119.5 kB
- Tags: CPython 3.9, macOS 10.9+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 55f58f93c708546b7f9a225195d710274e209f134988b3b5104fc25167bfcd68 |
| MD5 | ec26ceb6d073ceb6c101ffda69c76428 |
| BLAKE2b-256 | 67e693ffd6a790a7878cd4b1b9d009eddea9c52687f367394c95b8cf46edc144 |
File details
Details for the file simple_sentencepiece-0.2-cp38-cp38-win_amd64.whl.
File metadata
- Download URL: simple_sentencepiece-0.2-cp38-cp38-win_amd64.whl
- Upload date:
- Size: 237.1 kB
- Tags: CPython 3.8, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.9.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a8a9653eb2ade0df9adb3f6c2f0aac8e99e3b8be6b21e8f81c6fcb53fde6f1a7 |
| MD5 | 2445a5bf43c300c9a4838b0eead68617 |
| BLAKE2b-256 | 929405e41961b7de064ac9182e3cd4a83f21d836f2ecb8757aec2cb6c82374d9 |
File details
Details for the file simple_sentencepiece-0.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: simple_sentencepiece-0.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 165.4 kB
- Tags: CPython 3.8, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0b5e48d8ddea4282a397b1e3ba9300d047afdc4d436108414c7581ed64322ab9 |
| MD5 | 82d61652ccc576b7555254d3eebb9da1 |
| BLAKE2b-256 | 656bd2396ca0170e56311fd9041d5ac897b358a6f4f049d2d29705b73751e4ff |
File details
Details for the file simple_sentencepiece-0.2-cp38-cp38-macosx_10_9_x86_64.whl.
File metadata
- Download URL: simple_sentencepiece-0.2-cp38-cp38-macosx_10_9_x86_64.whl
- Upload date:
- Size: 119.4 kB
- Tags: CPython 3.8, macOS 10.9+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2c722a5ff90a316454f7ca10d1f8d690f82e6413d9d1318c6313c8ca9ed23682 |
| MD5 | 4e9d7a48bf2b6afde8c2c75e776ff123 |
| BLAKE2b-256 | de6caac38630420edd39ee440b022e03003697f065b0efa1af9becd5ffae8867 |
File details
Details for the file simple_sentencepiece-0.2-cp37-cp37m-win_amd64.whl.
File metadata
- Download URL: simple_sentencepiece-0.2-cp37-cp37m-win_amd64.whl
- Upload date:
- Size: 237.6 kB
- Tags: CPython 3.7m, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.9.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6e7a2a240eb51582a10d2379bbc37e570b1d69a3c0d80b62a601cc51d515fd07 |
| MD5 | ce957e6822b99041daadb9f30666de8f |
| BLAKE2b-256 | ace8f6d903cb1328a7f9a524f3d54dbb1bff1c3a520c7e39d51c9ca52d0fd374 |
File details
Details for the file simple_sentencepiece-0.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: simple_sentencepiece-0.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 168.3 kB
- Tags: CPython 3.7m, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 03363ed0f2c726877f53b816e4e4b42dc213354772641ccb9d46000cdd150f6f |
| MD5 | b23fb591f29840ad048e5a607e3b06c6 |
| BLAKE2b-256 | 83e512f659b2de0c3bc775cc34f316cccdc47ee0e0ecfdfe16d0ada8bc744988 |
File details
Details for the file simple_sentencepiece-0.2-cp37-cp37m-macosx_10_9_x86_64.whl.
File metadata
- Download URL: simple_sentencepiece-0.2-cp37-cp37m-macosx_10_9_x86_64.whl
- Upload date:
- Size: 119.1 kB
- Tags: CPython 3.7m, macOS 10.9+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 433095e65ab95a9182b074c63c56495e0e11d3851d6ca577e9fada3e1b66f6eb |
| MD5 | 9904bf55b21d8e265058943265b54e5b |
| BLAKE2b-256 | 1f059b9e231c6fc965291c480ee03601625b8f1476a5abcea6679521753a90f2 |