General tokenizer
Project description
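cutcut: a general-purpose word segmentation tool
Using ALBERT for entity recognition on open-source data, cutcut splits sentences and produces the corresponding word sequences.
Release notes
2021-06-30
- Basic word segmentation is largely complete; support for custom dictionaries and for adding custom words is still to be added.
- The model is packaged as a wheel and installed with pip.
Usage
- Run get_wheel.sh to build the installation file, which is written to the dist directory;
- Install the generated wheel with pip install XXX.whl;
- In Python, load the package with import cutcut;
- Segment text with cutcut.lcut.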
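The last two usage steps can be sketched as a small wrapper. This is only an illustration under stated assumptions: the description above names cutcut.lcut but does not give its signature, so the sketch assumes it takes a single string and returns a list of words. The helper name segment and its lcut parameter are hypothetical and not part of the package.

```python
from typing import Callable, List, Optional


def segment(text: str, lcut: Optional[Callable[[str], List[str]]] = None) -> List[str]:
    """Return the words of `text` as a list.

    `lcut` is a hypothetical injection point so the helper can be tried
    without the model installed; by default cutcut.lcut is used.
    """
    if lcut is None:
        # Assumption: cutcut.lcut accepts one string and returns a list of
        # words. The project description does not document the signature.
        import cutcut  # requires the wheel to be installed with pip

        lcut = cutcut.lcut
    return list(lcut(text))
```

With the wheel installed, segment("some sentence") would delegate to cutcut.lcut; passing another callable such as str.split lets the wrapper be exercised without the model.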
Download files
Source Distribution
cutcut-0.0.2.tar.gz (15.7 MB)
Built Distribution
cutcut-0.0.2-py3-none-any.whl (15.7 MB)
File details
Details for the file cutcut-0.0.2.tar.gz.
File metadata
- Download URL: cutcut-0.0.2.tar.gz
- Upload date:
- Size: 15.7 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.25.1 setuptools/52.0.0.post20210125 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.7.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3586f861589581698b66bc2e94b2da4a17af9aa0e5f98fdcd9642e3859d309f9 |
| MD5 | 8f3bd59cd3a29bb99bf2a37227177320 |
| BLAKE2b-256 | e9a4b53d7371bb617c1d85ad281ea4bbfae811347cf91a5bfd3b3d6b0ae32fa6 |
File details
Details for the file cutcut-0.0.2-py3-none-any.whl.
File metadata
- Download URL: cutcut-0.0.2-py3-none-any.whl
- Upload date:
- Size: 15.7 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.25.1 setuptools/52.0.0.post20210125 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.7.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | eb933ece45b5b2f9ff4b3f31872bc0d9c4d38bd17240becc301749daec0f1460 |
| MD5 | cccef4614638a72c0cf23c70c5e6e976 |
| BLAKE2b-256 | d97d90dddcfc0889533cb9aa88f8a80d6536189f7f0a4b7dbe65ecf097346cc9 |
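The SHA256 digests listed for the two files can be checked against a downloaded copy before installing. The following is a minimal sketch using only the Python standard library; the helper names sha256_of and verify are illustrative, and the expected values are copied from the hash listings for cutcut-0.0.2.tar.gz and cutcut-0.0.2-py3-none-any.whl.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file hash listings for release 0.0.2.
EXPECTED_SHA256 = {
    "cutcut-0.0.2.tar.gz":
        "3586f861589581698b66bc2e94b2da4a17af9aa0e5f98fdcd9642e3859d309f9",
    "cutcut-0.0.2-py3-none-any.whl":
        "eb933ece45b5b2f9ff4b3f31872bc0d9c4d38bd17240becc301749daec0f1460",
}


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path):
    """Return True if the file's digest matches the listed value.

    Raises KeyError if the file name is not one of the two listed files.
    """
    expected = EXPECTED_SHA256[Path(path).name]
    return sha256_of(path) == expected
```

Calling verify on the path of a downloaded file returns False if the file was corrupted or altered in transit, in which case it should not be installed.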