ducut
Project description
Word segmentation for e-commerce queries
Installation
pip install ducut
Usage
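from ducut import DuCut
# Path to your custom resource file
resource_path = '<path to your custom resource file>'
dc = DuCut(resource_path)
# Sample query, roughly "Vans skate shoes white red"
line = '万斯 板鞋白红'
cu = dc.cut_query(line)
# The result exposes the recognized brand, series, color, category, plain words and proper nouns
print(f"brand:{cu.brand},series:{cu.series},color:{cu.color},category:{cu.category},word:{cu.word},proper:{cu.proper}")
# Load a custom dictionary file
dc.add_word_file("<dictionary path>")
# Add a single custom word
dc.add_word('川久保玲')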
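Approach
- Semantic entities: mainly for entity words the system does not yet recognize. Once such a word is added as an intervention, it is always segmented the same way, regardless of the context it appears in.
- Semantic segmentation: specifies how a phrase is segmented in one particular context, without affecting how that phrase is segmented in other contexts.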
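The two kinds of intervention listed above can be illustrated with a toy segmenter. This is only a minimal sketch under stated assumptions, not ducut's implementation: `ToySegmenter`, `add_entity`, and `add_context_rule` are hypothetical names, the base algorithm is plain forward maximum matching, and "context" is simplified to the text immediately to the left of the phrase.

```python
from typing import Dict, List, Set, Tuple


class ToySegmenter:
    """Hypothetical forward-maximum-matching segmenter with two override types."""

    def __init__(self, vocab: Set[str]) -> None:
        self.vocab = set(vocab)
        # Entity overrides: always kept whole, in any context.
        self.entities: Set[str] = set()
        # Context rules: (left context, phrase) -> fixed split of the phrase.
        self.context_rules: Dict[Tuple[str, str], List[str]] = {}

    def add_entity(self, word: str) -> None:
        self.entities.add(word)

    def add_context_rule(self, left: str, phrase: str, pieces: List[str]) -> None:
        if "".join(pieces) != phrase:
            raise ValueError("pieces must concatenate to the phrase")
        self.context_rules[(left, phrase)] = list(pieces)

    @staticmethod
    def _longest(words: Set[str], text: str, i: int) -> str:
        """Return the longest word in `words` that starts at text[i], or ''."""
        best = ""
        for w in words:
            if len(w) > len(best) and text.startswith(w, i):
                best = w
        return best

    def cut(self, text: str) -> List[str]:
        tokens: List[str] = []
        i = 0
        while i < len(text):
            if text[i].isspace():
                i += 1
                continue
            # 1. A context rule fires only if the phrase starts here AND the
            #    text just before it ends with the rule's left context.
            rule = next(
                (pieces
                 for (left, phrase), pieces in self.context_rules.items()
                 if text.startswith(phrase, i) and text[:i].endswith(left)),
                None,
            )
            if rule is not None:
                tokens.extend(rule)
                i += sum(len(p) for p in rule)
                continue
            # 2. Entity override, 3. vocabulary match, 4. single character.
            word = (self._longest(self.entities, text, i)
                    or self._longest(self.vocab, text, i)
                    or text[i])
            tokens.append(word)
            i += len(word)
        return tokens


seg = ToySegmenter({"一加", "手机", "买"})

# Entity override: the unknown word is kept whole wherever it appears.
seg.add_entity("川久保玲")
print(seg.cut("川久保玲 手机"))   # ['川久保玲', '手机']

# Context rule: split "一加一" character by character only after "买".
seg.add_context_rule("买", "一加一", ["一", "加", "一"])
print(seg.cut("买一加一"))        # ['买', '一', '加', '一']
print(seg.cut("一加手机"))        # ['一加', '手机'], other contexts are unaffected
```

In this sketch the entity override is context-independent by construction, while the context rule is keyed on both the phrase and its left neighbor, so the same characters can still be segmented differently elsewhere.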
Download files
Source Distribution
ducut-0.0.3.tar.gz (6.0 kB)
File details
Details for the file ducut-0.0.3.tar.gz.
File metadata
- Download URL: ducut-0.0.3.tar.gz
- Upload date:
- Size: 6.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/46.0.0.post20200309 requests-toolbelt/0.9.1 tqdm/4.42.1 CPython/3.7.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | 703d9b4e3ce3f78c91317a3aec4918280fb82a96d23d1c9904761080634c78a8
MD5 | 70776800aae24f8f5ab058d1b79c8eaa
BLAKE2b-256 | cb7cd88037a6b3fac036e8c3503961ad49bb757524e92376dc1c1813fa0adf6b