ducut

E-commerce word segmentation

Installation

pip install ducut

Usage

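from ducut import DuCut

# Path to your custom resource file
resource_path = '<path to custom resource file>'
dc = DuCut(resource_path)

# Example query, roughly "Vans skate shoes white red"
line = '万斯 板鞋白红'
cu = dc.cut_query(line)
print(f"brand:{cu.brand},series:{cu.series},color:{cu.color},category:{cu.category},word:{cu.word},proper:{cu.proper}")

# Load a custom dictionary file
dc.add_word_file("<dictionary path>")
# Add a single custom word
dc.add_word('川久保玲')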

Approach

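  • Semantic entities: mainly used for entity words that the system does not yet recognize. Once such an intervention is added, the word is always segmented the same way, regardless of its surrounding context.
  • Semantic segmentation: used to specify how a phrase is segmented in a particular context, without affecting how that phrase is segmented in other contexts.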
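
The difference between the two kinds of intervention can be illustrated with a small, self-contained sketch. This is not ducut's implementation: `ToySegmenter`, `add_entity`, and `add_context_rule` are hypothetical names invented for this example, and the fallback behaviour (unmatched characters are kept together as one token) is an assumption chosen only to make the contrast visible.

```python
from typing import Dict, Iterable, List, Optional, Set


class ToySegmenter:
    """Illustrative only; all names here are hypothetical, not ducut's API."""

    def __init__(self) -> None:
        # Semantic entities: always emitted as a single token, in any context.
        self.entities: Set[str] = set()
        # Semantic segmentation rules: context string -> forced token list.
        self.context_rules: Dict[str, List[str]] = {}

    def add_entity(self, word: str) -> None:
        self.entities.add(word)

    def add_context_rule(self, context: str, tokens: List[str]) -> None:
        # The forced tokens must spell out the context exactly.
        if "".join(tokens) != context:
            raise ValueError("tokens must concatenate to the context string")
        self.context_rules[context] = list(tokens)

    @staticmethod
    def _longest_prefix(text: str, pos: int, candidates: Iterable[str]) -> Optional[str]:
        # Return the longest candidate that matches text starting at pos.
        for cand in sorted(candidates, key=len, reverse=True):
            if text.startswith(cand, pos):
                return cand
        return None

    def cut(self, text: str) -> List[str]:
        tokens: List[str] = []
        buffer = ""  # characters matched by neither rules nor entities
        i = 0
        while i < len(text):
            if text[i].isspace():
                if buffer:
                    tokens.append(buffer)
                    buffer = ""
                i += 1
                continue
            # Context rules take priority over entity matches.
            ctx = self._longest_prefix(text, i, self.context_rules)
            ent = None if ctx else self._longest_prefix(text, i, self.entities)
            if ctx or ent:
                if buffer:
                    tokens.append(buffer)
                    buffer = ""
                if ctx:
                    tokens.extend(self.context_rules[ctx])
                    i += len(ctx)
                else:
                    tokens.append(ent)
                    i += len(ent)
                continue
            buffer += text[i]
            i += 1
        if buffer:
            tokens.append(buffer)
        return tokens


seg = ToySegmenter()
seg.add_entity("川久保玲")  # entity: one token everywhere
seg.add_context_rule("板鞋白红", ["板鞋", "白", "红"])  # split only in this context

print(seg.cut("川久保玲 板鞋白红"))  # ['川久保玲', '板鞋', '白', '红']
print(seg.cut("白红川久保玲"))      # ['白红', '川久保玲']
```

In the second call, 白红 appears outside the context 板鞋白红, so the rule does not fire and the phrase stays whole, while the entity 川久保玲 is segmented identically in both queries.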

Source distribution: ducut-0.0.3.tar.gz (6.0 kB)

