Named Entity Segmentation

Introduction

This project is a library for splitting strings into token streams. Command-line example: neseg -n 中国北京市联想科技有限公司 -d dict

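Features

  • Parses strings into tokens;
  • Supports token streams;
  • A parser can do mechanical segmentation against custom dictionaries, with a separate dictionary for each token;
  • A parser can also be a regular expression;
  • Segmentation runs in the forward or the reverse direction; both start from the beginning;
  • Produces tuples of the token name and the parsed string; whatever is left over at the end is grouped together;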
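
The dictionary-based forward segmentation described above can be sketched as follows. This is a minimal illustration, not the package's own code: the function name forward_segment, the token names, the dictionary contents, and the "rest" label are all assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple


def forward_segment(
    text: str,
    token_dicts: Dict[str, Set[str]],
    rest_name: str = "rest",  # hypothetical label for the leftover text
) -> List[Tuple[str, str]]:
    """Consume `text` from the left, one token type at a time.

    `token_dicts` maps each token name to its own dictionary; the
    insertion order of the mapping defines the token stream.
    """
    result: List[Tuple[str, str]] = []
    pos = 0
    for name, words in token_dicts.items():
        max_len = max((len(w) for w in words), default=0)
        # Try the longest candidate first, then shrink it.
        for size in range(min(max_len, len(text) - pos), 0, -1):
            candidate = text[pos:pos + size]
            if candidate in words:
                result.append((name, candidate))
                pos += size
                break
    # Whatever was not consumed is returned as one final group.
    if pos < len(text):
        result.append((rest_name, text[pos:]))
    return result


# Hypothetical dictionaries, one per token.
dicts = {
    "country": {"中国"},
    "city": {"北京", "北京市"},
    "org": {"联想", "联想科技"},
}
print(forward_segment("中国北京市联想科技有限公司", dicts))
# [('country', '中国'), ('city', '北京市'), ('org', '联想科技'), ('rest', '有限公司')]
```

Trying the longest candidate first is what makes "北京市" win over "北京" here.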

Use cases

  • Parsing names of all kinds, for example segmenting and labeling Chinese organization names, drug names, and addresses;

TODO

  • Model the design on re.Scanner;
  • The implementation could use generators (yield);
  • The program returns a list of tuples;
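
As a rough sketch of these ideas: Python's re module ships an undocumented Scanner class whose scan() method returns a list of results plus the unmatched remainder, which lines up with the "list of tuples plus leftover" output described earlier. The rules, token names, and helper functions below are assumptions for illustration, not the package's design. Note that this sketch tries every rule at every position, so it does not enforce a fixed token order.

```python
import re
from typing import Iterator, List, Tuple

# Hypothetical rules: (token name, regular expression).
# Capturing groups are avoided because re.Scanner relies on group
# numbering internally.
RULES: List[Tuple[str, str]] = [
    ("country", r"中国"),
    ("city", r"北京市?"),
    ("org", r"联想科技|联想"),
]


def scan_with_scanner(text: str) -> Tuple[List[Tuple[str, str]], str]:
    """Tokenize with re.Scanner; returns (tokens, unmatched remainder)."""
    def action(name: str):
        # re.Scanner calls each action as action(scanner, matched_text).
        return lambda _scanner, matched: (name, matched)

    scanner = re.Scanner([(pattern, action(name)) for name, pattern in RULES])
    return scanner.scan(text)


def iter_tokens(text: str, rest_name: str = "rest") -> Iterator[Tuple[str, str]]:
    """Generator variant: yield (name, string) pairs one at a time."""
    compiled = [(name, re.compile(pattern)) for name, pattern in RULES]
    pos = 0
    while pos < len(text):
        for name, pattern in compiled:
            match = pattern.match(text, pos)
            if match and match.end() > pos:
                yield name, match.group()
                pos = match.end()
                break
        else:
            break  # no rule matched at this position
    if pos < len(text):
        yield rest_name, text[pos:]


tokens, remainder = scan_with_scanner("中国北京市联想科技有限公司")
print(tokens, remainder)
print(list(iter_tokens("中国北京市联想科技有限公司")))
```

Wrapping the generator in list() gives the list-of-tuples return value, while callers that only need the first few tokens can stop iterating early.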

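Appendix - Source file descriptions

neseg
    /lib
        FMM.py  forward word segmentation
        RMM.py  reverse word segmentation
    seg.py      
    main.py   main program: no GUI, command-line arguments
changelog.md    software changelog
readme.md       usage and installation guide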
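
FMM and RMM conventionally stand for forward and reverse maximum matching, and the file names above are assumed to refer to those techniques. The sketch below shows the textbook reverse variant, which scans from the end of the string and falls back to single characters when nothing in the dictionary matches. It is not taken from RMM.py; the function name and the sample dictionary are hypothetical.

```python
from typing import List, Set


def rmm_segment(text: str, words: Set[str]) -> List[str]:
    """Textbook reverse maximum matching over a single dictionary."""
    max_len = max((len(w) for w in words), default=1)
    pieces: List[str] = []
    end = len(text)
    while end > 0:
        # Try the longest suffix ending at `end` first, then shrink it.
        for size in range(min(max_len, end), 0, -1):
            candidate = text[end - size:end]
            # A single character is always accepted as a fallback.
            if size == 1 or candidate in words:
                pieces.append(candidate)
                end -= size
                break
    pieces.reverse()  # pieces were collected right to left
    return pieces


# Hypothetical dictionary for illustration.
words = {"研究", "研究生", "生命", "起源"}
print(rmm_segment("研究生命起源", words))
# ['研究', '生命', '起源']
```

With the same dictionary, a forward scan would take "研究生" first and leave "命" as a stray character, which is why having both directions available can be useful.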

Download files

Source Distribution

neseg-0.7.2.tar.gz (4.9 kB)

Uploaded Source

File details

Details for the file neseg-0.7.2.tar.gz.

File metadata

  • Download URL: neseg-0.7.2.tar.gz
  • Upload date:
  • Size: 4.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.8.2

File hashes

Hashes for neseg-0.7.2.tar.gz
Algorithm Hash digest
SHA256 ae7f4b5bce95e431e96d1a1f114c67a4cfad9e87af180290f3514dffd759f6c6
MD5 67de008eb6fc5be2f9c1e4b2a5f64f73
BLAKE2b-256 c16182aed97ab2820feca405170acdfaa5f0db9938afa1ff267911cf7440e825
