
A library for Chinese text preprocessing

Project Description

Chinese text preprocessing

cnprep can extract numbers, email addresses, website URLs, emoji, and tex from a message, and can delete spaces and punctuation.
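As a rough illustration of what "deleting spaces and punctuation" means for mixed Chinese and ASCII text, here is a minimal standalone sketch. It does not use cnprep and is not its implementation; the function name `strip_spaces_and_punctuation` is hypothetical, and the sketch assumes that "punctuation" means any character in a Unicode punctuation category.

```python
import unicodedata


def strip_spaces_and_punctuation(text):
    """Drop whitespace and Unicode punctuation (categories starting with 'P').

    Hypothetical helper for illustration only; cnprep may define
    punctuation differently.
    """
    return ''.join(
        ch for ch in text
        if not ch.isspace() and not unicodedata.category(ch).startswith('P')
    )


print(strip_spaces_and_punctuation('你好， 世界！'))
```

Both full-width marks such as "，" and ASCII marks such as "," fall into the Unicode punctuation categories, so a single check covers the two.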

Install

pip install cnprep

Usage

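from cnprep import Extractor

# args selects what to extract; limit is the parameter for get_number (blur)
ext = Extractor(args=['email', 'number'], limit=5)
# message is the text to process (placeholder sample input)
message = 'Contact: someone@example.com'
ext.extract(message)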
args: the options to extract, given as a list or as a comma-separated string
    e.g. ['email', 'telephone'] or 'email, telephone'
    email
    telephone
    web
    QQ
    tex
    wechat
    message (without punctuation)
    blur (Ⅰ①壹...)
limit: parameter for get_number (blur)
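The two accepted forms of args (a list or a comma-separated string) name the same options. The sketch below shows one way the two forms could be reduced to the same list. It is an assumption for illustration, not cnprep's own code, and `normalize_args` is a hypothetical name.

```python
def normalize_args(args):
    """Return a list of option names from a list or a comma-separated string.

    Hypothetical helper; cnprep's internal handling of args may differ.
    """
    if isinstance(args, str):
        # 'email, telephone' -> ['email', 'telephone']
        return [name.strip() for name in args.split(',') if name.strip()]
    return list(args)


print(normalize_args('email, telephone'))
print(normalize_args(['email', 'telephone']))
```

Under this reading, whitespace around the commas does not matter, so 'email,telephone' and 'email, telephone' select the same options.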

You can also call ext.reset_param() to reset the parameters.

Attention

The URL extractor supports ASCII URLs only.
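To make the limitation concrete, the sketch below uses a simple ASCII-only URL pattern. The pattern and the name `find_ascii_urls` are assumptions chosen for illustration; cnprep's actual pattern may differ. A URL whose host contains non-ASCII characters is not matched.

```python
import re

# Hypothetical ASCII-only URL pattern (not taken from cnprep).
ASCII_URL = re.compile(r"https?://[A-Za-z0-9./_~%+#?&=:-]+")


def find_ascii_urls(text):
    """Return every substring that matches the ASCII-only URL pattern."""
    return ASCII_URL.findall(text)


text = 'See http://example.com/docs and http://例子.测试/页面 for details'
print(find_ascii_urls(text))
```

If your input contains internationalized domain names, one option is to convert them to their ASCII (Punycode) form before extraction.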

Release History

0.1.10 (this version)
0.1.9
0.1.8
0.1.7
0.1.6
0.1.5
0.1.4
0.1.3
0.1.2
0.1.1
0.1.0
0.0.5
0.0.4
0.0.3
0.0.2

Download Files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

File Name: cnprep-0.1.10.tar.gz (6.4 kB)
File Type: Source
Upload Date: Oct 10, 2016
