simplechinese

Chinese text processing, representation, and visualization.

These details have not been verified by PyPI

Project links

Homepage

Development Status
- 2 - Pre-Alpha
Intended Audience
- Developers
License
- OSI Approved :: MIT License
Natural Language
- English
Programming Language

Project description

SimpleChinese

Chinese text processing, representation, and visualization.

Free software: MIT license
Documentation: https://simplechinese.readthedocs.io

Features

Read the data from a CSV file.

import pandas as pd
df = pd.read_csv("test.csv")

https://github.com/chenmingxiang110/SimpleChinese/raw/master/pics/raw.png

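Clean the data.

import simplechinese as sc  # module name and alias assumed from the package name and the sc. prefix
sc.clean(df)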

https://github.com/chenmingxiang110/SimpleChinese/raw/master/pics/clean.png

The clean function applies the following steps:

fillna(): Fill the missing values (N/A) in a pandas.DataFrame with empty strings.

toLower(): Convert alphabetic characters to lowercase.

remove_punctuations(): Remove all punctuation from a string or a pandas.DataFrame.

remove_space(): Remove all spaces from a string or a pandas.DataFrame.
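The four steps listed above can be illustrated on a single string with the standard library alone. This is a minimal sketch, not the package's implementation: the function name clean_text is hypothetical, and treating every Unicode "P" category character as punctuation is an assumption about what remove_punctuations() covers.

```python
import unicodedata

def clean_text(value):
    """Apply the four documented cleaning steps to one value (hypothetical helper)."""
    # fillna: treat None or NaN (the only value unequal to itself) as an empty string.
    if value is None or value != value:
        return ""
    text = str(value)
    # toLower: lowercase any alphabetic characters; Chinese characters are unaffected.
    text = text.lower()
    # remove_punctuations: drop characters in a Unicode punctuation category,
    # which covers both ASCII and full-width Chinese punctuation (assumption).
    text = "".join(ch for ch in text if not unicodedata.category(ch).startswith("P"))
    # remove_space: drop all whitespace, including the full-width space U+3000.
    return "".join(text.split())
```

Symbols such as "$" or "+" fall under Unicode symbol categories rather than punctuation, so this sketch leaves them in place; the package may behave differently.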

Extract words from the data.

sc.extract_words(sc.clean(df))

https://github.com/chenmingxiang110/SimpleChinese/raw/master/pics/extract_words.png

Vectorization

sc.pca(sc.tfidf(sc.clean(df).iloc[:,0]))

https://github.com/chenmingxiang110/SimpleChinese/raw/master/pics/vectorization.png
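To give a sense of what a TF-IDF step produces, here is a minimal standard-library sketch. It does not reflect how sc.tfidf or sc.pca are implemented, which this page does not show: the function below is hypothetical, it tokenizes by single character rather than by segmented words, it uses a smoothed IDF formula chosen for illustration, and it omits the PCA projection entirely.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return (vocabulary, matrix) of TF-IDF weights over character tokens."""
    # Tokenize by character: a crude stand-in for real Chinese word segmentation.
    tokenized = [list(doc) for doc in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each token.
    doc_freq = Counter(tok for doc in tokenized for tok in set(doc))
    # Smoothed inverse document frequency; rarer tokens get larger weights.
    idf = {tok: math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0 for tok in vocab}
    matrix = []
    for doc in tokenized:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid dividing by zero for empty documents
        matrix.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, matrix

vocab, matrix = tfidf(["我爱北京", "我爱上海"])
```

Each row of matrix is one document and each column one token from vocab. Tokens shared by both documents, such as "我", receive a lower weight than tokens unique to one document, such as "北".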

Word cloud

sc.wordcloud(sc.clean(df).iloc[:,0], font_path="yahei.ttc")

https://github.com/chenmingxiang110/SimpleChinese/raw/master/pics/wordcloud.png

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

History

0.1.0 (2020-07-10)

First release on PyPI.

Release history

0.2.15 (Jul 12, 2022)
0.2.14 (Nov 9, 2021)
0.2.12 (Nov 9, 2021)
0.2.11 (Aug 22, 2021)
0.2.10 (Jul 1, 2021)
0.2.9 (Jul 1, 2021)
0.2.8 (Jun 23, 2021)
0.2.7 (Jun 22, 2021)
0.2.6 (Jun 22, 2021)
0.2.1 (Jun 21, 2021)
0.2.0 (Jun 21, 2021)
0.1.0 (Jul 10, 2020), this version

Download files


Source Distribution

simplechinese-0.1.0.tar.gz (13.6 kB)

Uploaded Jul 10, 2020 Source

File details

Details for the file simplechinese-0.1.0.tar.gz.

File metadata

Download URL: simplechinese-0.1.0.tar.gz
Upload date: Jul 10, 2020
Size: 13.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/45.1.0.post20200127 requests-toolbelt/0.9.1 tqdm/4.42.0 CPython/3.7.6

File hashes

Hashes for simplechinese-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`6ed60d5cdc66e8d151167a13fd0e7aace19edebc48d7183bfddfb548b1cc3aca`
MD5	`7bd57294213a726234447896878238c2`
BLAKE2b-256	`f0ef7b0580d485a556a5c6549e22abfa5f053305dcc137d613b326fdfa13e41a`
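A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using Python's hashlib; the helper names sha256_of and matches_published_digest are hypothetical, and the local file path is an assumption about where the archive was saved.

```python
import hashlib

# SHA256 digest published for simplechinese-0.1.0.tar.gz in the table above.
EXPECTED_SHA256 = "6ed60d5cdc66e8d151167a13fd0e7aace19edebc48d7183bfddfb548b1cc3aca"

def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published_digest(path, expected_hex=EXPECTED_SHA256):
    """Compare a file's digest with the expected one, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, matches_published_digest("simplechinese-0.1.0.tar.gz") should return True only if the file in the current directory is byte-for-byte identical to the published archive.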
