A Python package to extract Hindi characters.

Project description


A command-line tool for pre-processing and cleaning Hindi datasets. The package can:

  • reduce a given file to Hindi characters only
  • split paragraphs into sentences
  • remove punctuation from the dataset (if required)
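
The three steps above can be illustrated with a minimal Python sketch. This is not the package's own code: the function names `keep_hindi`, `split_sentences` and `remove_punctuation` are hypothetical, and the sketch assumes that "Hindi characters" means the Unicode Devanagari block (U+0900 to U+097F) and that sentences end with a danda (।) or double danda (॥).

```python
import re

# Assumption: "Hindi characters" = the Unicode Devanagari block (U+0900-U+097F).
# The danda (U+0964) and double danda (U+0965) are part of that block.
NOT_DEVANAGARI = re.compile(r"[^\u0900-\u097F\s]")
SENTENCE = re.compile(r"[^\u0964\u0965]+[\u0964\u0965]?")
DANDAS = re.compile(r"[\u0964\u0965]")


def keep_hindi(text: str) -> str:
    """Drop everything except Devanagari characters and whitespace."""
    cleaned = NOT_DEVANAGARI.sub("", text)
    # Collapse the gaps left behind by removed characters (this also
    # flattens line breaks, which is acceptable for this sketch).
    return re.sub(r"\s+", " ", cleaned).strip()


def split_sentences(paragraph: str) -> list[str]:
    """Split a paragraph after each danda or double danda."""
    parts = (match.strip() for match in SENTENCE.findall(paragraph))
    return [part for part in parts if part]


def remove_punctuation(sentence: str) -> str:
    """Remove dandas; ASCII punctuation is already gone after keep_hindi."""
    return DANDAS.sub("", sentence).strip()


if __name__ == "__main__":
    raw = "नमस्ते world! यह 1 परीक्षण है। दूसरा वाक्य।"
    for sentence in split_sentences(keep_hindi(raw)):
        print(remove_punctuation(sentence))
```

Note that this sketch also discards ASCII punctuation such as "?" and ",", which Hindi text commonly uses; a real cleaner may want to whitelist those when punctuation is kept.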


extract -l -p <y/yes (to keep punctuation)>

Download files

Source Distribution

textcleaner_hi-1.0.0.tar.gz (2.5 kB)

Uploaded source

Built Distribution

textcleaner_hi-1.0.0-py3-none-any.whl (3.9 kB)

Uploaded py3
