Natural Language Processing (NLP) library for Urdu language.
Urduhack: NLP library for ( 🇵🇰 ) Urdu language
Urduhack is an NLP library for the Urdu language. It comes with many batteries-included features to help you process Urdu data in the easiest way possible.
🔥 Features Support
- Normalization
  - Arabic and Urdu Unicode Redundancy Problem
  - Character Normalization
  - Combined Characters Normalization
  - Diacritics Removal
  - Spaces Before & After Digits
  - Spaces After Punctuation
  - Joined Words Fix
- Tokenization
  - Sentence Tokenization
  - Word Tokenization
- Data Pre-processing
  - Handles all kinds of numbers, emails, currencies, URLs, etc.
- Tasks
  - Sentiment analysis
  - Sentence classification
  - Document classification
  - Named entity recognition
  - Image to text
  - Speech to text
- Datasets
  - IMDB Urdu movie reviews dataset
  - Handwritten digits datasets
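To make the normalization features above more concrete, here is a minimal, standard-library-only sketch of four of the listed steps. It is not Urduhack's implementation or API: the function names, the three-entry character mapping, and the diacritic and punctuation ranges are all assumptions chosen for illustration.

```python
import re

# Illustrative mapping only (an assumption, not Urduhack's actual table):
# Arabic code points that resemble, but differ from, the usual Urdu letters.
ARABIC_TO_URDU = str.maketrans({
    "\u0643": "\u06A9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0647": "\u06C1",  # ARABIC LETTER HEH -> ARABIC LETTER HEH GOAL
})

# Assumed diacritic set: fathatan..sukun plus superscript alef.
DIACRITICS = re.compile(r"[\u064B-\u0652\u0670]")

# Assumed punctuation set: Urdu full stop, comma and question mark,
# matched only when followed directly by a non-space character.
PUNCTUATION = re.compile(r"([\u06D4\u060C\u061F])(?=\S)")


def map_arabic_to_urdu(text):
    """Replace look-alike Arabic letters with their Urdu counterparts."""
    return text.translate(ARABIC_TO_URDU)


def remove_diacritics(text):
    """Strip the diacritics listed in DIACRITICS."""
    return DIACRITICS.sub("", text)


def space_digits(text):
    """Insert a space wherever a letter touches a digit."""
    text = re.sub(r"([^\W\d_])(\d)", r"\1 \2", text)
    return re.sub(r"(\d)([^\W\d_])", r"\1 \2", text)


def space_after_punctuation(text):
    """Ensure punctuation is followed by a space when more text follows."""
    return PUNCTUATION.sub(r"\1 ", text)


def normalize_sketch(text):
    """Apply the four illustrative steps in sequence."""
    for step in (map_arabic_to_urdu, remove_diacritics,
                 space_digits, space_after_punctuation):
        text = step(text)
    return text


if __name__ == "__main__":
    print(normalize_sketch("سال2020میں"))
```

For real work, prefer the library's own functions as described in its API reference; this sketch only shows the kind of transformation each feature name refers to.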
🛠 Installation
Urduhack officially supports Python 3.6–3.7, and runs great on PyPy.
$ pip install urduhack
🔗 Documentation
Fantastic documentation is available at https://urduhack.readthedocs.io/
Documentation | |
---|---|
Installation | How to install Urduhack and download models |
Quickstart | New to Urduhack? Here's everything you need to know! |
API Reference | The detailed reference for Urduhack's API. |
How to Contribute
- Check for open issues or open a fresh issue to start a discussion around a feature idea or a bug. There is a Contributor Friendly tag for issues that should be ideal for people who are not very familiar with the codebase yet.
- Write a test which shows that the bug was fixed or that the feature works as expected.
- Send a pull request and bug the maintainer until it gets merged and published. :)
👍 Contributors
Special thanks to everyone who contributed to getting Urduhack to its current state.
Backers 
Thank you to all our backers! 🙏 [Become a backer]
Sponsors 
Support this project by becoming a sponsor. Your logo will show up here with a link to your website. [Become a sponsor]
📝 Copyright and license
Code released under the MIT License.
File details
Details for the file urduhack-0.3.2.tar.gz.
File metadata
- Download URL: urduhack-0.3.2.tar.gz
- Upload date:
- Size: 71.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.7.1
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 66147eb1eb602cf0d63bf832472983f1bd514b58b5ae61b0cee3aa1f12f02631 |
MD5 | 74765b66562f57f7148abcfde2cf1415 |
BLAKE2b-256 | 045b7dcba46f9afb85d9761073192d488cb754c8080ab1a3fe7d6bcfe8bd9b31 |
File details
Details for the file urduhack-0.3.2-py3-none-any.whl.
File metadata
- Download URL: urduhack-0.3.2-py3-none-any.whl
- Upload date:
- Size: 81.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.7.1
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 444596e1a171dd7e1edc37d3ca24fb5b304589a5b8387426f2efaf03684e2165 |
MD5 | f77c8813490c68f3908d51497a39c8a6 |
BLAKE2b-256 | 6ca41601461a006ef7518d971fb05465260361c5254c8741f1c180a32096e636 |
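The SHA256 digests listed above can be used to check that a downloaded file was not corrupted or altered. The following is a minimal sketch using Python's standard `hashlib`; the helper names `sha256_of_file` and `matches_published_digest` are hypothetical, and the local file path is assumed to be wherever you saved the download.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, calling `matches_published_digest("urduhack-0.3.2.tar.gz", "66147eb1eb602cf0d63bf832472983f1bd514b58b5ae61b0cee3aa1f12f02631")` should return `True` for an intact copy of the source distribution.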