Natural Language Processing (NLP) library for the Urdu language.
Urduhack: A Python NLP library for the Urdu language
Urduhack is an NLP library for the Urdu language. It comes with many batteries-included features to help you process Urdu data as easily as possible.
Note: A stable release, v1.0.0, is coming soon with many new models and a new API.
- Academic users: Easier experimentation to prove their hypotheses without coding from scratch.
- NLP beginners: Learn how to build an NLP project with production-level code quality.
- NLP developers: Build a production-level application within minutes.
🔥 Features Support
- Arabic and Urdu Unicode Redundancy Problem
- Character Normalization
- Combined Characters Normalization
- Diacritics Removal
- Spaces Before & After Digits
- Spaces After Punctuation
- Joined Words Fix
- Sentence Tokenization
- Words Tokenization
- Data Pre-processing
- Handles all kinds of numbers, emails, currencies, URLs, etc.
- Sentiment analysis
- Sentence classification
- Document classification
- Named entity recognition
- Image to text
- Speech to text
- IMDB Urdu movies review dataset
- Handwritten digits datasets
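A few of the text-cleaning features listed above can be illustrated with a small standard-library sketch. This is not Urduhack's implementation or API: the helper names (`map_arabic_to_urdu`, `strip_diacritics`, `space_around_digits`, `split_sentences`) and the two-entry character table are assumptions made for illustration, and the library itself handles many more cases.

```python
import re

# Hypothetical, deliberately tiny table: Arabic code points that look like
# Urdu letters, mapped to the code points usually preferred in Urdu text.
ARABIC_TO_URDU = str.maketrans({
    "\u0643": "\u06A9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
})

# Common Arabic-script diacritics: FATHATAN..SUKUN and SUPERSCRIPT ALEF.
DIACRITICS = re.compile(r"[\u064B-\u0652\u0670]")

# Sentence enders: ARABIC FULL STOP, ARABIC QUESTION MARK, exclamation mark.
SENTENCE = re.compile(r"[^\u06D4\u061F!]+[\u06D4\u061F!]?")


def map_arabic_to_urdu(text):
    """Replace look-alike Arabic letters with their Urdu code points."""
    return text.translate(ARABIC_TO_URDU)


def strip_diacritics(text):
    """Remove the diacritic marks matched by DIACRITICS."""
    return DIACRITICS.sub("", text)


def space_around_digits(text):
    """Insert a space wherever a letter touches a digit, on either side."""
    text = re.sub(r"([^\W\d_])(\d)", r"\1 \2", text)
    return re.sub(r"(\d)([^\W\d_])", r"\1 \2", text)


def split_sentences(text):
    """Split on sentence enders, keeping each ender with its sentence."""
    return [part.strip() for part in SENTENCE.findall(text) if part.strip()]
```

Each helper takes and returns plain `str` values, so they can be chained in any order. Splitting on punctuation alone is only a rough stand-in for real sentence tokenization.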
Urduhack officially supports Python 3.6–3.7, and runs great on PyPy.
Install with the CPU version of TensorFlow:
$ pip install urduhack[tf]
Install with the GPU version of TensorFlow:
$ pip install urduhack[tf-gpu]
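After installing, one way to confirm that the package is visible to the interpreter you intend to use is a quick standard-library check. The helper name `urduhack_status` is hypothetical; the sketch only looks the package up by name and does not import it or load any models.

```python
import importlib.util
import sys


def urduhack_status():
    """Report whether the urduhack package can be found by this interpreter."""
    version = "{}.{}".format(*sys.version_info[:2])
    # find_spec returns None when no top-level package of that name is installed.
    if importlib.util.find_spec("urduhack") is None:
        return "urduhack not found for Python " + version
    return "urduhack is importable on Python " + version


print(urduhack_status())
```

Running this with the same interpreter that `pip` installed into helps catch the common case of installing into one environment and running from another.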
Fantastic documentation is available at https://urduhack.readthedocs.io/
| Section | Description |
| --- | --- |
| Installation | How to install Urduhack and download models |
| Quickstart | New to Urduhack? Here's everything you need to know! |
| API Reference | The detailed reference for Urduhack's API. |
How to Contribute
- Check for open issues or open a fresh issue to start a discussion around a feature idea or a bug. There is a Contributor Friendly tag for issues that should be ideal for people who are not very familiar with the codebase yet.
- Write a test which shows that the bug was fixed or that the feature works as expected.
- Send a pull request and bug the maintainer until it gets merged and published. :)
Special thanks to everyone who contributed to getting Urduhack to its current state.
Thank you to all our backers! 🙏 [Become a backer]
Support this project by becoming a sponsor. Your logo will show up here with a link to your website. [Become a sponsor]
📝 Copyright and license
Code released under the MIT License.