Functions for working with Vietnamese text
Installation
To get the latest stable release from PyPI:
pip install viet_text_tools
Usage
normalize_diacritics()
Normalizes the placement of diacritics in a Vietnamese word, moving a misplaced tone mark onto the correct vowel. The return value is in composed (NFC) form:
normalize_diacritics('nghìên') == 'nghiền'
Pass new_style=True to use new-style tone placement:
normalize_diacritics('thủy', new_style=True) == 'thuỷ'
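To make the difference between the two styles concrete, the sketch below uses only the standard library's unicodedata module to report which base letter carries the hỏi tone mark in each spelling. The helper tone_carrier is a hypothetical name introduced for illustration; it is not part of viet_text_tools and says nothing about how the library is implemented.

```python
import unicodedata

HOOK_ABOVE = "\u0309"  # combining hook above, the mark for the hỏi tone


def tone_carrier(word, mark=HOOK_ABOVE):
    """Return the base letter that carries `mark`, or None if it is absent.

    Hypothetical helper for illustration only.
    """
    decomposed = unicodedata.normalize("NFD", word)
    index = decomposed.find(mark)
    if index <= 0:
        return None
    # Step back over any other combining marks to reach the base letter.
    while index > 0 and unicodedata.combining(decomposed[index - 1]):
        index -= 1
    return decomposed[index - 1] if index > 0 else None


old_style = "th\u1ee7y"  # 'thủy': tone on the u
new_style = "thu\u1ef7"  # 'thuỷ': tone on the y

print(tone_carrier(old_style))
print(tone_carrier(new_style))
```

Both spellings contain the same letters and the same tone mark; only the vowel that carries the mark differs.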
Pass decomposed=True to return a string in decomposed (NFD) form, in which each diacritic is a separate combining character:
len(normalize_diacritics('thủy')) == 4
len(normalize_diacritics('thủy', decomposed=True)) == 5
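The length difference above comes from Unicode normalization itself and can be reproduced with the standard library alone. This is a small sketch using unicodedata, independent of viet_text_tools:

```python
import unicodedata

word = "th\u1ee7y"  # 'thủy', written with the precomposed character U+1EE7

composed = unicodedata.normalize("NFC", word)
decomposed = unicodedata.normalize("NFD", word)

# NFC keeps 'ủ' as one code point; NFD splits it into 'u' + U+0309.
print(len(composed), [hex(ord(c)) for c in composed])
print(len(decomposed), [hex(ord(c)) for c in decomposed])
```

The two strings render identically but do not compare equal, so it helps to pick one form and use it consistently when comparing or storing text.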
vietnamese_sort_key()
A key function for use with sorted() that sorts Vietnamese text in the correct collation order. The default sort compares code points, so accented letters end up after every unaccented letter:
words = ['anh', 'ba', 'áo', 'cắt', 'cá', 'cả']
sorted(words) == ['anh', 'ba', 'cá', 'cả', 'cắt', 'áo']
sorted(words, key=vietnamese_sort_key) == ['anh', 'áo', 'ba', 'cả', 'cá', 'cắt']
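For readers curious how such a key could be built, here is a minimal sketch using only the standard library. It is not the implementation in viet_text_tools; the name sketch_sort_key, the alphabet string, and the tone ranking (no tone, huyền, hỏi, ngã, sắc, nặng) are assumptions chosen so that the example above sorts the same way.

```python
import unicodedata

# Assumed Vietnamese alphabet order; letters outside it sort afterwards.
ALPHABET = unicodedata.normalize("NFC", "aăâbcdđeêghiklmnoôơpqrstuưvxy")
LETTER_RANK = {ch: i for i, ch in enumerate(ALPHABET)}

# Assumed tone order: huyền, hỏi, ngã, sắc, nặng (no tone ranks as 0).
TONE_RANK = {"\u0300": 1, "\u0309": 2, "\u0303": 3, "\u0301": 4, "\u0323": 5}


def sketch_sort_key(word):
    """Hypothetical key: compare letters first, then tones as a tie-breaker."""
    letters, tones = [], []
    for ch in unicodedata.normalize("NFC", word.lower()):
        parts = unicodedata.normalize("NFD", ch)
        # Separate the tone mark from the letter (which keeps ă, â, ơ, ...).
        tone = max((TONE_RANK[c] for c in parts if c in TONE_RANK), default=0)
        base = unicodedata.normalize(
            "NFC", "".join(c for c in parts if c not in TONE_RANK)
        )
        letters.append(LETTER_RANK.get(base, len(ALPHABET) + ord(base[0])))
        tones.append(tone)
    return (letters, tones)


words = ["anh", "ba", "áo", "cắt", "cá", "cả"]
print(sorted(words, key=sketch_sort_key))
```

Comparing the letter sequence before the tone sequence is what places 'áo' between 'anh' and 'ba', and 'cả' before 'cá'.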
Hashes for viet_text_tools-0.1.3-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | eac3f6d2256dfa56e5f44621fdd705ac262736bbb9555ad0a4b72c5222420f21
MD5 | 3ab5796db6c6c3f1f2ddec242c6b57f3
BLAKE2b-256 | 74f61e63336e9cbc8aed78663872ab78d17ca57960593af5213a228b05a369f5