Functions for working with Vietnamese text
Installation
To get the latest stable release from PyPI:
pip install viet_text_tools
Usage
normalize_diacritics()
Normalizes the diacritics of a Vietnamese word, moving a misplaced tone mark to the correct vowel. The return value is in composed (NFC) form:
normalize_diacritics('nghìên') == 'nghiền'
Pass new_style=True to use new-style tone placement:
normalize_diacritics('thủy', new_style=True) == 'thuỷ'
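The two spellings differ only in which vowel carries the tone mark. The sketch below uses only the standard library to show this; `tone_carrier` is a hypothetical helper written for illustration and is not part of viet_text_tools.

```python
import unicodedata

HOOK_ABOVE = "\u0309"  # combining hook above, the hỏi tone mark


def tone_carrier(word, tone_mark=HOOK_ABOVE):
    """Return the base letter that carries tone_mark, or None if absent.

    Hypothetical helper for illustration; not part of viet_text_tools.
    """
    nfd = unicodedata.normalize("NFD", word)
    pos = nfd.find(tone_mark)
    if pos == -1:
        return None
    # Step back over combining marks until the base letter is reached.
    while pos > 0 and unicodedata.combining(nfd[pos]):
        pos -= 1
    return nfd[pos]


print(tone_carrier("thủy"))  # u  (old style: tone on the first vowel)
print(tone_carrier("thuỷ"))  # y  (new style: tone on the second vowel)
```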
Pass decomposed=True to return the string in decomposed (NFD) form, where each diacritic is a separate combining character:
len(normalize_diacritics('thủy')) == 4
len(normalize_diacritics('thủy', decomposed=True)) == 5
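The length difference comes from Unicode normalization itself, which can be checked with the standard library alone. This is a minimal sketch using `unicodedata` and does not call viet_text_tools.

```python
import unicodedata

word = "thủy"
composed = unicodedata.normalize("NFC", word)    # "ủ" is one code point
decomposed = unicodedata.normalize("NFD", word)  # "u" plus a combining hook above

print(len(composed))    # 4
print(len(decomposed))  # 5

# The two forms are different strings that render identically,
# and recomposing the NFD form gives back the NFC form.
print(composed == decomposed)                                # False
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
```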
vietnamese_sort_key()
A key function for use with sorted() that sorts Vietnamese text in the correct collation order. The default sort compares code points, which places accented letters such as 'á' after 'z':
words = ['anh', 'ba', 'áo', 'cắt', 'cá', 'cả']
sorted(words) == ['anh', 'ba', 'cá', 'cả', 'cắt', 'áo']
sorted(words, key=vietnamese_sort_key) == ['anh', 'áo', 'ba', 'cả', 'cá', 'cắt']
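To illustrate the idea behind such a key, here is a simplified sketch that ranks words first by their letters in Vietnamese alphabet order and then by tone. It is an assumption about how this kind of collation can be built, not the library's actual implementation; `sketch_sort_key`, `ALPHABET`, and `TONE_RANK` are hypothetical names, and only lowercase input is handled.

```python
import unicodedata

# Lowercase Vietnamese alphabet in dictionary order (assumed for this sketch).
ALPHABET = unicodedata.normalize("NFC", "aăâbcdđeêghiklmnoôơpqrstuưvxy")
LETTER_RANK = {ch: i for i, ch in enumerate(ALPHABET)}

# Combining tone marks: huyền, hỏi, ngã, sắc, nặng. Rank 0 means no tone.
TONE_RANK = {"\u0300": 1, "\u0309": 2, "\u0303": 3, "\u0301": 4, "\u0323": 5}


def sketch_sort_key(word):
    """Simplified key: compare letters first, then tones."""
    letters, tones = [], []
    for ch in unicodedata.normalize("NFD", word):
        if ch in TONE_RANK:
            if tones:
                tones[-1] = TONE_RANK[ch]  # tone belongs to the current letter
        elif unicodedata.combining(ch) and letters:
            letters[-1] += ch              # breve, circumflex, or horn
        else:
            letters.append(ch)
            tones.append(0)
    ranks = tuple(
        # Letters outside the alphabet sort after it, by code point.
        LETTER_RANK.get(unicodedata.normalize("NFC", l), len(ALPHABET) + ord(l[0]))
        for l in letters
    )
    return ranks, tuple(tones)


words = ["anh", "ba", "áo", "cắt", "cá", "cả"]
print(sorted(words, key=sketch_sort_key))
# ['anh', 'áo', 'ba', 'cả', 'cá', 'cắt']
```

Comparing all letters before any tone is what puts 'cả' and 'cá' ahead of 'cắt', since 'a' precedes 'ă' in the alphabet.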
vietnamese_case_insensitive_sort_key()
Same as vietnamese_sort_key() but case-insensitive.
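One common way to make any sort key case-insensitive is to lowercase the text before passing it to the key. The sketch below shows that pattern with a hypothetical `case_insensitive` wrapper around a plain identity key; whether the library does it this way is an assumption.

```python
def case_insensitive(key_func):
    """Wrap a sort key so it sees lowercased text (hypothetical helper)."""
    def wrapped(text):
        return key_func(text.lower())
    return wrapped


words = ["anh", "Ba", "cá"]

# Code point order puts every uppercase ASCII letter before lowercase ones.
print(sorted(words))                             # ['Ba', 'anh', 'cá']
print(sorted(words, key=case_insensitive(str)))  # ['anh', 'Ba', 'cá']
```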