Unicode normalization filters

Converts UTF-8 input to UTF-8 output in the desired Unicode normalization form.

Read about the Unicode Normalization Forms!
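
As a quick illustration of what the canonical forms do, here is a minimal sketch using Python's standard unicodedata module. It is not this package's own code, and the helper name utf8_len is made up for the example.

```python
import unicodedata

composed = "\u00e9"      # "é" as a single code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# NFC composes and NFD decomposes; both strings render identically.
assert unicodedata.normalize("NFC", decomposed) == composed
assert unicodedata.normalize("NFD", composed) == decomposed


def utf8_len(text, form):
    """Return the UTF-8 byte length of text after normalizing it to form."""
    return len(unicodedata.normalize(form, text).encode("utf-8"))
```

The same visible character therefore takes a different number of bytes depending on the form, which is why byte-wise comparisons of unnormalized text can fail.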

Usage

Five executables are included, all with exactly the same usage and arguments:

  • unormalize
  • nfc
  • nfd
  • nfkc
  • nfkd

You may either redirect or pipe input into unormalize (and its buddies), or provide filenames as arguments.
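
The two input modes can be sketched roughly as follows. This is a hypothetical illustration, not the tool's actual implementation: the function name normalize_streams and its parameters are assumptions, and it reads each input whole rather than streaming.

```python
import io
import unicodedata


def normalize_streams(filenames, stdin, stdout, form="NFC"):
    """Normalize each named file, or stdin when no filenames are given."""
    if filenames:
        for name in filenames:
            # Assumption: inputs are UTF-8, as the project description states.
            with io.open(name, encoding="utf-8") as handle:
                stdout.write(unicodedata.normalize(form, handle.read()))
    else:
        stdout.write(unicodedata.normalize(form, stdin.read()))
```

Passing the streams in as arguments, instead of reaching for sys.stdin and sys.stdout directly, keeps the sketch easy to exercise with in-memory text streams.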

Options

-f FORM/--form=FORM
Selects the normalization form: one of NFC, NFD, NFKC, or NFKD. The equivalently named executables imply their respective normalization form; unormalize is equivalent to nfc without the --form argument.
-i EXTENSION/--in-place EXTENSION
Converts the named files to the desired normalization form in place. Filenames must be specified as arguments. EXTENSION is the extension given to backups of the original files.
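
One way the form could be derived from the executable name is sketched below. The function pick_form is hypothetical, and three behaviours in it are assumptions rather than documented facts: that an explicit --form value takes precedence over the executable name, that form names are matched case-insensitively, and that the fallback is NFC.

```python
import os

FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def pick_form(program_path, form=None, fallback="NFC"):
    """Choose a normalization form from --form or from the executable's name."""
    if form is not None:
        # Assumption: an explicit --form value wins over the program name.
        form = form.upper()
        if form not in FORMS:
            raise ValueError("unknown normalization form: " + form)
        return form
    name = os.path.basename(program_path).upper()
    # Assumption: any other name (such as unormalize) uses the fallback.
    return name if name in FORMS else fallback
```

A caller would pass sys.argv[0] as program_path together with whatever value the argument parser collected for --form.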

Examples

Convert clipboard contents to NFC (macOS):

$ pbpaste | nfc | pbcopy

Convert a file, in-place, to NFKD:

$ nfkd --in-place=.bak file.txt && rm file.txt.bak
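
For readers curious about what an in-place conversion with a backup involves, here is a minimal sketch under stated assumptions. The function normalize_in_place is hypothetical and is not taken from the tool's source; it copies the original to the backup name first, then overwrites the original.

```python
import io
import shutil
import unicodedata


def normalize_in_place(path, form, backup_ext):
    """Rewrite path in the given form, keeping the original at path + backup_ext."""
    backup = path + backup_ext
    shutil.copy2(path, backup)  # preserve the original before touching it
    # newline="" disables newline translation so line endings are kept as-is.
    with io.open(backup, encoding="utf-8", newline="") as src:
        text = src.read()
    with io.open(path, "w", encoding="utf-8", newline="") as dst:
        dst.write(unicodedata.normalize(form, text))
    return backup
```

Writing the backup before the rewrite means a failure partway through still leaves an intact copy of the original on disk.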

Convert circled characters, font variants, and half-width characters to their compatibility equivalents:

$ echo 'ℍ①カ' | nfkc
H1カ
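
The same compatibility folding can be reproduced with Python's standard unicodedata module, shown here as a sketch independent of this package. The sample string is an assumption chosen for the illustration; it spells out a half-width katakana letter by code point so that the width change is unambiguous.

```python
import unicodedata

# U+210D DOUBLE-STRUCK CAPITAL H, U+2460 CIRCLED DIGIT ONE,
# U+FF76 HALFWIDTH KATAKANA LETTER KA
sample = "\u210d\u2460\uff76"

folded = unicodedata.normalize("NFKC", sample)

# Compatibility normalization maps each character to its plain counterpart.
assert folded == "H1\u30ab"  # "H", "1", and full-width KATAKANA LETTER KA

# Canonical normalization alone leaves these characters untouched.
assert unicodedata.normalize("NFC", sample) == sample
```

Only the NFKC and NFKD forms apply these compatibility mappings, so nfc and nfd would pass such input through unchanged.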

License

© 2015, 2017 Eddie Antonio Santos. MIT Licensed.

Download files

Files for unormalize, version 2020.7.17:

  • unormalize-2020.7.17-py2-none-any.whl (4.6 kB): wheel, Python version py2
  • unormalize-2020.7.17-py3-none-any.whl (4.6 kB): wheel, Python version py3
  • unormalize-2020.7.17.tar.gz (4.8 kB): source distribution