Normalizes files or standard input using a Unicode normalization form.

unormalize [-f FORM] [-i EXT] [files...]
nfc [-i EXTENSION] [files...]
nfd [-i EXTENSION] [files...]
nfkc [-i EXTENSION] [files...]
nfkd [-i EXTENSION] [files...]


-i EXTENSION -- Modify files in place, saving back-ups with EXTENSION
-f FORM -- Normalization form
Author: Eddie Antonio Santos
License: MIT
Description: **************************************************
unormalize - Filters that do Unicode normalization

Converts UTF-8 input to UTF-8 output in the desired Unicode normalization form.

Read about the `Unicode Normalization Forms`_!
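As a minimal illustration of what normalization changes, the sketch below uses Python's standard ``unicodedata`` module rather than the executables themselves. It is not taken from this package's source. It only shows that the same visible character can be stored as different code point sequences, and that NFC and NFD convert between them.

```python
import unicodedata

# "é" written two ways: one precomposed code point,
# or a plain "e" followed by a combining acute accent.
composed = "\u00e9"
decomposed = "e\u0301"

# The two strings render identically but compare unequal until normalized.
assert composed != decomposed

nfc = unicodedata.normalize("NFC", decomposed)  # compose where possible
nfd = unicodedata.normalize("NFD", composed)    # decompose fully

print(nfc == composed, len(nfc))
print(nfd == decomposed, len(nfd))
```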


There are five executables included, all of which share the same usage:

- unormalize
- nfc
- nfd
- nfkc
- nfkd

You may either redirect or pipe input into ``unormalize`` (and its buddies), or
provide filenames as arguments.
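The filter behaviour described above can be sketched roughly as follows. The function name ``normalize_stream`` and the ``FORMS`` tuple are hypothetical and are not part of this package. The sketch assumes the whole input is read before normalizing, so that a combining sequence is never split across chunks.

```python
import io
import unicodedata

# The four forms accepted by unicodedata.normalize().
FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def normalize_stream(source, sink, form):
    """Hypothetical helper: read UTF-8 bytes from source and write
    the normalized text to sink as UTF-8 bytes."""
    if form not in FORMS:
        raise ValueError("unknown normalization form: %r" % (form,))
    text = source.read().decode("utf-8")
    sink.write(unicodedata.normalize(form, text).encode("utf-8"))


# In-memory streams stand in for standard input and standard output.
source = io.BytesIO("e\u0301".encode("utf-8"))
sink = io.BytesIO()
normalize_stream(source, sink, "NFC")
result = sink.getvalue().decode("utf-8")
```

In a real filter, ``source`` and ``sink`` would be the binary buffers of standard input and output, or files opened in binary mode.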


``-f FORM``/``--form=FORM``
    Selects the normalization form: one of NFC, NFD, NFKC, or NFKD. The
    equivalently named executables imply their respective normalization form;
    ``unormalize`` is equivalent to ``nfk`` without the ``--form`` argument.

``-i EXTENSION``/``--in-place EXTENSION``
    Converts the named files to the desired normalization form, in place.
    Filenames **must** be specified as arguments. ``EXTENSION`` is the
    extension given to back-ups of the original files.
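One plausible way to implement in-place conversion with a back-up is sketched below. The helper ``normalize_in_place`` is hypothetical and the package may well do this differently. The sketch assumes the back-up is a copy of the original file with ``EXTENSION`` appended to its name.

```python
import os
import shutil
import tempfile
import unicodedata


def normalize_in_place(path, form, extension):
    """Hypothetical helper: back up path, then rewrite it in the given form."""
    backup = path + extension
    shutil.copy2(path, backup)  # preserve the original bytes as the back-up
    with open(backup, "rb") as f:
        text = f.read().decode("utf-8")
    with open(path, "wb") as f:
        f.write(unicodedata.normalize(form, text).encode("utf-8"))
    return backup


# Demonstration in a throwaway directory.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "file.txt")
with open(path, "wb") as f:
    f.write("\u00e9".encode("utf-8"))

backup = normalize_in_place(path, "NFD", ".bak")
with open(path, "rb") as f:
    rewritten = f.read().decode("utf-8")
with open(backup, "rb") as f:
    original = f.read().decode("utf-8")
shutil.rmtree(workdir)
```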


Convert clipboard contents to NFC (OS X)::

    $ pbpaste | nfc | pbcopy

Convert a file, in-place, to NFKD::

    $ nfkd --in-place=.bak file.txt && rm file.txt.bak

Convert circled characters, letterlike variants, and half-width characters to
their compatibility equivalents::

    $ echo 'ℍ①カ' | nfkc
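For reference, the same compatibility folding can be reproduced with Python's standard ``unicodedata`` module. This is an illustration, not the package's own code. It assumes the last input character is the half-width katakana U+FF76, and writes the input with explicit escapes so the code points are unambiguous.

```python
import unicodedata

# U+210D DOUBLE-STRUCK CAPITAL H, U+2460 CIRCLED DIGIT ONE,
# U+FF76 HALFWIDTH KATAKANA LETTER KA (assumed to be the intended input).
sample = "\u210d\u2460\uff76"

# NFKC replaces each character with its compatibility equivalent,
# then composes the result.
folded = unicodedata.normalize("NFKC", sample)

print(folded)  # H1カ
```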


© 2015 Eddie Antonio Santos. MIT Licensed.

.. _`Unicode Normalization Forms`:

Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Environment :: Console
Classifier: License :: OSI Approved :: MIT License
Classifier: Topic :: Text Processing
Classifier: Topic :: Utilities

Download files

- unormalize-0.2.0-py2.py3-none-any.whl (5.5 kB), wheel, py2.py3
- unormalize-0.2.0.tar.gz (4.6 kB), source
