Detects Unicode support of an interactive terminal

ucs-detect

When run without any arguments,

$ ucs-detect

ucs-detect automatically tests the Unicode version and support level of a terminal emulator for Wide characters, Emoji Zero Width Joiner (ZWJ) sequences, Emoji Variation Selector-16 (VS-16) sequences, and Zero-Width or combining characters by supported language. A brief report is then printed to stdout.

[Video: demonstration of running ucs-detect]

Installation & Usage

To install or upgrade:

$ pip install -U ucs-detect

To use:

$ ucs-detect

To run a detailed test and store a yaml report to disk:

$ ucs-detect --save-yaml=data/my-terminal.yaml --limit-codepoints=5000 --limit-words=5000 --limit-errors=500

Test Results

More than twenty modern terminals for Windows, Linux, and Mac were tested. Their results have been collected into this repository, and a detailed summary is published at https://ucs-detect.readthedocs.io/results.html

An article describing the development of ucs-detect and summarizing the results for the 1.0.4 release of ucs-detect (November 2023) is published at https://www.jeffquast.com/post/ucs-detect-test-results/

The individual YAML report files for these terminals may also be inspected in the data folder of the repository, https://github.com/jquast/ucs-detect/tree/master/data

Please note that results will be shared with Terminal Emulator projects and this information may become out of date as they improve their support for Unicode. Please do not expect the maintainers of ucs-detect to update these data files. If you wish for this report to be corrected for any given Terminal, please feel free to submit a pull request with an update to the yaml data files.

Problem

Many East Asian languages contain Wide (W) or Fullwidth (F) characters, meaning that each character occupies 2 cells instead of 1. Further, many languages contain special combining characters that are “zero width”, meaning they do not occupy any cells, only modifying the previous one as a “combining” character. Finally, there are “Zero Width Joiner” and “Variation Selector-16” characters that are used in sequence for Emoji characters.
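The three categories above can be illustrated with Python's standard unicodedata module. This is only a rough sketch: the function name naive_cell_width is hypothetical, the rules are a simplification and not the tables used by wcwidth or ucs-detect, and the answers depend on the Unicode version bundled with the running Python interpreter.

```python
import unicodedata


def naive_cell_width(char):
    """Rough cell-width estimate for one character (not the wcwidth tables)."""
    # Combining marks (Mn, Me) and format characters (Cf), which include
    # U+200D ZERO WIDTH JOINER, are treated here as occupying no cells.
    if unicodedata.category(char) in ("Mn", "Me", "Cf"):
        return 0
    # East Asian Wide (W) and Fullwidth (F) characters occupy two cells.
    if unicodedata.east_asian_width(char) in ("W", "F"):
        return 2
    # Everything else is assumed to occupy a single cell.
    return 1


samples = [
    ("LATIN CAPITAL LETTER A", "A"),
    ("CJK ideograph U+4E2D", "\u4e2d"),
    ("FULLWIDTH LATIN CAPITAL LETTER A", "\uff21"),
    ("COMBINING ACUTE ACCENT", "\u0301"),
    ("ZERO WIDTH JOINER", "\u200d"),
    ("VARIATION SELECTOR-16", "\ufe0f"),
]
for name, char in samples:
    print(f"{name}: {naive_cell_width(char)} cell(s)")
```

A per-character rule like this cannot describe Emoji ZWJ or VS-16 sequences, where the width of the whole sequence differs from the sum of its parts, which is one reason a terminal has to be tested directly.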

A terminal application that displays these characters may have trouble determining how they will be displayed to the end-user. This problem happens often, because the Unicode Consortium releases new versions of the Unicode Standard periodically, but the source code of libraries and applications is not updated at the same time, or at all!

Finally, a terminal emulator may have varying levels of support. For example, at the time of this writing, Microsoft’s Terminal.exe supports up to Unicode 15.0 for Wide characters, is missing support for 27 characters of Unicode 13.0, has no support for Emoji ZWJ, fully supports all VS-16 sequences, but fails to correctly categorize many Zero-Width characters for 88 or more of the world’s languages.

Solution

The most important step is to determine whether the Terminal Emulator complies with the Specification published by the Python wcwidth library.

This program, ucs-detect, automatically detects the Unicode version and feature level that the connecting Terminal supports for WIDE, ZERO, ZWJ, and VS-16 characters.

How it works

The solution in this program is the use of the Query Cursor Position terminal sequence, which asks, “where is the cursor?”. This is a hidden sequence that a Terminal Emulator automatically responds to.

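By combining this sequence with the data tables of the wcwidth library, we can test for compliance with the Specification of the Python wcwidth library: a character or sequence is written to the terminal, the cursor position is queried, and the distance the cursor moved is compared with the width that wcwidth expects.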
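The idea can be sketched in a few lines of Python. This is a minimal illustration and not the code used by ucs-detect: the function names are hypothetical, it assumes a POSIX terminal (termios and tty), and it has no timeout, so it blocks forever on a terminal that never answers. The query is the standard sequence ESC [ 6 n, and the terminal replies with ESC [ row ; column R.

```python
import re
import sys

# Query Cursor Position request and the pattern of the terminal's reply.
CPR_QUERY = "\x1b[6n"
CPR_REPLY = re.compile(r"\x1b\[(\d+);(\d+)R")


def parse_cursor_report(reply):
    """Return (row, column) from a reply such as ESC [ 12 ; 40 R."""
    match = CPR_REPLY.search(reply)
    if match is None:
        raise ValueError(f"not a cursor position report: {reply!r}")
    return int(match.group(1)), int(match.group(2))


def query_cursor_position():
    """Ask the attached terminal where the cursor is (POSIX only, no timeout)."""
    import termios  # imported here because these modules are POSIX-only
    import tty

    fd = sys.stdin.fileno()
    saved = termios.tcgetattr(fd)
    try:
        # cbreak mode: the reply is readable immediately and is not echoed.
        tty.setcbreak(fd)
        sys.stdout.write(CPR_QUERY)
        sys.stdout.flush()
        reply = ""
        while not reply.endswith("R"):
            reply += sys.stdin.read(1)
    finally:
        termios.tcsetattr(fd, termios.TCSADRAIN, saved)
    return parse_cursor_report(reply)


def measure_width(text):
    """Return how many columns the terminal advanced after printing text."""
    sys.stdout.write("\r")            # start from the first column
    _, start = query_cursor_position()
    sys.stdout.write(text)
    _, end = query_cursor_position()
    sys.stdout.write("\r\x1b[K")      # return and erase the test output
    sys.stdout.flush()
    return end - start
```

In an interactive terminal, the value returned by measure_width for a given character can then be compared with the width that the wcwidth library reports for the same character; a mismatch suggests the terminal and the library disagree about that codepoint.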

The use of Query Cursor Position is inspired by the resize(1) program distributed with X11, which determines the terminal size over transports that are not capable of communicating by signal or forwarding by environment value, such as over a serial line. resize(1) simply moves to (999, 999) then asks, “where is my cursor?” and the response is understood to be the terminal size.

UNICODE_VERSION (legacy)

Versions of ucs-detect prior to 1.0 served a single purpose: to print an sh-compatible line that exports UNICODE_VERSION. To keep this behavior, use --shell --quick, for example:

$ ucs-detect --shell --quick
UNICODE_VERSION=15.0.0; export UNICODE_VERSION

It is designed to be used interactively:

$ eval "$(ucs-detect --quick --shell)"
$ echo $UNICODE_VERSION
15.0.0

The environment variable UNICODE_VERSION is currently used by the Python wcwidth library, which contains the tables of every past Unicode version, to determine how dependent Python programs, such as IPython, render wide and zero-width characters.
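A Python program that cannot use eval may want to read the same line directly. The following is a small sketch under that assumption; the helper names parse_shell_export and apply_to_environment are hypothetical and are not part of ucs-detect or wcwidth. It only accepts the exact line format shown above.

```python
import os
import re

# Matches the sh-compatible line printed by `ucs-detect --shell --quick`,
# for example: UNICODE_VERSION=15.0.0; export UNICODE_VERSION
EXPORT_LINE = re.compile(
    r"^UNICODE_VERSION=(\d+(?:\.\d+)*);\s*export UNICODE_VERSION$"
)


def parse_shell_export(line):
    """Return the version string contained in the export line."""
    match = EXPORT_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unexpected line: {line!r}")
    return match.group(1)


def apply_to_environment(line, environ=os.environ):
    """Store the detected version in environ, as the eval line would do in sh."""
    version = parse_shell_export(line)
    environ["UNICODE_VERSION"] = version
    return version
```

Passing a plain dict as environ keeps the sketch easy to try without changing the real process environment.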

History

  • 1.0.7 (2024-01-06): Add python 3.10 compatibility for yaml file save and update wcwidth requirement to 0.2.13.

  • 1.0.6 (2023-12-15): Distribution fix for UDHR data and bugfix for python 3.8 through 3.11. ucs-detect welcomes @GalaxySnail as a new project contributor.

  • 1.0.5 (2023-11-13): Set minimum wcwidth release version requirement.

  • 1.0.4 (2023-11-13): Add support for Emoji with VS-16 and more complete testing. Published test results.

  • 1.0.3 (2023-10-28): Drop python 2 support. Add more advanced testing. Change the default behavior when called without arguments; use ucs-detect --quick --shell to get the behavior of previous releases.

  • 0.0.4 (2020-06-20): Initial releases and bugfixes
