Easy conversion between Unicode characters, numeric HTML entities, and named HTML entities.

Project description

When reading HTML, named entities are often neater and easier to comprehend than numeric entities, Unicode (or other charset) characters, or a mixture of all of the above. Because they fall within the ASCII range, entities are also much safer to use in multiple contexts than Unicode and its various encodings (UTF-8 and such).

This module helps convert numeric HTML entities and Unicode characters that fall outside the normal ASCII range into named entities. Or, if you prefer, it will help you go the other way, mapping all entities into Unicode. And if you decide you want entities of the counting type, it will even help you go numeric.

Usage

Python 2:

from namedentities import *

u = u'both em\u2014and&#x2013;dashes&hellip;'

print "named:  ", repr(named_entities(u))
print "numeric:", repr(numeric_entities(u))
print "unicode:", repr(unicode_entities(u))

yields:

named:   'both em&mdash;and&ndash;dashes&hellip;'
numeric: 'both em&#8212;and&#8211;dashes&#8230;'
unicode: u'both em\u2014and\u2013dashes\u2026'

You can do just about the same thing in Python 3, but you have to use the print function rather than a print statement, and prior to 3.3, you have to drop the u prefix that marks string literals as Unicode in Python 2. Python 3.3 accepts the u prefix again as an optional feature; it has no effect, because all Python 3 strings are Unicode, but it sure helps with cross-version code compatibility. (You can use the six cross-version compatibility library, as the tests do.)
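
As a rough Python 3 counterpart, the three conversions can be approximated with nothing but the standard library's html module. This is only a sketch of the idea, not namedentities' implementation: the sketch_* function names are hypothetical, and because the sketch decodes everything first, it makes no attempt to leave &lt;, &gt;, or &amp; alone.

```python
import html
from html.entities import codepoint2name


def sketch_unicode(text):
    # html.unescape resolves named, decimal, and hexadecimal entities alike.
    return html.unescape(text)


def sketch_numeric(text):
    # Decode everything, then re-encode each non-ASCII character as &#N;.
    return "".join(
        ch if ord(ch) < 128 else "&#%d;" % ord(ch)
        for ch in html.unescape(text)
    )


def sketch_named(text):
    # Same idea, but prefer a named entity where the HTML 4 table has one.
    out = []
    for ch in html.unescape(text):
        cp = ord(ch)
        if cp < 128:
            out.append(ch)
        elif cp in codepoint2name:
            out.append("&%s;" % codepoint2name[cp])
        else:
            out.append("&#%d;" % cp)  # no name known: fall back to numeric
    return "".join(out)


# No u prefix is needed in Python 3; every str is already Unicode.
text = 'both em\u2014and&#x2013;dashes&hellip;'

print("named:  ", repr(sketch_named(text)))
print("numeric:", repr(sketch_numeric(text)))
# ascii() escapes non-ASCII characters, much like Python 2's repr() did.
print("unicode:", ascii(sketch_unicode(text)))
```

The named and numeric lines print the same strings as the Python 2 run above; the last line shows the \u escapes without the u prefix.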

Recent Changes

  • The unescape(text) API changes all entities into Unicode characters. Though long present, it is now available for easy external consumption. It has an alias, unicode_entities(text), for parallelism with the other APIs.

  • Repackaged first as a Python package, rather than independent modules. Then, given my growing confidence in managing cross-version packages, the Python 2 and Python 3 implementation backends have been merged into a single backend.

  • Now successfully packaged for, and tested against, Python 2.6, 2.7, and 3.3, as well as PyPy 2.0.2 (based on 2.7.3). Automated multi-version testing is managed with the wonderful pytest and tox.

  • Should also work under Python 2.5 and 3.2 releases, and PyPy 1.9, but those have been removed from “official support” because they are no longer supported in my testing environment. Time to upgrade!

Notes

  • Doesn’t attempt to encode <, >, or & (or their numerical equivalents), so as to avoid interfering with HTML escaping.

  • This is basically a packaging of Ian Beck’s work. Thank you, Ian!
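
To make the first note concrete, here is one way such a rule could be written. It is a sketch under assumptions, not the package's code: PROTECTED, NUMERIC_REF, and name_numeric_refs are hypothetical names, and the entity table comes from the standard library's html.entities. Numeric references are rewritten as named ones unless they stand for <, >, or &.

```python
import re
from html.entities import codepoint2name

# Code points that must stay exactly as written: <, >, and & (60, 62, 38).
PROTECTED = {ord("<"), ord(">"), ord("&")}

# Matches hexadecimal (&#x2014;) and decimal (&#8212;) character references.
NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")


def name_numeric_refs(text):
    """Rewrite numeric references as named ones, skipping protected code points."""
    def swap(match):
        hex_digits, dec_digits = match.groups()
        cp = int(hex_digits, 16) if hex_digits else int(dec_digits)
        if cp in PROTECTED or cp not in codepoint2name:
            return match.group(0)  # leave the reference untouched
        return "&%s;" % codepoint2name[cp]
    return NUMERIC_REF.sub(swap, text)


sample = "x &#60; y &#38;&#38; a&#8212;b &#x2026;"
print(name_numeric_refs(sample))
# prints: x &#60; y &#38;&#38; a&mdash;b &hellip;
```

Because &#60; and &#38; pass through unchanged, markup that was already escaped stays escaped.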

Installation

pip install -U namedentities

To easy_install under a specific Python version (3.3 in this example):

python3.3 -m easy_install --upgrade namedentities

(You may need to prefix these with “sudo ” to authorize installation.)

Download files

Source Distributions

namedentities-1.4.0.zip (10.0 kB)

Uploaded Source

namedentities-1.4.0.tar.gz (4.7 kB)

Uploaded Source

File details

Details for the file namedentities-1.4.0.zip.

File metadata

  • Download URL: namedentities-1.4.0.zip
  • Upload date:
  • Size: 10.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for namedentities-1.4.0.zip
Algorithm    Hash digest
SHA256       8eb63e0a58caf5fd454b27ed6f92349bf2def5677fd53a9aef9cc0f512e1b594
MD5          5de4329c18ccbb76d75af25444d6fd27
BLAKE2b-256  d80aa551c88717d12697d21eacb02e16cc49e08bb077d6cfb0c6e405872628a1

See more details on using hashes here.

File details

Details for the file namedentities-1.4.0.tar.gz.

File hashes

Hashes for namedentities-1.4.0.tar.gz
Algorithm    Hash digest
SHA256       1311809b07cb58a7c8c4700697465b37f3101698e4ffe3e3281e5f4c93ff4d26
MD5          f3e84ba78d5e0e9d8e923b1e03500d05
BLAKE2b-256  426aca66318e9cecb609e9c65728333f7ae72da5fa4000364038308d2bffbfbd
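
A downloaded archive can be checked against the SHA256 digests listed above with the standard library's hashlib. The following is a minimal sketch: the helper names sha256_of_stream and matches_published are hypothetical, and it assumes the archive sits at the path you pass in.

```python
import hashlib
import os

# SHA256 digests as published in the tables above.
PUBLISHED_SHA256 = {
    "namedentities-1.4.0.zip":
        "8eb63e0a58caf5fd454b27ed6f92349bf2def5677fd53a9aef9cc0f512e1b594",
    "namedentities-1.4.0.tar.gz":
        "1311809b07cb58a7c8c4700697465b37f3101698e4ffe3e3281e5f4c93ff4d26",
}


def sha256_of_stream(stream, chunk_size=65536):
    """Hash a binary stream in chunks so large files need little memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_published(path):
    """Return True if the file at path has the digest published for its name."""
    expected = PUBLISHED_SHA256[os.path.basename(path)]
    with open(path, "rb") as fh:
        return sha256_of_stream(fh) == expected
```

For example, matches_published("namedentities-1.4.0.zip") would be called from the directory holding the download; a False result means the file should not be installed.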
