Replacement for shlex (that works with unicode) for Python 2.X.
Inspired by ordereddict, this is a packaging of an improved shlex module for Python 2 that handles Unicode properly.
Shlex is """A lexical analyzer class for simple shell-like syntaxes."""
If you’ve found your way here, you probably already know that the standard shlex doesn’t handle Unicode prior to Python 3 (see bug 1170 for details). Since Python 2.7.3, however, it accepts unicode objects. Sadly, it still does not handle non-ASCII characters:
>>> import sys, shlex
>>> sys.version
'2.7.5+ ...'
>>> shlex.split(u'Hello world')
['Hello', 'world']
>>> shlex.split(u'café')
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/usr/lib/python2.7/shlex.py", line 275, in split
    lex = shlex(s, posix=posix)
  File "/usr/lib/python2.7/shlex.py", line 25, in __init__
    instream = StringIO(instream)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 3: ordinal not in range(128)
This module handles both unicode objects and byte strings:
>>> import ushlex as shlex
>>> shlex.split(u'café')
[u'caf\xe9']
>>> shlex.split(u'echo "☺ ☕ ♫"')
[u'echo', u'\u263a \u2615 \u266b']
>>> from ushlex import split as shplit
>>> shplit('echo "hello there"')
['echo', 'hello there']
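For code that has to run on both Python 2 and Python 3, one possible approach is to pick the module at import time. The sketch below is only an illustration: it assumes ushlex is installed on Python 2, and split_command is a hypothetical helper name, not part of ushlex or the standard library.

```python
# -*- coding: utf-8 -*-
import sys

if sys.version_info[0] < 3:
    # Python 2: the standard shlex fails on non-ASCII unicode input,
    # so use ushlex instead (assumed to be installed).
    import ushlex as shlex
else:
    # Python 3: the standard shlex handles str (Unicode) natively.
    import shlex


def split_command(line):
    """Split a shell-like command line into a list of tokens.

    Hypothetical wrapper used only for this illustration.
    """
    return shlex.split(line)


# u'caf\xe9' is u'café', written as an escape to avoid source-encoding issues.
tokens = split_command(u'echo "caf\xe9 au lait"')
```

Because the chosen module is bound to the name shlex in both branches, the rest of the program can call shlex.split() without caring which interpreter it runs on.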
I found these release notes inside:
# Module and documentation by Eric S. Raymond, 21 Dec 1998
# Input stacking and error message cleanup added by ESR, March 2000
# push_source() and pop_source() made explicit by ESR, January 2001.
# Posix compliance, split(), string arguments, and
# iterator interface by Gustavo Niemeyer, April 2003.
# Modified to support Unicode by Colin Walters, Dec 2007
Packaging-only bugs may be submitted to Bitbucket. Do not enter bugs for ushlex itself, as the packager is not the author.