A Python slugify application that also handles Unicode
Project description
Python Slugify
A Python slugify application that handles unicode.
Overview
Best attempt to create slugs from unicode strings while keeping it DRY.
Notice
This module, by default installs and uses text-unidecode (GPL & Perl Artistic) for its decoding needs.
However, there is an alternative decoding package called Unidecode (GPL). It can be installed as python-slugify[unidecode]
for those who prefer it. Unidecode
is believed to be more advanced.
Official
Support Matrix
Python | Slugify |
---|---|
>= 2.7 < 3.6 |
< 5.0.0 |
>= 3.6 < 3.7 |
>= 5.0.0 < 7.0.0 |
>= 3.7 |
>= 7.0.0 |
How to install
easy_install python-slugify |OR| easy_install python-slugify[unidecode]
-- OR --
pip install python-slugify |OR| pip install python-slugify[unidecode]
Options
def slugify(
text: str,
entities: bool = True,
decimal: bool = True,
hexadecimal: bool = True,
max_length: int = 0,
word_boundary: bool = False,
separator: str = DEFAULT_SEPARATOR,
save_order: bool = False,
stopwords: Iterable[str] = (),
regex_pattern: str | None = None,
lowercase: bool = True,
replacements: Iterable[Iterable[str]] = (),
allow_unicode: bool = False,
) -> str:
"""
Make a slug from the given text.
:param text (str): initial text
:param entities (bool): converts html entities to unicode (foo & bar -> foo-bar)
:param decimal (bool): converts html decimal to unicode (Ž -> Ž -> z)
:param hexadecimal (bool): converts html hexadecimal to unicode (Ž -> Ž -> z)
:param max_length (int): output string length
:param word_boundary (bool): truncates to end of full words (length may be shorter than max_length)
:param save_order (bool): if parameter is True and max_length > 0 return whole words in the initial order
:param separator (str): separator between words
:param stopwords (iterable): words to discount
:param regex_pattern (str): regex pattern for disallowed characters
:param lowercase (bool): activate case sensitivity by setting it to False
:param replacements (iterable): list of replacement rules e.g. [['|', 'or'], ['%', 'percent']]
:param allow_unicode (bool): allow unicode characters
:return (str): slugify text
"""
How to use
from slugify import slugify
txt = "This is a test ---"
r = slugify(txt)
self.assertEqual(r, "this-is-a-test")
txt = '影師嗎'
r = slugify(txt)
self.assertEqual(r, "ying-shi-ma")
txt = '影師嗎'
r = slugify(txt, allow_unicode=True)
self.assertEqual(r, "影師嗎")
txt = 'C\'est déjà l\'été.'
r = slugify(txt)
self.assertEqual(r, "c-est-deja-l-ete")
txt = 'Nín hǎo. Wǒ shì zhōng guó rén'
r = slugify(txt)
self.assertEqual(r, "nin-hao-wo-shi-zhong-guo-ren")
txt = 'Компьютер'
r = slugify(txt)
self.assertEqual(r, "kompiuter")
txt = 'jaja---lol-méméméoo--a'
r = slugify(txt, max_length=9)
self.assertEqual(r, "jaja-lol")
txt = 'jaja---lol-méméméoo--a'
r = slugify(txt, max_length=15, word_boundary=True)
self.assertEqual(r, "jaja-lol-a")
txt = 'jaja---lol-méméméoo--a'
r = slugify(txt, max_length=20, word_boundary=True, separator=".")
self.assertEqual(r, "jaja.lol.mememeoo.a")
txt = 'one two three four five'
r = slugify(txt, max_length=13, word_boundary=True, save_order=True)
self.assertEqual(r, "one-two-three")
txt = 'the quick brown fox jumps over the lazy dog'
r = slugify(txt, stopwords=['the'])
self.assertEqual(r, 'quick-brown-fox-jumps-over-lazy-dog')
txt = 'the quick brown fox jumps over the lazy dog in a hurry'
r = slugify(txt, stopwords=['the', 'in', 'a', 'hurry'])
self.assertEqual(r, 'quick-brown-fox-jumps-over-lazy-dog')
txt = 'thIs Has a stopword Stopword'
r = slugify(txt, stopwords=['Stopword'], lowercase=False)
self.assertEqual(r, 'thIs-Has-a-stopword')
txt = "___This is a test___"
regex_pattern = r'[^-a-z0-9_]+'
r = slugify(txt, regex_pattern=regex_pattern)
self.assertEqual(r, "___this-is-a-test___")
txt = "___This is a test___"
regex_pattern = r'[^-a-z0-9_]+'
r = slugify(txt, separator='_', regex_pattern=regex_pattern)
self.assertNotEqual(r, "_this_is_a_test_")
txt = '10 | 20 %'
r = slugify(txt, replacements=[['|', 'or'], ['%', 'percent']])
self.assertEqual(r, "10-or-20-percent")
txt = 'ÜBER Über German Umlaut'
r = slugify(txt, replacements=[['Ü', 'UE'], ['ü', 'ue']])
self.assertEqual(r, "ueber-ueber-german-umlaut")
txt = 'i love 🦄'
r = slugify(txt, allow_unicode=True)
self.assertEqual(r, "i-love")
txt = 'i love 🦄'
r = slugify(txt, allow_unicode=True, regex_pattern=r'[^🦄]+')
self.assertEqual(r, "🦄")
For more examples, have a look at the test.py file.
Command Line Options
With the package, a command line tool called slugify
is also installed.
It allows convenient command line access to all the features the slugify
function supports. Call it with -h
for help.
The command can take its input directly on the command line or from STDIN (when the --stdin
flag is passed):
$ echo "Taking input from STDIN" | slugify --stdin
taking-input-from-stdin
$ slugify taking input from the command line
taking-input-from-the-command-line
Please note that when a multi-valued option such as --stopwords
or --replacements
is passed, you need to use --
as separator before you start with the input:
$ slugify --stopwords the in a hurry -- the quick brown fox jumps over the lazy dog in a hurry
quick-brown-fox-jumps-over-lazy-dog
Running the tests
To run the tests against the current environment:
python test.py
Contribution
Please read the (wiki) page prior to raising any PRs.
License
Released under a (MIT) license.
Notes on GPL dependencies
Though the dependencies may be GPL licensed, python-slugify
itself is not considered a derivative work and will remain under the MIT license.
If you wish to avoid installation of any GPL licensed packages, please note that the default dependency text-unidecode
explicitly lets you choose to use the Artistic License instead. Use without concern.
Version
X.Y.Z Version
`MAJOR` version -- when you make incompatible API changes,
`MINOR` version -- when you add functionality in a backwards-compatible manner, and
`PATCH` version -- when you make backwards-compatible bug fixes.
Sponsors
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file python-slugify-8.0.4.tar.gz
.
File metadata
- Download URL: python-slugify-8.0.4.tar.gz
- Upload date:
- Size: 10.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/6.8.0 pkginfo/1.9.6 requests/2.31.0 requests-toolbelt/1.0.0 tqdm/4.66.1 CPython/3.11.4
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 59202371d1d05b54a9e7720c5e038f928f45daaffe41dd10822f3907b937c856 |
|
MD5 | 5fda51099fc7161fc0869d4aac7ccb27 |
|
BLAKE2b-256 | 87c75e1547c44e31da50a460df93af11a535ace568ef89d7a811069ead340c4a |
File details
Details for the file python_slugify-8.0.4-py2.py3-none-any.whl
.
File metadata
- Download URL: python_slugify-8.0.4-py2.py3-none-any.whl
- Upload date:
- Size: 10.1 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/6.8.0 pkginfo/1.9.6 requests/2.31.0 requests-toolbelt/1.0.0 tqdm/4.66.1 CPython/3.11.4
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 276540b79961052b66b7d116620b36518847f52d5fd9e3a70164fc8c50faa6b8 |
|
MD5 | 35c4a1ca7479cbad4b16f285e11986b6 |
|
BLAKE2b-256 | a46202da182e544a51a5c3ccf4b03ab79df279f9c60c5e82d5e8bec7ca26ac11 |