Skip to main content

A Python slugify application that also handles Unicode

Project description

Python Slugify

A Python slugify application that handles unicode.

status-image version-image coverage-image

Overview

Best attempt to create slugs from unicode strings while keeping it DRY.

Notice

This module, by default installs and uses text-unidecode (GPL & Perl Artistic) for its decoding needs.

However, there is an alternative decoding package called Unidecode (GPL). It can be installed as python-slugify[unidecode] for those who prefer it.

Python Versions & Official Support

  • Python 2.7 <-> python-slugify < 5.0.0
  • Python 3.6+ <-> python-slugify >= 5.0.0
  • Python 3.7+ <-> python-slugify >= 7.0.0

How to install

easy_install python-slugify |OR| easy_install python-slugify[unidecode]
-- OR --
pip install python-slugify |OR| pip install python-slugify[unidecode]

Options

def slugify(
    text,
    entities=True,
    decimal=True,
    hexadecimal=True,
    max_length=0,
    word_boundary=False,
    separator='-',
    save_order=False,
    stopwords=(),
    regex_pattern=None,
    lowercase=True,
    replacements=(),
    allow_unicode=False
  ):
  """
  Make a slug from the given text.
  :param text (str): initial text
  :param entities (bool): converts html entities to unicode (foo &amp; bar -> foo-bar)
  :param decimal (bool): converts html decimal to unicode (&#381; -> Ž -> z)
  :param hexadecimal (bool): converts html hexadecimal to unicode (&#x17D; -> Ž -> z)
  :param max_length (int): output string length
  :param word_boundary (bool): truncates to end of full words (length may be shorter than max_length)
  :param save_order (bool): if parameter is True and max_length > 0 return whole words in the initial order
  :param separator (str): separator between words
  :param stopwords (iterable): words to discount
  :param regex_pattern (str): regex pattern for disallowed characters
  :param lowercase (bool): activate case sensitivity by setting it to False
  :param replacements (iterable): list of replacement rules e.g. [['|', 'or'], ['%', 'percent']]
  :param allow_unicode (bool): allow unicode characters
  :return (str): slugify text
  """

How to use

from slugify import slugify

txt = "This is a test ---"
r = slugify(txt)
self.assertEqual(r, "this-is-a-test")

txt = '影師嗎'
r = slugify(txt)
self.assertEqual(r, "ying-shi-ma")

txt = '影師嗎'
r = slugify(txt, allow_unicode=True)
self.assertEqual(r, "影師嗎")

txt = 'C\'est déjà l\'été.'
r = slugify(txt)
self.assertEqual(r, "c-est-deja-l-ete")

txt = 'Nín hǎo. Wǒ shì zhōng guó rén'
r = slugify(txt)
self.assertEqual(r, "nin-hao-wo-shi-zhong-guo-ren")

txt = 'Компьютер'
r = slugify(txt)
self.assertEqual(r, "kompiuter")

txt = 'jaja---lol-méméméoo--a'
r = slugify(txt, max_length=9)
self.assertEqual(r, "jaja-lol")

txt = 'jaja---lol-méméméoo--a'
r = slugify(txt, max_length=15, word_boundary=True)
self.assertEqual(r, "jaja-lol-a")

txt = 'jaja---lol-méméméoo--a'
r = slugify(txt, max_length=20, word_boundary=True, separator=".")
self.assertEqual(r, "jaja.lol.mememeoo.a")

txt = 'one two three four five'
r = slugify(txt, max_length=13, word_boundary=True, save_order=True)
self.assertEqual(r, "one-two-three")

txt = 'the quick brown fox jumps over the lazy dog'
r = slugify(txt, stopwords=['the'])
self.assertEqual(r, 'quick-brown-fox-jumps-over-lazy-dog')

txt = 'the quick brown fox jumps over the lazy dog in a hurry'
r = slugify(txt, stopwords=['the', 'in', 'a', 'hurry'])
self.assertEqual(r, 'quick-brown-fox-jumps-over-lazy-dog')

txt = 'thIs Has a stopword Stopword'
r = slugify(txt, stopwords=['Stopword'], lowercase=False)
self.assertEqual(r, 'thIs-Has-a-stopword')

txt = "___This is a test___"
regex_pattern = r'[^-a-z0-9_]+'
r = slugify(txt, regex_pattern=regex_pattern)
self.assertEqual(r, "___this-is-a-test___")

txt = "___This is a test___"
regex_pattern = r'[^-a-z0-9_]+'
r = slugify(txt, separator='_', regex_pattern=regex_pattern)
self.assertNotEqual(r, "_this_is_a_test_")

txt = '10 | 20 %'
r = slugify(txt, replacements=[['|', 'or'], ['%', 'percent']])
self.assertEqual(r, "10-or-20-percent")

txt = 'ÜBER Über German Umlaut'
r = slugify(txt, replacements=[['Ü', 'UE'], ['ü', 'ue']])
self.assertEqual(r, "ueber-ueber-german-umlaut")

txt = 'i love 🦄'
r = slugify(txt, allow_unicode=True)
self.assertEqual(r, "i-love")

txt = 'i love 🦄'
r = slugify(txt, allow_unicode=True, regex_pattern=r'[^🦄]+')
self.assertEqual(r, "🦄")

For more examples, have a look at the test.py file.

Command Line Options

With the package, a command line tool called slugify is also installed.

It allows convenient command line access to all the features the slugify function supports. Call it with -h for help.

The command can take its input directly on the command line or from STDIN (when the --stdin flag is passed):

$ echo "Taking input from STDIN" | slugify --stdin
taking-input-from-stdin
$ slugify taking input from the command line
taking-input-from-the-command-line

Please note that when a multi-valued option such as --stopwords or --replacements is passed, you need to use -- as separator before you start with the input:

$ slugify --stopwords the in a hurry -- the quick brown fox jumps over the lazy dog in a hurry
quick-brown-fox-jumps-over-lazy-dog

Running the tests

To run the tests against the current environment:

python test.py

Contribution

Please read the (wiki) page prior to raising any PRs.

License

Released under a (MIT) license.

Version

X.Y.Z Version

`MAJOR` version -- when you make incompatible API changes,
`MINOR` version -- when you add functionality in a backwards-compatible manner, and
`PATCH` version -- when you make backwards-compatible bug fixes.

Sponsors

Neekware Inc.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

python-slugify-7.0.0.tar.gz (10.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

python_slugify-7.0.0-py2.py3-none-any.whl (9.4 kB view details)

Uploaded Python 2Python 3

File details

Details for the file python-slugify-7.0.0.tar.gz.

File metadata

  • Download URL: python-slugify-7.0.0.tar.gz
  • Upload date:
  • Size: 10.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/5.0.0 pkginfo/1.8.3 requests/2.28.1 requests-toolbelt/0.10.1 tqdm/4.64.1 CPython/3.10.8

File hashes

Hashes for python-slugify-7.0.0.tar.gz
Algorithm Hash digest
SHA256 7a0f21a39fa6c1c4bf2e5984c9b9ae944483fd10b54804cb0e23a3ccd4954f0b
MD5 bfcad17b81f3ddb1c0d54b0ae2f8f197
BLAKE2b-256 7cd38e523f41fc3f8d3cc58d2284bb49c1961caefc1a434599ff410ddde38f82

See more details on using hashes here.

File details

Details for the file python_slugify-7.0.0-py2.py3-none-any.whl.

File metadata

  • Download URL: python_slugify-7.0.0-py2.py3-none-any.whl
  • Upload date:
  • Size: 9.4 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/5.0.0 pkginfo/1.8.3 requests/2.28.1 requests-toolbelt/0.10.1 tqdm/4.64.1 CPython/3.10.8

File hashes

Hashes for python_slugify-7.0.0-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 003aee64f9fd955d111549f96c4b58a3f40b9319383c70fad6277a4974bbf570
MD5 803f2b2666f1f9b9cf8f609fa08d8f05
BLAKE2b-256 6365d0d7c085964fdf0cb294299663b407c38e2c8e8dd13bafcf5681798c12db

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page