A slug generator that turns strings into Unicode slugs and can optionally replace common Latin letters with their ASCII equivalents.

[![Build Status](https://travis-ci.org/eminbugrasaral/unicode-slugify-latin.svg?branch=master)](https://travis-ci.org/eminbugrasaral/unicode-slugify-latin)

Links

PyPI: https://pypi.python.org/pypi/unicode-slugify-latin

GitHub: https://github.com/eminbugrasaral/unicode-slugify-latin

# Unicode Slugify (with Latin Hack)

Unicode Slugify is a slugifier that generates Unicode slugs. It was originally used on the Firefox Add-ons website to generate slugs for add-ons and add-on collections. Many of these add-ons and collections had Unicode characters and required more than simple transliteration.

## Install

pip install unicode-slugify-latin

## Usage

>>> import slugify
>>> slugify.slugify(u'Bän...g (bang)')
u'bäng-bang'

## Latin Hack

  • Replaces special Latin characters with similar ASCII representations.

  • Problem: I want users who speak Latin languages with English keyboards to be able to search through my Latin strings.

  • Solution: Slugify that Latin string by enabling Latin replacement, and match this string with the slugified search word.

  • Example: Store “Sabancı Üniversitesi” as “sabanci-universitesi”; users can then search with any variant, such as “Sabanci”, “Sabancı”, or “SABANCI”.

  • Note: Remember to slugify both the stored string and the search term with replace_latin=True.
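The search workflow above can be sketched without the library. This is a minimal stand-in under stated assumptions, not this package's implementation: `LATIN_MAP`, `fold_latin`, `index`, and `search` are hypothetical names, and the table covers only a few of the letters this README lists. In real use, the folding step would be a call to slugify with replace_latin=True on both sides.

```python
import re

# Hypothetical, partial table: only a few of the letters this README lists.
LATIN_MAP = str.maketrans({
    "ı": "i", "İ": "I", "ü": "u", "Ü": "U",
    "ö": "o", "Ö": "O", "ş": "s", "Ş": "S",
    "ç": "c", "Ç": "C", "ğ": "g", "Ğ": "G",
})

def fold_latin(text):
    """Replace mapped letters, lowercase, and join words with hyphens."""
    # Replace before lowercasing: in Python, "İ".lower() yields "i" plus a
    # combining dot (U+0307), which would no longer match the table.
    replaced = text.translate(LATIN_MAP)
    return "-".join(re.findall(r"\w+", replaced.lower()))

# Index stored names by their folded form.
names = ["Sabancı Üniversitesi", "Boğaziçi Üniversitesi"]
index = {fold_latin(name): name for name in names}

def search(query):
    """Return stored names whose folded form contains the folded query."""
    needle = fold_latin(query)
    return [name for key, name in index.items() if needle in key]
```

Because the stored value and the query pass through the same folding function, `search("Sabanci")`, `search("Sabancı")`, and `search("SABANCI")` all find the same entry.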

## Example

>>> from slugify import slugify
>>> string_without_latin_letters = slugify(u'ıspanaklı boğaz turşusu', replace_latin=True)
>>> string_without_latin_letters
u'ispanakli-bogaz-tursusu'
>>> slugify(u'Ispanakli Bogaz Tursusu') == string_without_latin_letters
True
>>> u'Bogazici'.lower() in slugify(u'boğaziçi', replace_latin=True)
True
>>> slugify(u'çiçek', replace_latin=True) in slugify(u'ÇİÇEK', replace_latin=True)
True
>>> u'cicek' in slugify(u'ÇİÇEK', replace_latin=True)
True

## List of common Latin letters to be replaced

  • ı, ì, í, î, ï -> i

  • İ, Ì, Í, Î, Ï -> I

  • ö, ó, ò, ô, õ, ø -> o

  • Ö, Ò, Ó, Ô, Õ, Ø -> O

  • ü, ù, ú, û -> u

  • Ü, Ù, Ú, Û -> U

  • à, á, â, ã, ä, å -> a

  • À, Á, Â, Ã, Ä, Å -> A

  • æ -> ae

  • Æ -> AE

  • è, é, ê, ë -> e

  • È, É, Ê, Ë -> E

  • ñ -> n

  • Ñ -> N

  • ý, ÿ -> y

  • Ý, Ÿ -> Y

  • ş -> s

  • Ş -> S

  • ç -> c

  • Ç -> C

  • ğ -> g

  • Ğ -> G
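The list above can be expressed as a lookup table. The sketch below is an illustration only and is not taken from the package source: `GROUPS`, `TABLE`, and `replace_letters` are hypothetical names. It relies on `str.translate`, which accepts multi-character replacement values, so æ -> ae needs no special handling.

```python
# Groups copied from the list above: (letters, replacement).
GROUPS = [
    ("ıìíîï", "i"), ("İÌÍÎÏ", "I"),
    ("öóòôõø", "o"), ("ÖÒÓÔÕØ", "O"),
    ("üùúû", "u"), ("ÜÙÚÛ", "U"),
    ("àáâãäå", "a"), ("ÀÁÂÃÄÅ", "A"),
    ("æ", "ae"), ("Æ", "AE"),
    ("èéêë", "e"), ("ÈÉÊË", "E"),
    ("ñ", "n"), ("Ñ", "N"),
    ("ýÿ", "y"), ("ÝŸ", "Y"),
    ("ş", "s"), ("Ş", "S"),
    ("ç", "c"), ("Ç", "C"),
    ("ğ", "g"), ("Ğ", "G"),
]

# Flatten into {code point: replacement}, the form str.translate expects.
TABLE = {ord(ch): repl for letters, repl in GROUPS for ch in letters}

def replace_letters(text):
    """Replace every listed letter; all other characters pass through."""
    return text.translate(TABLE)
```

Note that this step only substitutes letters. Lowercasing and hyphenation are separate steps of slugification, and letters outside the list (for example ß) are left unchanged.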

## New parameters after this fork

  • replace_latin: Replaces the common Latin letters listed above with similar ASCII representations.

  • unicode_pairs: A dictionary that maps Unicode characters to their replacement values. For example, {u'\xe9': 'e'} replaces é with e.
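To make the dictionary format concrete, here is one plausible way such pairs could be applied to a string. It is a sketch under assumptions, not the library's code: `apply_pairs` is a hypothetical helper, and the ß -> ss pair is an invented example of a caller-supplied replacement.

```python
def apply_pairs(text, pairs):
    """Apply each character -> replacement pair in turn (hypothetical helper)."""
    for char, replacement in pairs.items():
        text = text.replace(char, replacement)
    return text

# Keys are single Unicode characters; "\xe9" is the escape for é.
pairs = {u"\xe9": "e", u"\xdf": "ss"}
result = apply_pairs(u"caf\xe9 stra\xdfe", pairs)
```

Writing keys as escapes such as `\xe9` keeps the source file ASCII-only; a literal é in a UTF-8 source file is equivalent.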

## Sponsors

## Contact
