A simple but working Finnish language hyphenator.

Project description

By Pyry Kontio a.k.a Drasa (Drasa@IRCnet, pyry.kontio@drasa.eu)

A very simple hyphenator. It hyphenates Finnish text with Unicode soft hyphens (U+00AD). You can set a margin so that words won’t break right at the start or end. For example, it’d be a bit silly to break a word like ‘erikoinen’ at ‘e-rikoinen’. With the default margin of 1, it breaks like ‘eri-koinen’. If a word contains any of the taboo characters, it won’t be hyphenated.
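The margin and taboo-character rules can be sketched roughly as follows. This is not the module's actual code: `insert_soft_hyphens` is a hypothetical helper, the break positions are supplied by hand instead of being computed from Finnish syllable rules, and the exact margin rule (a break is kept only when more than `margin` characters remain on each side) is an assumption inferred from the ‘erikoinen’ example.

```python
SOFT_HYPHEN = "\u00ad"  # Unicode soft hyphen, U+00AD


def insert_soft_hyphens(word, positions, margin=1, taboo_chars=()):
    """Insert soft hyphens into `word` at the given break positions.

    Hypothetical helper for illustration only. A position i means
    "break between word[:i] and word[i:]".
    Assumption: a break is kept only if more than `margin` characters
    remain on both sides of it.
    """
    # A word containing any taboo character is returned untouched.
    if any(ch in word for ch in taboo_chars):
        return word

    # Drop breaks that fall inside the margin at either end.
    kept = [i for i in sorted(positions) if margin < i < len(word) - margin]

    # Cut the word at the surviving positions and join with soft hyphens.
    pieces = []
    start = 0
    for i in kept:
        pieces.append(word[start:i])
        start = i
    pieces.append(word[start:])
    return SOFT_HYPHEN.join(pieces)


# Hand-picked syllable boundaries for 'erikoinen' (e|ri|koi|nen).
hyphenated = insert_soft_hyphens("erikoinen", [1, 3, 6], margin=1)
print(hyphenated.replace(SOFT_HYPHEN, "-"))
```

Under these assumptions the break after the first ‘e’ is dropped, while a word such as ‘<html>’ is left alone when ‘<’ and ‘>’ are passed as taboo characters.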

Usage as a standalone script:

hyphenate_finnish.py 2 joo joo no testaillaan täs vaa

OR as a Python module:

from hyphenate_finnish import hyphenate
hyphenate("some text but <html> isn't gonna get hyphenated!", margin=1, taboo_chars=['<', '>'])

It’s that simple. By the way, it was written for Python 3, but it seems to work with Python 2.7 too.
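Soft hyphens are invisible in most terminals and editors, so hyphenated output can look unchanged at first glance. Below is a small sketch for inspecting such text. The helper names `reveal_soft_hyphens` and `strip_soft_hyphens` are made up for this example and are not part of the module, and the sample string is written by hand, not produced by the hyphenator.

```python
SOFT_HYPHEN = "\u00ad"  # Unicode soft hyphen, U+00AD


def reveal_soft_hyphens(text, marker="-"):
    """Return `text` with every soft hyphen replaced by a visible marker."""
    return text.replace(SOFT_HYPHEN, marker)


def strip_soft_hyphens(text):
    """Return `text` with all soft hyphens removed."""
    return text.replace(SOFT_HYPHEN, "")


# Hand-written sample standing in for hyphenated output.
sample = "eri" + SOFT_HYPHEN + "koinen"

print(reveal_soft_hyphens(sample))   # shows where a break may occur
print(strip_soft_hyphens(sample))    # recovers the plain word
print(sample.count(SOFT_HYPHEN))     # number of break opportunities
```

The same helpers can be applied to any string containing U+00AD, for instance before comparing hyphenated text with its original.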


Source Distribution

hyphenate_finnish-1.0.2.tar.gz (5.8 kB)


Hashes for hyphenate_finnish-1.0.2.tar.gz
Algorithm Hash digest
SHA256 76fe348dd89f61e6392be1f7758c4148cb84c4c8674c73bfa1385d0ccd524c69
MD5 12ea105d5d8ae911e589b30a1f882e9a
BLAKE2b-256 30a1151e72e9b2a6ce42576b66431cd2aafc1872fc4cd200e42eb1264bae1b1c
