The hyphenation library of OpenOffice and Firefox wrapped for Python

Project description

PyHyphen is a wrapper around the high-quality hyphenation library hyphen-2.4 (May 2008) that ships with OpenOffice and Mozilla products. Hence, all dictionaries compatible with OpenOffice can be used.

This distribution of PyHyphen runs on Windows and Linux with Python 2.4, 2.5 and 2.6. There is also a distribution for Python 3.0 (if it is not yet on PyPI, check out the Subversion repository).

The source distributions may include pre-compiled binary files of the extension module hnj containing the C library that does the ground work.

By default, the appropriate binary, if available, is installed instead of compiling the C sources. Currently, the Python 2.x distribution ships with binaries for Python 2.4 through 2.6.

PyHyphen also contains textwrap2, an enhanced though backwards-compatible version of the standard Python module textwrap. Not very surprisingly, textwrap2 can hyphenate words when wrapping them.

New in version 0.9:
  • removed the ‘inserted’ method from the hyphenator class as it is not used in practice
  • Added a binary for Python 2.6
  • in the Python 2.x version, the word to be hyphenated must now be unicode (utf-8 encoded strings raise TypeError). The restriction to unicode is safer and more 3.0-compliant. In the version for Python 3.0, word is of course a string.
  • fixed an important bug in the ‘pairs’ method that could cause a unicode error if ‘word’ was not encodable to the dictionary’s encoding. In that case, the new version returns an empty list (consistent with other cases where the word is not hyphenable).
  • the configuration script has been simplified and improved: it does not raise ImportError even if the package cannot be imported. This tolerance is needed to create a Debian package.
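
The input rules described in the list above can be illustrated with a minimal pure-Python sketch. It does not use PyHyphen: `safe_pairs`, its `lookup` argument and the default `iso8859-1` encoding are hypothetical names chosen for illustration, and the code is written for Python 3, where `str` is the text type.

```python
def safe_pairs(word, encoding="iso8859-1", lookup=None):
    """Hypothetical stand-in that mimics the input rules described for 'pairs'.

    - non-text input raises TypeError
    - text that cannot be encoded to the dictionary's encoding yields []
    """
    if not isinstance(word, str):  # Python 3: str is the text type
        raise TypeError("word must be text, not %s" % type(word).__name__)
    try:
        word.encode(encoding)
    except UnicodeEncodeError:
        # Same result as for any other word that cannot be hyphenated.
        return []
    # 'lookup' is a placeholder for the real dictionary lookup (assumption).
    return lookup(word) if lookup is not None else []


# Text that is not encodable to the assumed dictionary encoding gives [].
print(safe_pairs("\u4e2d\u6587"))
```

Returning an empty list instead of raising keeps callers on a single code path for every word that cannot be hyphenated.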

New in version 0.8:

  • upgraded to C library hyphen 2.4 (supports compound words and parameters for the minimum number of characters to be cut off by hyphenation). Note that this might require small code changes to existing applications.
  • an enhanced dictionary for en_US
  • support for Python 2.4-2.6 and 3.0
  • many small improvements under the hood
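
One way to picture the minimum-length parameters mentioned above is as a filter on candidate split points. The sketch below is an assumption about their effect, not the C library's algorithm: `filter_splits` is a hypothetical helper, and `lmin` and `rmin` are read here as the minimum number of characters that must remain before and after the hyphen.

```python
def filter_splits(pairs, lmin=2, rmin=2):
    """Keep only splits whose head has at least lmin characters
    and whose tail has at least rmin characters."""
    return [[head, tail] for head, tail in pairs
            if len(head) >= lmin and len(tail) >= rmin]


# Candidate splits for 'beautiful', taken from the code example in this document.
candidates = [["beau", "tiful"], ["beauti", "ful"]]

print(filter_splits(candidates, lmin=3, rmin=3))  # both splits survive
print(filter_splits(candidates, lmin=3, rmin=4))  # 'ful' is too short, one split remains
```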

Code example:

>>> from hyphen import hyphenator
>>> from hyphen.dictools import is_installed, install
>>> # Download and install some dictionaries in the default directory using the default
>>> # repository, usually the OpenOffice website
>>> for lang in ['de_DE', 'fr_FR', 'en_UK', 'hu_HU']:
...     if not is_installed(lang): install(lang)
...
>>> # Create some hyphenators
>>> h_de = hyphenator('de_DE')
>>> h_en = hyphenator(lmin = 3, rmin = 3) # the en_US dictionary is used by default!
>>> h_hu = hyphenator('hu_HU')
>>> # Now hyphenate some words. Note that under Python 3.0, words are of type string.
>>> print h_en.pairs(u'beautiful')
[[u'beau', u'tiful'], [u'beauti', u'ful']]
>>> print h_en.wrap(u'beautiful', 6)
[u'beau-', u'tiful']
>>> print h_en.wrap(u'beautiful', 7)
[u'beauti-', u'ful']
>>> from textwrap2 import fill
>>> print fill(u'very long text...', width = 40, use_hyphenator = h_en)
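
The relationship between the `pairs` and `wrap` results shown above can be sketched in plain Python. This is an assumption inferred from the printed results (the hyphen appears to count towards the width), not PyHyphen's implementation. `wrap_word` is a hypothetical helper that receives the candidate splits as an argument, and the code is written for Python 3.

```python
def wrap_word(pairs, width, hyphen="-"):
    """Return [head + hyphen, tail] for the longest head that fits in width.

    Returns [] when no candidate fits (an assumption made for this sketch).
    """
    fitting = [(head, tail) for head, tail in pairs
               if len(head) + len(hyphen) <= width]
    if not fitting:
        return []
    head, tail = max(fitting, key=lambda pair: len(pair[0]))
    return [head + hyphen, tail]


# Candidate splits as printed by h_en.pairs(u'beautiful') in the session above.
splits = [["beau", "tiful"], ["beauti", "ful"]]

print(wrap_word(splits, 6))  # ['beau-', 'tiful']
print(wrap_word(splits, 7))  # ['beauti-', 'ful']
```

Both calls reproduce the results in the session: with a width of 6, `beauti-` needs seven characters and does not fit, so the shorter head is chosen.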

PyHyphen’s Subversion repository is hosted at

Download files

Files for PyHyphen, version 0.9

  • Filename, size: PyHyphen-0.9.tar.gz (152.7 kB)
  • File type: Source
  • Python version: None