Hyper fast hyphenation for Python

HyperHyphen

HyperHyphen is a Python package that provides hyper fast word hyphenation. It supports multiple languages and allows for easy integration into Python applications.

This is not a feature-complete implementation of the Hyphenator library. More complete hyphenation libraries exist, such as PyHyphen and Pyphen.

This library only suggests hyphenation points; it does not account for word modification or other language-specific rules. Irregular hyphenation and word modification matter less when the line-breaking or wrapping algorithm can choose from many other hyphenation points instead. The library's hyphenation loop is written entirely in C so that many such points can be suggested quickly.

Installation

You can install HyperHyphen using pip:

pip install hyperhyphen

Usage

Basic Usage

from hyperhyphen import Hyphenator

# Create a hyphenator for English (US)
h = Hyphenator(language="en_US")

# Hyphenate text (default "str" mode)
text = "reconciliation microprocessing"
result = h(text)
print(result)
# Output: ['recon', 'cil', 'i', 'a', 'tion', ' ', 'micro', 'pro', 'cess', 'ing']

Different Output Modes

HyperHyphen supports four different output modes:

String Mode ("str") - Default

Returns a list of hyphenated word parts and whitespace segments:

h = Hyphenator(mode="str", language="en_US")
result = h("The internationalization committee discussed 'telecommunications infrastructure modernization,' but extraordinary circumstances required unprecedented organizational transformations.")
print(result)
# Output: ['The', ' ', 'inter', 'na', 'tion', 'al', 'iza', 'tion', ' ', 'commit', 'tee', ' ', 'discussed', ' ', "'telecom", 'mu', 'ni', 'ca', 'tions', ' ', 'infra', 'struc', 'ture', ' ', 'modern', 'iza', "tion,'", ' ', 'but', ' ', 'extra', 'or', 'di', 'nary', ' ', 'circum', 'stances', ' ', 'required', ' ', 'unprece', 'dented', ' ', 'orga', 'ni', 'za', 'tional', ' ', 'trans', 'for', 'ma', 'tions.']
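
The flat list returned in "str" mode can be fed to a line-wrapping routine, which is the use case described in the introduction. The following is a minimal sketch of a greedy wrapper, not part of HyperHyphen: `wrap_segments` is a hypothetical helper, and it assumes the input has the same shape as the output above (word parts and whitespace strings in order). It collapses each whitespace run to a single space and lets a part that is wider than the line overflow.

```python
def wrap_segments(segments, width):
    """Greedily wrap a list of word parts and whitespace strings.

    Hypothetical helper: `segments` is assumed to look like the "str"
    mode output, e.g. ['The', ' ', 'inter', 'na', ...].
    """
    lines = []
    line = ""
    prev_was_part = False  # True when the previous segment was a word part
    for i, seg in enumerate(segments):
        if seg.isspace():
            # Whitespace is an ordinary break opportunity; it is re-added
            # as a single space only if the next part stays on this line.
            prev_was_part = False
            continue
        # If another part of the same word follows, reserve one column
        # so a hyphen still fits should the line break right after `seg`.
        more_parts = i + 1 < len(segments) and not segments[i + 1].isspace()
        needed = len(seg) + (1 if more_parts else 0)
        sep = "" if prev_was_part or not line else " "
        if line and len(line) + len(sep) + needed > width:
            # Breaking inside a word needs a visible hyphen.
            lines.append(line + "-" if prev_was_part else line)
            line = seg
        else:
            line += sep + seg
        prev_was_part = True
    if line:
        lines.append(line)
    return lines


segments = ['The', ' ', 'inter', 'na', 'tion', 'al', 'iza', 'tion',
            ' ', 'commit', 'tee']
for wrapped in wrap_segments(segments, 16):
    print(wrapped)
```

Because the wrapper only needs some break point near the end of each line, a missing or irregular hyphenation point costs little when many others are available.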

Raw Mode ("raw")

Returns a single string in which hyphenation points are marked with = separators. In the example output below, the words appear lowercased and separated by newlines:

h = Hyphenator(mode="raw", language="en_US")
result = h("The internationalization committee discussed 'telecommunications infrastructure modernization,' but extraordinary circumstances required unprecedented organizational transformations.")
print(result)
# Output: "the\\ninter=na=tion=al=iza=tion\\ncommit=tee\\ndiscussed\\n\'telecom=mu=ni=ca=tions\\ninfra=struc=ture\\nmodern=iza=tion,\'\\nbut\\nextra=or=di=nary\\ncircum=stances\\nrequired\\nunprece=dented\\norga=ni=za=tional\\ntrans=for=ma=tions."

Integer Mode ("int")

Returns segment lengths as integers (positive for words, negative for whitespace):

h = Hyphenator(mode="int", language="en_US")
result = h("The internationalization committee discussed 'telecommunications infrastructure modernization,' but extraordinary circumstances required unprecedented organizational transformations.")
print(result)
# Output: [3, -1, 5, 2, 4, 2, 3, 4, -1, 6, 3, -1, 9, -1, 8, 2, 2, 2, 5, -1, 5, 5, 4, -1, 6, 3, 6, -1, 3, -1, 5, 2, 2, 4, -1, 6, 7, -1, 8, -1, 7, 6, -1, 4, 2, 2, 6, -1, 5, 3, 2, 6]
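
The signed lengths carry enough information to recover both the text segments and the word spans. The sketch below is an illustration rather than library code: `lengths_to_segments` and `lengths_to_spans` are hypothetical helpers, and they assume each negative value is the length of a whitespace run, as described above.

```python
def lengths_to_segments(text, lengths):
    """Slice `text` into consecutive segments using signed lengths."""
    segments = []
    pos = 0
    for n in lengths:
        size = abs(n)  # negative values mark whitespace runs
        segments.append(text[pos:pos + size])
        pos += size
    return segments


def lengths_to_spans(lengths):
    """Return (start, end) offsets for word parts, skipping whitespace."""
    spans = []
    pos = 0
    for n in lengths:
        if n > 0:
            spans.append((pos, pos + n))
        pos += abs(n)
    return spans


# Lengths for "The" and "committee", taken from the output above.
text = "The committee"
lengths = [3, -1, 6, 3]
print(lengths_to_segments(text, lengths))  # ['The', ' ', 'commit', 'tee']
print(lengths_to_spans(lengths))           # [(0, 3), (4, 10), (10, 13)]
```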

Spans Mode ("spans")

Returns (start, end) tuples for word segments only (excluding whitespace):

h = Hyphenator(mode="spans", language="en_US")
result = h("The internationalization committee discussed 'telecommunications infrastructure modernization,' but extraordinary circumstances required unprecedented organizational transformations.")
print(result)
# Output: [(0, 3), (4, 9), (9, 11), (11, 15), (15, 17), (17, 20), (20, 24), (25, 31), (31, 34), (35, 44), (45, 53), (53, 55), (55, 57), (57, 59), (59, 64), (65, 70), (70, 75), (75, 79), (80, 86), (86, 89), (89, 95), (96, 99), (100, 105), (105, 107), (107, 109), (109, 113), (114, 120), (120, 127), (128, 136), (137, 144), (144, 150), (151, 155), (155, 157), (157, 159), (159, 165), (166, 171), (171, 174), (174, 176), (176, 182)]
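
Two spans that touch (the end of one equals the start of the next) mark a hyphenation point inside a word. As a hedged sketch, the hypothetical helper `insert_soft_hyphens` below uses that property to insert a marker, by default the Unicode soft hyphen U+00AD, at every such point. It assumes the spans are sorted and non-overlapping, as in the output above.

```python
SOFT_HYPHEN = "\u00ad"


def insert_soft_hyphens(text, spans, mark=SOFT_HYPHEN):
    """Insert `mark` wherever two consecutive spans touch."""
    pieces = []
    cursor = 0       # how much of `text` has been copied so far
    prev_end = None  # end offset of the previous span, if any
    for start, end in spans:
        pieces.append(text[cursor:start])  # whitespace between words
        if prev_end == start:
            pieces.append(mark)            # break point inside a word
        pieces.append(text[start:end])
        cursor = prev_end = end
    pieces.append(text[cursor:])           # any trailing text
    return "".join(pieces)


# Spans for the first two words, taken from the output above.
text = "The internationalization"
spans = [(0, 3), (4, 9), (9, 11), (11, 15), (15, 17), (17, 20), (20, 24)]
print(insert_soft_hyphens(text, spans, mark="-"))
# The inter-na-tion-al-iza-tion
```

With the default marker the result looks unchanged in most renderers, which show a soft hyphen only where a line actually breaks.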

Language Support

You can specify different languages using language codes:

# German hyphenation
h_de = Hyphenator(language="de_DE")

# French hyphenation  
h_fr = Hyphenator(language="fr_FR")

Requirements

  • Python 3.9+

License

This project is licensed under the Apache 2.0 license.
