Hyper fast hyphenation for Python
HyperHyphen
HyperHyphen is a Python package that provides hyper fast word hyphenation. It supports multiple languages and allows for easy integration into Python applications.
HyperHyphen is not a feature-complete implementation of the Hyphenator library. There are more full-featured libraries for hyphenation, such as PyHyphen or Pyphen.
This library only suggests hyphenation points; it does not handle word modification or other language-specific rules. Irregular hyphenation and word modification matter less when the line-breaking or wrapping algorithm can choose from many other hyphenation points instead. The library's hyphenation loop is written entirely in C so that plenty of such points can be suggested quickly.
Installation
You can install HyperHyphen using pip:
pip install hyperhyphen
Usage
Basic Usage
from hyperhyphen import Hyphenator
# Create a hyphenator for English (US)
h = Hyphenator(language="en_US")
# Hyphenate text (default "str" mode)
text = "reconciliation microprocessing"
result = h(text)
print(result)
# Output: ['recon', 'cil', 'i', 'a', 'tion', ' ', 'micro', 'pro', 'cess', 'ing']
Different Output Modes
HyperHyphen supports four different output modes:
String Mode ("str") - Default
Returns a list of hyphenated word parts and whitespace segments:
h = Hyphenator(mode="str", language="en_US")
result = h("The internationalization committee discussed 'telecommunications infrastructure modernization,' but extraordinary circumstances required unprecedented organizational transformations.")
print(result)
# Output: ['The', ' ', 'inter', 'na', 'tion', 'al', 'iza', 'tion', ' ', 'commit', 'tee', ' ', 'discussed', ' ', "'telecom", 'mu', 'ni', 'ca', 'tions', ' ', 'infra', 'struc', 'ture', ' ', 'modern', 'iza', "tion,'", ' ', 'but', ' ', 'extra', 'or', 'di', 'nary', ' ', 'circum', 'stances', ' ', 'required', ' ', 'unprece', 'dented', ' ', 'orga', 'ni', 'za', 'tional', ' ', 'trans', 'for', 'ma', 'tions.']
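The segment list can feed a line-wrapping routine directly. The following is a minimal sketch of a greedy wrapper, not part of HyperHyphen itself: `wrap_segments` is a hypothetical helper, and it assumes that adjacent non-whitespace segments belong to the same word (so a break between them needs a trailing hyphen). The input is the output shown under Basic Usage.

```python
def wrap_segments(segments, width):
    """Greedily wrap str-mode segments into lines of at most `width` columns.

    Hypothetical helper: whitespace segments separate words, and adjacent
    non-whitespace segments are treated as parts of one word.
    """
    lines = []
    line = ""
    for i, seg in enumerate(segments):
        if seg.isspace():
            continue  # a single space is re-inserted between words below
        continues_word = i > 0 and not segments[i - 1].isspace()
        word_goes_on = i + 1 < len(segments) and not segments[i + 1].isspace()
        joiner = "" if (continues_word or not line) else " "
        # Keep one column free for a hyphen if the word may be broken after this part.
        reserve = 1 if word_goes_on else 0
        if line and len(line) + len(joiner) + len(seg) + reserve > width:
            # Breaking inside a word needs a hyphen; breaking at a space does not.
            lines.append((line + "-") if continues_word else line)
            line = seg
        else:
            line += joiner + seg
    if line:
        lines.append(line)
    return lines


segments = ['recon', 'cil', 'i', 'a', 'tion', ' ', 'micro', 'pro', 'cess', 'ing']
wrapped = wrap_segments(segments, 12)
print("\n".join(wrapped))
# reconcilia-
# tion micro-
# processing
```

A greedy strategy is the simplest choice here; an optimizing line breaker could use the same segments as candidate break points.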
Raw Mode ("raw")
Returns hyphenated words with = separators, preserving original whitespace structure:
h = Hyphenator(mode="raw", language="en_US")
result = h("The internationalization committee discussed 'telecommunications infrastructure modernization,' but extraordinary circumstances required unprecedented organizational transformations.")
print(result)
# Output: "the\ninter=na=tion=al=iza=tion\ncommit=tee\ndiscussed\n'telecom=mu=ni=ca=tions\ninfra=struc=ture\nmodern=iza=tion,'\nbut\nextra=or=di=nary\ncircum=stances\nrequired\nunprece=dented\norga=ni=za=tional\ntrans=for=ma=tions."
Integer Mode ("int")
Returns segment lengths as integers (positive for words, negative for whitespace):
h = Hyphenator(mode="int", language="en_US")
result = h("The internationalization committee discussed 'telecommunications infrastructure modernization,' but extraordinary circumstances required unprecedented organizational transformations.")
print(result)
# Output: [3, -1, 5, 2, 4, 2, 3, 4, -1, 6, 3, -1, 9, -1, 8, 2, 2, 2, 5, -1, 5, 5, 4, -1, 6, 3, 6, -1, 3, -1, 5, 2, 2, 4, -1, 6, 7, -1, 8, -1, 7, 6, -1, 4, 2, 2, 6, -1, 5, 3, 2, 6]
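To turn integer-mode output back into text segments, the lengths can be applied as consecutive slices of the original string. This is a sketch under two assumptions: the lengths count characters (not bytes), and the shortened input below yields the same leading values as the longer example above. `split_by_lengths` is a hypothetical helper, not a HyperHyphen function.

```python
def split_by_lengths(text, lengths):
    """Slice `text` into consecutive segments using int-mode lengths.

    The sign only marks the segment type (negative = whitespace); the
    absolute value is the number of characters to consume.
    """
    segments = []
    pos = 0
    for n in lengths:
        size = abs(n)
        segments.append(text[pos:pos + size])
        pos += size
    return segments


text = "The internationalization committee"
# Leading values copied from the example output above.
lengths = [3, -1, 5, 2, 4, 2, 3, 4, -1, 6, 3]

segments = split_by_lengths(text, lengths)
print(segments)
# ['The', ' ', 'inter', 'na', 'tion', 'al', 'iza', 'tion', ' ', 'commit', 'tee']
```

Because the absolute values sum to the length of the input, joining the segments reproduces the original text.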
Spans Mode ("spans")
Returns (start, end) tuples for word segments only (excluding whitespace):
h = Hyphenator(mode="spans", language="en_US")
result = h("The internationalization committee discussed 'telecommunications infrastructure modernization,' but extraordinary circumstances required unprecedented organizational transformations.")
print(result)
# Output: [(0, 3), (4, 9), (9, 11), (11, 15), (15, 17), (17, 20), (20, 24), (25, 31), (31, 34), (35, 44), (45, 53), (53, 55), (55, 57), (57, 59), (59, 64), (65, 70), (70, 75), (75, 79), (80, 86), (86, 89), (89, 95), (96, 99), (100, 105), (105, 107), (107, 109), (109, 113), (114, 120), (120, 127), (128, 136), (137, 144), (144, 150), (151, 155), (155, 157), (157, 159), (159, 165), (166, 171), (171, 174), (174, 176), (176, 182)]
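Spans are convenient for annotating the original string without rebuilding it from pieces. As an illustration, the sketch below inserts a soft hyphen (U+00AD) wherever one span ends exactly where the next begins, which in the output above marks a hyphenation point inside a word. `insert_soft_hyphens` is a hypothetical helper, and the spans are the leading entries of the example output, assumed to apply unchanged to the shortened text.

```python
SOFT_HYPHEN = "\u00ad"  # invisible unless a renderer breaks the line there


def insert_soft_hyphens(text, spans):
    """Return `text` with a soft hyphen at every boundary between touching spans."""
    out = []
    pos = 0
    prev_end = None
    for start, end in spans:
        out.append(text[pos:start])  # copy any whitespace gap before this span
        if start == prev_end:
            out.append(SOFT_HYPHEN)  # touching spans: same word, mark the break
        out.append(text[start:end])
        pos = end
        prev_end = end
    out.append(text[pos:])  # trailing text after the last span, if any
    return "".join(out)


text = "The internationalization committee"
# Leading entries copied from the example output above.
spans = [(0, 3), (4, 9), (9, 11), (11, 15), (15, 17), (17, 20), (20, 24), (25, 31), (31, 34)]

hyphenated = insert_soft_hyphens(text, spans)
print(hyphenated.replace(SOFT_HYPHEN, "|"))  # make the break points visible
# The inter|na|tion|al|iza|tion commit|tee
```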
Language Support
You can specify different languages using language codes:
# German hyphenation
h_de = Hyphenator(language="de_DE")
# French hyphenation
h_fr = Hyphenator(language="fr_FR")
Requirements
- Python 3.9+
License
This project is licensed under the Apache 2.0 license.