Intelligent line breaking for Markdown and text files
Linebreaker
I created this tool because I couldn't find a reliable line-breaking utility that works with Quarto Markdown without altering headers, lists, or other formatting. This tool is conservative: it preserves your document structure and only adds a line break when both the preceding and following text segments are sufficiently long.
Features
Intelligent line breaking for Markdown and text files, with support for:
- Citations in the format `[@...]`
- Decimal numbers
- Common abbreviations (Dr., Prof., vs., et al., etc.)
- YAML headers
- Code blocks
- Soft breaks on conjunctions, commas, and the words `and` / `or`
Installation
Install from PyPI using pip or pixi:
pip install linebreaker
pixi add --pypi linebreaker
Or install from source:
git clone https://github.com/silas/linebreaker.git
cd linebreaker
pixi install
Usage
As a command-line tool:
# Process a single file
linebreaker your_file.md
# Process a directory
linebreaker writing/
# For compatibility, you can still use the old script
python -m linebreaker.cli your_file.md
⚠️ Important: Only use this tool on files that are tracked by a version control system like Git. Line breaking modifies your files, and having version control ensures you can review and revert changes if needed.
As a module:
from linebreaker import format_line, break_text
# Format a single line
result = format_line("Your text here...")
# Process entire text with YAML/code blocks
result = break_text(full_text)
Running Tests
# Run all tests
pytest linebreaker/tests/
Detailed Features
Hard Breaks (Sentence Boundaries)
- Splits on `.`, `?`, `!` when the text both before and after has 20+ characters
- Avoids common abbreviations: vs., Dr., Prof., Mr., Mrs., Ms., Ph.D., M.D., Jr., Sr., etc., e.g., i.e., et al., vol., no., pp., fig.
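The hard-break rule above can be sketched roughly as follows. This is a minimal illustration, not the code in core.py: the function name `hard_break`, the shortened abbreviation set, and the choice to measure "after" as the rest of the line are all assumptions made for the example.

```python
import re

# Hypothetical, shortened abbreviation set; the real tool keeps its own pattern.
# "al." stands in for the last word of "et al.".
ABBREVIATIONS = {"vs.", "dr.", "prof.", "mr.", "mrs.", "ms.", "e.g.", "i.e.", "al."}
MIN_SEGMENT = 20  # minimum characters required on each side of a break


def hard_break(line: str) -> list[str]:
    """Split a line at '.', '?' or '!' followed by whitespace (sketch only)."""
    segments = []
    start = 0
    for match in re.finditer(r"[.?!]\s+", line):
        end = match.start() + 1            # index just past the punctuation mark
        before = line[start:end]
        after = line[match.end():]         # assumption: "after" is the rest of the line
        # Skip the candidate if the word carrying the dot is a known abbreviation.
        if before.split()[-1].lower() in ABBREVIATIONS:
            continue
        # Break only when both sides are long enough.
        if len(before) >= MIN_SEGMENT and len(after) >= MIN_SEGMENT:
            segments.append(before)
            start = match.end()
    segments.append(line[start:])
    return segments
```

With this sketch, a line starting with "Dr." is not split after the title, and a short opener such as "Yes." stays attached to the sentence that follows it.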
Medium Breaks (Colons/Semicolons)
- Splits sentences longer than 80 characters at `:` or `;`
- Only if both parts have 20+ characters
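As a rough sketch of the medium-break rule, assuming the break is taken at the first colon or semicolon that leaves both parts long enough (the function name `medium_break` and that first-match choice are assumptions for illustration, not taken from the package):

```python
import re

MAX_SENTENCE = 80  # only sentences longer than this are candidates
MIN_PART = 20      # minimum characters required in each part


def medium_break(sentence: str) -> list[str]:
    """Split a long sentence once at ':' or ';' (sketch only)."""
    if len(sentence) <= MAX_SENTENCE:
        return [sentence]
    # Try each colon or semicolon followed by whitespace, left to right.
    for match in re.finditer(r"[:;]\s+", sentence):
        head = sentence[: match.start() + 1]  # keep the punctuation with the head
        tail = sentence[match.end():]
        if len(head) >= MIN_PART and len(tail) >= MIN_PART:
            return [head, tail]
    return [sentence]
```

Sentences at or under 80 characters are returned unchanged, which keeps short labels like "Note: ..." on one line.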
Soft Breaks (Conjunctions)
- Applied when there are 3+ sentences or a sentence is longer than 60 characters
- Breaks on: `but`, `such as`, `for example`, `e.g.`, `i.e.` (after 20 chars)
- Breaks on commas (after 40 chars)
- Breaks on `and`, `or` (after 40 chars)
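One way to picture the soft-break thresholds is the sketch below. It makes several assumptions that the description does not settle: "after N chars" is read as "the current segment already has at least N characters", the break goes before a conjunction and after a comma, all three rules are merged into a single left-to-right pass, and only the 60-character trigger is modelled (not the 3+ sentences case). The name `soft_break` is hypothetical.

```python
import re

MIN_SENTENCE = 60  # only sentences longer than this get soft breaks here

# Each entry: a regex matching the whitespace where a break may go, plus the
# minimum number of characters the current segment must already contain.
CANDIDATES = [
    (re.compile(r"\s+(?=(?:but|such as|for example)\b|e\.g\.|i\.e\.)"), 20),
    (re.compile(r"(?<=,)\s+"), 40),
    (re.compile(r"\s+(?=(?:and|or)\b)"), 40),
]


def soft_break(sentence: str) -> list[str]:
    """Split one sentence at soft break points (sketch only)."""
    if len(sentence) <= MIN_SENTENCE:
        return [sentence]
    # Collect every candidate gap with its threshold, in text order.
    gaps = sorted(
        (m.start(), m.end(), min_len)
        for pattern, min_len in CANDIDATES
        for m in pattern.finditer(sentence)
    )
    segments, start = [], 0
    for gap_start, gap_end, min_len in gaps:
        # A gap that lies before the current segment start yields a negative
        # length and is skipped, so duplicate candidates are harmless.
        if gap_start - start >= min_len:
            segments.append(sentence[start:gap_start])
            start = gap_end
    segments.append(sentence[start:])
    return segments
```

In this sketch the conjunction rule fires earlier than the comma and `and`/`or` rules because of its lower threshold, which matches the ordering of the bullets above.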
Smart Masking
- Citations `[@...]` are masked to prevent dots inside from triggering breaks
- Decimal numbers like `0.85` are masked similarly
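The masking idea can be illustrated with a small sketch: dots inside protected spans are swapped for a placeholder before sentence splitting and restored afterwards. The placeholder character and the names `mask` and `unmask` are assumptions for this example; the package may mask differently.

```python
import re

# Hypothetical placeholder: a private-use character unlikely to occur in prose.
DOT = "\uE000"

# Protected spans: citations like [@key, p. 4] and decimal numbers like 0.85.
PROTECTED = re.compile(r"\[@[^\]]*\]|\d+\.\d+")


def mask(text: str) -> str:
    """Replace dots inside citations and decimals with the placeholder."""
    return PROTECTED.sub(lambda m: m.group(0).replace(".", DOT), text)


def unmask(text: str) -> str:
    """Restore the original dots once breaking is done."""
    return text.replace(DOT, ".")
```

A typical flow would be `unmask(...)` applied to each segment produced after splitting `mask(text)`, so the break logic never sees the protected dots.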
Development
To add new abbreviations, edit the abbreviations pattern in core.py:
abbreviations = r'(?!vs\.|dr\.|prof\.|...|your_abbrev\.)'