Flexible library to convert text with custom markup to HTML (or anything else).
styled-text (Python version)
The Python version of the styled-text library. Designed for custom markup transformations.
This library is for anyone who wants to create styled text like markdown, but with total flexibility to create their own rules.
Installation
pip install styled-text
Usage
import re
from text_styler import TextStyler, TextStylerRegexRule, TextStylerRule, html_tag
# Let's style this text:
text = "_Welcome_ to _<~my library~>*styled-text*_ version 0.0.1"
# Create the rules (only need to do this once)
style_rules = [
    TextStylerRule(start="*", transform=html_tag("strong")),
    TextStylerRule(start="_", transform=html_tag("em")),
    TextStylerRule(start="<~", transform=html_tag("del"), end="~>"),
    TextStylerRegexRule(
        regex=re.compile(r"(\d+\.\d+\.\d+)"),
        replace=r"<span style='color: red'>\1</span>",
    ),
]
# Create the styler:
styler = TextStyler(style_rules)
# Process text
html = styler.process_text(text)
# `html` looks like this now:
# <em>Welcome</em> to <em><del>my library</del><strong>styled-text</strong></em> version <span style='color: red'>0.0.1</span>
Examples
Simple bold
TextStylerRule(
    start='*',
    transform=html_tag("strong")
)
Input: My *bolded* text
Output (raw): My <strong>bolded</strong> text
Output (visual): My bolded text
Nested bold/italic
TextStylerRule(
    start='*',
    transform=html_tag("strong")
),
TextStylerRule(
    start='_',
    transform=html_tag("em")
)
Input: My *bolded and _italicized_ text*
Output (raw): My <strong>bolded and <em>italicized</em> text</strong>
Output (visual): My bolded and italicized text
Input: Three *asterisks* matches* eagerly
Output (raw): Three <strong>asterisks</strong> matches* eagerly
Output (visual): Three asterisks matches* eagerly
Input: Overlapping * tags _ also * matches _ eagerly
Output (raw): Overlapping <strong> tags _ also </strong> matches _ eagerly
Output (visual): Overlapping tags _ also matches _ eagerly
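The eager matching shown above can be approximated with the standard library alone. The following is a minimal sketch, not the library's implementation: it assumes a single `*` rule and uses a non-greedy regex so that each opening marker pairs with the first closing marker that follows it. The names `BOLD` and `bold_eagerly` are hypothetical.

```python
import re

# Non-greedy group: each "*" pairs with the nearest following "*".
BOLD = re.compile(r"\*(.*?)\*")


def bold_eagerly(text: str) -> str:
    """Wrap each eagerly matched *...* pair in <strong> tags (illustration only)."""
    return BOLD.sub(r"<strong>\1</strong>", text)


print(bold_eagerly("Three *asterisks* matches* eagerly"))
# Three <strong>asterisks</strong> matches* eagerly
print(bold_eagerly("Overlapping * tags _ also * matches _ eagerly"))
# Overlapping <strong> tags _ also </strong> matches _ eagerly
```

The unmatched third asterisk is left in place because no closing marker follows it, which mirrors the outputs listed above.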
Nested / Conflicting Tags
Here we show two things:
- start can be multiple characters (~~ for strikethrough)
- one rule can be a subset of another, and it still works as expected (~ for subscript)
TextStylerRule(
    start="~",
    transform=html_tag("sub")
),
TextStylerRule(
    start="~~",
    transform=html_tag("del")
)
Input: H~~~3~~2~O
Output (raw): H<sub><del>3</del>2</sub>O
Output (visual): H32O
Input: A ~~~[sic]~tyop~~ typo is...
Output (raw): A <del><sub>[sic]</sub>tyop</del> typo is...
Output (visual): A [sic]tyop typo is...
Regexes
Regexes are the best way to build a complex replacement strategy, for instance when you need to parse the inner text into pieces or use the inner text multiple times. In this example, the matched URL is used both as the href attribute and as the link text:
TextStylerRegexRule(
    regex=re.compile(r"https://www.[^\.]+.com"),
    replace=r"<a href='\g<0>'>\g<0></a>"
)
Input: My link https://www.google.com
Output (raw): My link <a href='https://www.google.com'>https://www.google.com</a>
Output (visual): My link https://www.google.com
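The `\g<0>` reference in a replacement string stands for the whole match, which is what lets the URL appear twice in the output. The sketch below shows this with plain `re.sub` from the standard library. It does not use styled-text itself and assumes only that the replacement string follows Python's usual `re` template syntax. The pattern is a slightly stricter variant of the one above, with the dots escaped.

```python
import re

# Escaped dots so "." only matches a literal period.
url_pattern = re.compile(r"https://www\.[^.]+\.com")

# \g<0> expands to the entire matched text, so it can be reused freely.
replacement = r"<a href='\g<0>'>\g<0></a>"

linked = url_pattern.sub(replacement, "My link https://www.google.com")
print(linked)
# My link <a href='https://www.google.com'>https://www.google.com</a>
```

Note the single backslash inside the raw string: `r"\\g<0>"` would instead produce a literal backslash followed by the text `g<0>`.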
However, regex matches are treated like literal strings, meaning that no other rule is applied to the text inside them.
For example, even if we included the asterisk-to-<strong> rule used earlier, it would not be applied inside the regex match:
Input: My link https://www.*google*.com
Output (raw): My link <a href='https://www.*google*.com'>https://www.*google*.com</a>
Output (visual): My link https://www.*google*.com
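One way to picture this behaviour is to split the text on the regex first and apply the other rules only to the pieces in between. The following is a minimal sketch under that assumption, built on the standard library only; it is not how styled-text is necessarily implemented, and `URL`, `BOLD`, and `style` are hypothetical names.

```python
import re

# The capture group makes re.split keep the matched URLs in the result list.
URL = re.compile(r"(https://www\.[^.]+\.com)")
BOLD = re.compile(r"\*(.*?)\*")


def style(text: str) -> str:
    """Link URLs verbatim and apply the bold rule only outside of them."""
    pieces = URL.split(text)
    out = []
    for index, piece in enumerate(pieces):
        if index % 2 == 1:
            # Odd positions hold the captured URL: emit it untouched by other rules.
            out.append(f"<a href='{piece}'>{piece}</a>")
        else:
            # Even positions hold ordinary text: other rules still apply here.
            out.append(BOLD.sub(r"<strong>\1</strong>", piece))
    return "".join(out)


print(style("My *bold* link https://www.*google*.com"))
# My <strong>bold</strong> link <a href='https://www.*google*.com'>https://www.*google*.com</a>
```

The asterisks inside the URL survive unchanged, while the ones outside it are still converted.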
Preserving the special characters
By default, the special characters are removed from the output, but they can be preserved on the inside or on the outside:
TextStylerRule(
    start='*',
    transform=html_tag("strong"),
    consume_start=ConsumptionStyle.OUTSIDE,
    consume_end=ConsumptionStyle.OUTSIDE,
),
TextStylerRule(
    start='_',
    transform=html_tag("em"),
    consume_start=ConsumptionStyle.INSIDE,
    consume_end=ConsumptionStyle.INSIDE,
)
Input: My *bolded* text, my _italicized_ text
Output (raw): My <strong>*bolded*</strong> text, my _<em>italicized</em>_ text
Output (visual): My *bolded* text, my _italicized_ text
Disallowing self-nesting
By default, a rule nesting within itself is allowed, but this can be disabled in two ways:
- Completely disallowed, at any depth
- A direct parent-child is disallowed, but grandparent-grandchild (or more distant) is allowed
TextStylerRule(
    start='*',
    transform=html_tag("strong"),
    allow_inner=InnerStyle.DISALLOW_DIRECT,
),
TextStylerRule(
    start='^',
    transform=html_tag("sup"),
    allow_inner=InnerStyle.DISALLOW_ANCESTOR,
),
TextStylerRule(
    start='~',
    transform=html_tag("sub"),
    allow_inner=InnerStyle.DISALLOW_DIRECT,
)
Input: Subscript ~cannot exist ~directly~ within subscript, but *can exist ~within~ the bolded* region~
Output (raw): Subscript <sub>cannot exist ~directly~ within subscript, but <strong>can exist <sub>within</sub> the bolded</strong> region</sub>
Output (visual): Subscript cannot exist ~directly~ within subscript, but can exist within the bolded region
Input: Superscript ^of multiple depths is ^disallowed^, *even if we ^wrap^ it in a bolded* region^
Output (raw): Superscript <sup>of multiple depths is ^disallowed^, <strong>even if we ^wrap^ it in a bolded</strong> region</sup>
Output (visual): Superscript of multiple depths is ^disallowed^, even if we ^wrap^ it in a bolded region
Reference
To use the library, just set up a list of "rules", create a TextStyler object, then call process_text.
| Class / Function | Parameter | Type | Default | Description |
|---|---|---|---|---|
| TextStyler | rules | list | Required | A list of TextStylerRule or TextStylerRegexRule objects. |
| TextStylerRegexRule | regex | str | Required | The regular expression pattern to match. |
| | replace | str | Required | The replacement string (supports regex capture groups like \1). |
| TextStylerRule | start | str | Required | The marker string that begins the rule. |
| | transform | Callable[[str], str] | Required | Function to process inner content (e.g., html_tag). |
| | end | str | start | The marker string that terminates the rule. |
| | consume_start | ConsumptionType | REPLACE | Determines if start is included in output (INSIDE, OUTSIDE, REPLACE). |
| | consume_end | ConsumptionType | REPLACE | Determines if end is included in output (INSIDE, OUTSIDE, REPLACE). |
| | allow_inner | InnerStyle | ALLOW | Determines if self-nesting is allowed (ALLOW, DISALLOW_DIRECT, DISALLOW_ANCESTOR). |
| html_tag | name | str | Required | The HTML tag name (e.g., "strong"). |
| | attrs | dict | {} | Optional HTML attributes (e.g., {"class": "my-css-class"}). |
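Since transform is described as a callable that takes the inner text and returns the processed text, a custom transform can be sketched as a small factory function. The example below is a hypothetical stand-in written for illustration, not the library's own html_tag: the name `make_html_tag` and its attribute handling are assumptions based only on the parameter descriptions in the table.

```python
from html import escape
from typing import Callable, Mapping, Optional


def make_html_tag(
    name: str, attrs: Optional[Mapping[str, str]] = None
) -> Callable[[str], str]:
    """Return a transform that wraps its input in <name ...>...</name>."""
    # Render attributes once, escaping values so quotes cannot break the tag.
    attr_text = "".join(
        f' {key}="{escape(str(value), quote=True)}"'
        for key, value in (attrs or {}).items()
    )

    def transform(inner: str) -> str:
        return f"<{name}{attr_text}>{inner}</{name}>"

    return transform


bold = make_html_tag("strong")
print(bold("bolded"))
# <strong>bolded</strong>

classy = make_html_tag("span", {"class": "my-css-class"})
print(classy("text"))
# <span class="my-css-class">text</span>
```

Any function with the same shape, such as one that emits LaTeX or ANSI escape codes instead of HTML, should fit the same slot, assuming the rule only requires a string-to-string callable.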