Convert Markdown into plain text and Telegram message entities.

telegram-markdown-entities

Stop fighting Telegram’s Markdown/HTML parser.

Ship text + entities for zero-escape, zero-surprise, exact rendering—where unsupported bits safely fall back to plain text.

If you’re tired of Telegram MarkdownV2/HTML parse errors such as:

  • “Can’t parse entities” from MarkdownV2: special characters left unescaped (_ * [ ] ( ) ~ ` > # + - = | { } . !), or escaped in the wrong context (e.g., inside code).
  • Unbalanced delimiters: missing or misplaced *, _, `, ~~, ||, or code fences (```).
  • Illegal nesting/overlap: e.g., bold and italic spans whose delimiters interleave, or styles placed inside code/pre.
  • Broken links: URLs with spaces or parentheses that are not URL-encoded, or unmatched brackets.
  • HTML tag issues: unknown or disallowed tags/attributes, mis-nested tags, unclosed tags.
  • Double parsing: sending both parse_mode and manual entities leads to surprises; Telegram’s parser still interferes.
  • Edge text: underscores in words/URLs, and emoji/ZWJ sequences that shift where the parser thinks boundaries are.
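
For comparison, here is a rough sketch of the escaping burden that parse_mode=MarkdownV2 places on the caller. The helper name escape_markdown_v2 is hypothetical and not part of this library, and it only covers plain text outside code and link contexts, which follow their own escaping rules.

```python
import re

# The MarkdownV2 special characters listed above, plus the backslash itself
# (a literal backslash must also be escaped to survive parsing).
_MDV2_SPECIALS = re.compile(r"([\\_*\[\]()~`>#+\-=|{}.!])")


def escape_markdown_v2(text: str) -> str:
    """Prefix every MarkdownV2 special character with a backslash.

    Hypothetical helper for illustration; not part of telegram-markdown-entities.
    """
    return _MDV2_SPECIALS.sub(r"\\\1", text)


escaped = escape_markdown_v2("1 + 1 = 2.")
```

Every plain-text fragment would need this treatment before being sent with parse_mode; sending text plus entities avoids the step entirely.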

The new paradigm: entities-only (no parse_mode)

We don’t ask Telegram to parse. We send text + entities, so formatting is explicit and deterministic.

  • No escaping ever. Special characters stay as-is; styles are applied by offsets/lengths, not by punctuation.
  • UTF-16–correct offsets. Emoji, non-BMP symbols, ZWJ sequences—handled; entity bounds stay valid.
  • No illegal overlaps. code/pre are atomic; we prevent forbidden nests before sending.
  • Graceful fallback. Anything we don’t support is left as plain text—safe, readable, no runtime errors.
  • Future-proof. Parser changes on Telegram’s side don’t break you; your rendering remains stable.
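
As a minimal sketch of what "offsets/lengths, not punctuation" means, the snippet below builds one entity by hand. The helpers utf16_len and make_entity are hypothetical names used only for illustration; they are not this library's API. The dictionary keys type, offset and length follow the entity format shown later in this README.

```python
def utf16_len(s: str) -> int:
    """Count UTF-16 code units in s (the unit Telegram uses for entities)."""
    return len(s.encode("utf-16-le")) // 2


def make_entity(text: str, fragment: str, entity_type: str) -> dict:
    """Build an entity covering the first occurrence of fragment in text."""
    start = text.index(fragment)  # raises ValueError if fragment is absent
    return {
        "type": entity_type,
        "offset": utf16_len(text[:start]),  # units before the fragment
        "length": utf16_len(fragment),
    }


# Special characters stay as-is; the emoji counts as two UTF-16 code units.
text = "🎉 Price: 1+1=2 (really!)"
entity = make_entity(text, "1+1=2", "bold")
# entity == {"type": "bold", "offset": 10, "length": 5}
```

Note that the Python string index of "1+1=2" is 9, while the UTF-16 offset is 10, because the emoji lies outside the Basic Multilingual Plane.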

Convert Markdown (or any other) text into plain, valid Telegram messages with ease.

This library takes a string written in standard Markdown (such as the output of a language model or contents of a README) and returns two objects:

  1. Plain text with all Markdown delimiters removed.
  2. A list of message entity dictionaries that tell the Telegram Bot API how to format the text (bold, italic, links, lists, block quotes, etc.).

By sending the text together with the entities array (and not specifying a parse_mode) you avoid the pitfalls of Telegram’s own Markdown parser – there’s no need to escape special characters, and your messages render exactly as intended.

Features

  • Inline formatting: supports bold (**text**), italic (*text* or _text_), underline (__text__), strikethrough (~~text~~), spoilers (||text||), inline code (`code`), code blocks (```lang\ncode```), and links ([label](url)).
  • Headings: lines starting with # are converted to bold text.
  • Block quotes: lines beginning with > produce a blockquote entity; prefixing the quote with || (e.g. >|| quote) marks it as collapsed/expandable. This maps to the collapsed flag on Telegram’s messageEntityBlockquote type.
  • Lists: unordered lists use Unicode bullets that vary with nesting depth and are indented with non‑breaking spaces; ordered lists align numbers using figure spaces and support nested numbering.
  • Nested formatting: bold inside italics, links inside quotes and other combinations all work as expected.
  • UTF‑16 offsets: entity offsets and lengths are calculated according to the UTF‑16 code unit rules used by Telegram.
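
To check entities before sending, it can help to look at the text each one covers. The following is a sketch under the assumption that entities are dictionaries with offset and length keys measured in UTF-16 code units; entity_text is a hypothetical helper, not a function exported by this library.

```python
def entity_text(text: str, entity: dict) -> str:
    """Return the substring an entity covers, using UTF-16 offsets.

    Raises ValueError if the entity falls outside the text or splits a
    surrogate pair (UnicodeDecodeError is a ValueError subclass).
    """
    raw = text.encode("utf-16-le")
    start = entity["offset"] * 2  # two bytes per UTF-16 code unit
    end = start + entity["length"] * 2
    if entity["offset"] < 0 or entity["length"] <= 0 or end > len(raw):
        raise ValueError(f"entity out of bounds: {entity}")
    return raw[start:end].decode("utf-16-le")


# The thumbs-up emoji occupies two code units, so "ok" starts at offset 3.
covered = entity_text("👍 ok", {"type": "bold", "offset": 3, "length": 2})
```

Slicing the Python string directly with these numbers would be off by one here, which is why the helper works on the UTF-16 encoding instead.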

Installation

Install the package from PyPI:

pip install telegram-markdown-entities

Requires Python 3.7 or newer. There are no external dependencies.

Usage

Here’s a minimal example of how to use the library with the Bot API:

from telegram_markdown import parse_markdown_to_entities
import requests

md = """
# Heading Example

>|| This is a collapsed quote\n> It continues here.

* Item 1
    * Nested item
1. First
2. Second\n   continuation

Inline example: **bold**, _italic_, [link](https://example.com) and `code`.
"""

text, entities = parse_markdown_to_entities(md)

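# Send via the HTTP Bot API (replace TOKEN and CHAT_ID with your own values)
TOKEN = '123456:ABC-DEF'  # placeholder bot token
CHAT_ID = 123456789       # placeholder chat ID
payload = {
    'chat_id': CHAT_ID,
    'text': text,
    'entities': entities,  # note: no parse_mode key
}
response = requests.post(
    f'https://api.telegram.org/bot{TOKEN}/sendMessage',
    json=payload,
    timeout=10,
)
response.raise_for_status()  # surface HTTP errors instead of ignoring them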

The text variable will contain the plain message (with list markers and quote markers removed), and entities will be a list of dictionaries like {'type': 'bold', 'offset': 0, 'length': 6}. Pass these directly to sendMessage. There is no need to set the parse_mode parameter.
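
If you would rather not add requests as a dependency, the same call can be sketched with the standard library alone. The function name build_send_message_request is hypothetical, and the token, chat ID and entity values are placeholders; the snippet only prepares the request so it can be inspected before anything is sent.

```python
import json
import urllib.request


def build_send_message_request(
    token: str, chat_id: int, text: str, entities: list
) -> urllib.request.Request:
    """Prepare (but do not send) a sendMessage request with explicit entities."""
    payload = {"chat_id": chat_id, "text": text, "entities": entities}
    return urllib.request.Request(
        f"https://api.telegram.org/bot{token}/sendMessage",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_send_message_request(
    "TOKEN",
    123456789,
    "Hello *world*",  # the asterisks are literal text, not Markdown
    [{"type": "bold", "offset": 6, "length": 7}],
)
# To actually send it: urllib.request.urlopen(req, timeout=10)
```

Because the payload carries no parse_mode key, the asterisks in the text are displayed literally and the bold range is defined only by the entity.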

Packaging and publishing

This project uses a modern pyproject.toml with setuptools. To build a source distribution and wheel, install the build tool and run:

pip install build
python -m build

Distributions will be created in the dist/ directory. To upload them to the Python Package Index (PyPI), install twine and run:

pip install twine
twine upload dist/*

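twine will prompt for your PyPI credentials. PyPI no longer accepts account passwords for uploads, so use an API token (username __token__, with the token value as the password). See https://packaging.python.org/tutorials/packaging-projects/ for full details.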

License

MIT – see the LICENSE file for details.

Links

Telegram message entities
