Convert Markdown to Telegram plain text + MessageEntity pairs

These details have been verified by PyPI

Project links

repository

GitHub Statistics

Maintainers

Yumiya233

Project description

telegramify-markdown

Effortlessly convert raw Markdown to Telegram plain text + MessageEntity pairs.

Say goodbye to MarkdownV2 escaping headaches! This library parses Markdown (including LLM output, GitHub READMEs, etc.) and produces (text, entities) tuples that can be sent directly via the Telegram Bot API — no parse_mode needed.

Handles Markdown of any format or length: short messages go through convert(), and long output can be split across multiple messages.
Entity offsets are measured in UTF-16 code units, exactly as Telegram requires.
We also support LaTeX-to-Unicode conversion, expandable block quotes, and Mermaid diagram rendering.
Built on pyromark (Rust pulldown-cmark bindings) for speed and correctness.

Note: v1.0.0 introduces entity-based output: convert() returns (str, list[MessageEntity]). The 0.x functions markdownify() and standardize() remain available and still return MarkdownV2 strings.

👀 Use case

Example output (images omitted): convert(), convert(), telegramify()

🪄 Quick Start

Install

Requires Python 3.10+.

# uv (recommended)
uv add telegramify-markdown
uv add "telegramify-markdown[mermaid]"

# pip
pip install telegramify-markdown
pip install "telegramify-markdown[mermaid]"

# PDM
pdm add telegramify-markdown
pdm add "telegramify-markdown[mermaid]"

# Poetry
poetry add telegramify-markdown
poetry add "telegramify-markdown[mermaid]"

🤔 What do you want to do?

If you just want to send static text and don't want to worry about formatting → use convert()
If you are developing an LLM application or need to send potentially super-long text → use telegramify()
If you need to split convert() output manually → use split_entities()
If your middleware only supports parse_mode="MarkdownV2" (no entities parameter) → use markdownify()
If you need finer control over the reverse conversion → use entities_to_markdownv2()

`convert()` — single message

from telebot import TeleBot
from telegramify_markdown import convert

bot = TeleBot("YOUR_TOKEN")
chat_id = 123456789  # placeholder: replace with the target chat ID

md = "**Bold**, _italic_, and `code`."
text, entities = convert(md)

bot.send_message(
    chat_id,
    text,
    entities=[e.to_dict() for e in entities],
)

No parse_mode parameter — Telegram reads the entities directly.
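
To make the shape of an entity concrete, the sketch below builds one by hand and shows why offsets are counted in UTF-16 code units rather than Python string indices. `utf16_units` and `make_entity` are hypothetical helpers written for this illustration; they are not part of telegramify-markdown. The dict keys `type`, `offset`, and `length` follow the Telegram Bot API MessageEntity object.

```python
def utf16_units(s: str) -> int:
    """Count UTF-16 code units; characters outside the BMP count as two."""
    return len(s.encode("utf-16-le")) // 2


def make_entity(text: str, fragment: str, entity_type: str) -> dict:
    """Build an entity dict for the first occurrence of `fragment` in `text`."""
    start = text.index(fragment)  # Python index, counted in code points
    return {
        "type": entity_type,
        "offset": utf16_units(text[:start]),  # prefix length in UTF-16 units
        "length": utf16_units(fragment),
    }


text = "\U0001F600 Bold move"  # starts with an emoji outside the BMP
entity = make_entity(text, "Bold", "bold")
print(text.index("Bold"))  # 2 (Python index)
print(entity)              # {'type': 'bold', 'offset': 3, 'length': 4}
```

The emoji occupies one Python index but two UTF-16 code units, so the offset Telegram expects (3) differs from the Python index (2). With convert() this bookkeeping is done for you.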

`telegramify()` — long messages, code files, diagrams

For LLM output or long documents, telegramify() splits text, extracts code blocks as files, and renders Mermaid diagrams as images:

import asyncio
from telebot import TeleBot
from telegramify_markdown import telegramify
from telegramify_markdown.content import ContentType

bot = TeleBot("YOUR_TOKEN")
chat_id = 123456789  # placeholder: replace with the target chat ID

md = """
# Report

Here is some analysis with **bold** and _italic_ text.

```python
print("hello world")
```

And a diagram:

```mermaid
graph TD
    A-->B
```
"""

async def send():
    results = await telegramify(md, max_message_length=4090)
    for item in results:
        if item.content_type == ContentType.TEXT:
            bot.send_message(
                chat_id,
                item.text,
                entities=[e.to_dict() for e in item.entities],
            )
        elif item.content_type == ContentType.PHOTO:
            bot.send_photo(
                chat_id,
                (item.file_name, item.file_data),
                caption=item.caption_text or None,
                caption_entities=[e.to_dict() for e in item.caption_entities] or None,
            )
        elif item.content_type == ContentType.FILE:
            bot.send_document(
                chat_id,
                (item.file_name, item.file_data),
                caption=item.caption_text or None,
                caption_entities=[e.to_dict() for e in item.caption_entities] or None,
            )

asyncio.run(send())

`split_entities()` — manual splitting

If you use convert() but need to split long output yourself:

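from telegramify_markdown import convert, split_entities

# `bot` and `chat_id` are set up as in the convert() example;
# `long_markdown` is any Markdown string that may exceed one message.
text, entities = convert(long_markdown)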

for chunk_text, chunk_entities in split_entities(text, entities, max_utf16_len=4096):
    bot.send_message(
        chat_id,
        chunk_text,
        entities=[e.to_dict() for e in chunk_entities],
    )
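
For intuition about what splitting under a UTF-16 limit involves, here is a simplified stand-in that chunks plain text at newline boundaries. It is not the library's implementation: `split_on_newlines` is a hypothetical helper, it ignores entities entirely, and a single line longer than the limit is emitted as its own oversized chunk.

```python
import re


def utf16_units(s: str) -> int:
    """Count UTF-16 code units, the unit Telegram uses for lengths."""
    return len(s.encode("utf-16-le")) // 2


def split_on_newlines(text: str, max_units: int) -> list[str]:
    """Greedily pack whole lines into chunks of at most `max_units` UTF-16 units."""
    # Each piece is a line including its trailing "\n" (the last may lack one).
    pieces = re.findall(r"[^\n]*\n|[^\n]+", text)
    chunks, current, used = [], [], 0
    for piece in pieces:
        size = utf16_units(piece)
        if current and used + size > max_units:
            chunks.append("".join(current))  # flush before the limit is exceeded
            current, used = [], 0
        current.append(piece)
        used += size
    if current:
        chunks.append("".join(current))
    return chunks


chunks = split_on_newlines("aa\nbb\ncc", max_units=6)
print(chunks)  # ['aa\nbb\n', 'cc']
```

Joining the chunks reproduces the input exactly, so no characters are lost at the split points.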

`markdownify()` — direct Markdown to MarkdownV2

If your middleware only supports parse_mode="MarkdownV2" and cannot pass entities, use markdownify() for a one-step conversion:

from telegramify_markdown import markdownify

mdv2 = markdownify("**Bold** and `code`")
bot.send_message(chat_id, mdv2, parse_mode="MarkdownV2")

standardize() is an alias for markdownify(), kept for 0.x compatibility.

`entities_to_markdownv2()` — reverse conversion to MarkdownV2

If you already have (text, entities) from convert() and need a MarkdownV2 string:

from telegramify_markdown import convert, entities_to_markdownv2

text, entities = convert("**Bold** and `code`")
mdv2 = entities_to_markdownv2(text, entities)

bot.send_message(chat_id, mdv2, parse_mode="MarkdownV2")

This handles all MarkdownV2 escaping rules correctly (different escaping for normal text, code/pre blocks, and URLs).
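
The three escaping contexts can be sketched as follows. This is an illustration based on the escaping rules in the Telegram Bot API formatting documentation, not the code inside entities_to_markdownv2(); the function names are hypothetical. Escaping the backslash itself in normal text is a conservative choice made for this sketch.

```python
import re

# Normal text: the characters the Bot API lists as reserved, plus the backslash.
_TEXT_SPECIALS = re.compile(r"([_*\[\]()~`>#+\-=|{}.!\\])")
# Inside code and pre entities: only backtick and backslash.
_CODE_SPECIALS = re.compile(r"([`\\])")
# Inside the (...) part of an inline link: only ")" and backslash.
_URL_SPECIALS = re.compile(r"([)\\])")


def escape_text(s: str) -> str:
    return _TEXT_SPECIALS.sub(r"\\\1", s)


def escape_code(s: str) -> str:
    return _CODE_SPECIALS.sub(r"\\\1", s)


def escape_url(s: str) -> str:
    return _URL_SPECIALS.sub(r"\\\1", s)


bold = f"*{escape_text('Total: 1.5!')}*"
link = f"[{escape_text('docs (v2)')}]({escape_url('https://example.com/a)b')})"
print(bold)  # *Total: 1\.5\!*
print(link)  # [docs \(v2\)](https://example.com/a\)b)
```

Applying the normal-text rule everywhere would over-escape code blocks and URLs, which is why the context matters.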

⚙️ Configuration

Customize heading symbols, link symbols, and expandable citation behavior:

from telegramify_markdown.config import get_runtime_config

cfg = get_runtime_config()
cfg.markdown_symbol.heading_level_1 = "📌"
cfg.markdown_symbol.link = "🔗"
cfg.cite_expandable = True  # Long quotes become expandable_blockquote

# For clean output without emoji heading prefixes:
# cfg.markdown_symbol.heading_level_1 = ""
# cfg.markdown_symbol.heading_level_2 = ""
# cfg.markdown_symbol.heading_level_3 = ""
# cfg.markdown_symbol.heading_level_4 = ""
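
If the configuration object is shared across the process (an assumption suggested by the name get_runtime_config(), not confirmed here), changes made for one message would affect later ones. The sketch below shows one way to scope an override. `temporary_attrs` is a hypothetical helper, and a `SimpleNamespace` stands in for `cfg.markdown_symbol` so the example runs without the library.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def temporary_attrs(obj, **overrides):
    """Set attributes on `obj`, then restore the previous values on exit."""
    saved = {name: getattr(obj, name) for name in overrides}
    try:
        for name, value in overrides.items():
            setattr(obj, name, value)
        yield obj
    finally:
        for name, value in saved.items():
            setattr(obj, name, value)


# Stand-in for cfg.markdown_symbol (assumption: plain, settable attributes).
symbols = SimpleNamespace(heading_level_1="📌", link="🔗")

with temporary_attrs(symbols, heading_level_1=""):
    inside = symbols.heading_level_1  # "" while the block is active
after = symbols.heading_level_1       # original value restored
```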

📖 API Reference

`convert(markdown, *, latex_escape=True) -> tuple[str, list[MessageEntity]]`

Synchronous. Converts a Markdown string to plain text and a list of MessageEntity objects.

Parameter	Type	Default	Description
`markdown`	`str`	required	Raw Markdown text
`latex_escape`	`bool`	`True`	Convert LaTeX `\(...\)` and `\[...\]` to Unicode symbols

Returns (text, entities) where text is plain text and entities is a list of MessageEntity.

`telegramify(content, *, max_message_length=4096, latex_escape=True) -> list[Text | File | Photo]`

Async. Full pipeline: converts Markdown, splits long messages, extracts code blocks as files, renders Mermaid diagrams as images.

Parameter	Type	Default	Description
`content`	`str`	required	Raw Markdown text
`max_message_length`	`int`	`4096`	Max UTF-16 code units per text message
`latex_escape`	`bool`	`True`	Convert LaTeX to Unicode

Returns an ordered list of Text, File, or Photo objects.

`split_entities(text, entities, max_utf16_len) -> list[tuple[str, list[MessageEntity]]]`

Split text + entities into chunks within a UTF-16 length limit. Splits at newline boundaries; entities spanning a split point are clipped into both chunks.
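
Clipping an entity that spans a split point can be pictured with the sketch below. It is an illustration of the idea, not the library's code: `clip_entity` is a hypothetical helper that works on plain dicts with `offset` and `length` keys, all measured in UTF-16 code units.

```python
from typing import Optional


def clip_entity(entity: dict, chunk_start: int, chunk_end: int) -> Optional[dict]:
    """Return the part of `entity` inside [chunk_start, chunk_end),
    with its offset re-based to the start of the chunk, or None if
    the entity does not overlap the chunk at all."""
    start = max(entity["offset"], chunk_start)
    end = min(entity["offset"] + entity["length"], chunk_end)
    if start >= end:
        return None
    return {**entity, "offset": start - chunk_start, "length": end - start}


bold = {"type": "bold", "offset": 8, "length": 10}  # covers units 8..17

first = clip_entity(bold, 0, 12)    # {'type': 'bold', 'offset': 8, 'length': 4}
second = clip_entity(bold, 12, 24)  # {'type': 'bold', 'offset': 0, 'length': 6}
```

The two clipped lengths add up to the original length, so the formatting covers the same characters after the split.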

`markdownify(content, *, latex_escape=True) -> str`

Synchronous. Converts Markdown directly to a Telegram MarkdownV2 string. Equivalent to entities_to_markdownv2(*convert(content)).

Parameter	Type	Default	Description
`content`	`str`	required	Raw Markdown text
`latex_escape`	`bool`	`True`	Convert LaTeX to Unicode

`standardize(content, *, latex_escape=True) -> str`

Alias for markdownify(), kept for 0.x compatibility.

`entities_to_markdownv2(text, entities=None) -> str`

Reverse conversion: takes plain text and entities, returns a MarkdownV2 string with correct escaping. Useful when you already have (text, entities) from convert() and need a MarkdownV2 string.

Parameter	Type	Default	Description
`text`	`str`	required	Plain text content
`entities`	`list[MessageEntity] | None`	`None`	Entity list (UTF-16 offsets)

`MessageEntity`

@dataclasses.dataclass(slots=True)
class MessageEntity:
    type: str           # "bold", "italic", "code", "pre", "text_link", etc.
    offset: int         # Start position in UTF-16 code units
    length: int         # Length in UTF-16 code units
    url: str | None     # For "text_link" entities
    language: str | None       # For "pre" entities (code block language)
    custom_emoji_id: str | None  # For "custom_emoji" entities

    def to_dict(self) -> dict: ...

Content Types

Class	Fields	Description
`Text`	`text`, `entities`, `content_trace`	A text message segment
`File`	`file_name`, `file_data`, `caption_text`, `caption_entities`, `content_trace`	An extracted code block
`Photo`	`file_name`, `file_data`, `caption_text`, `caption_entities`, `content_trace`	A rendered Mermaid diagram
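
The sketch below shows one way to map these three result types onto Bot API calls. The dataclasses are stand-ins that mirror the field names in the table (without `content_trace`) so the example runs without the library, and `plan_send` is a hypothetical helper. The method and parameter names (`sendMessage`, `sendDocument`, `sendPhoto`, `caption`, `caption_entities`) are those of the Telegram Bot API.

```python
from dataclasses import dataclass, field


@dataclass
class Text:  # stand-in for the library's Text
    text: str
    entities: list = field(default_factory=list)


@dataclass
class File:  # stand-in for the library's File
    file_name: str
    file_data: bytes
    caption_text: str = ""
    caption_entities: list = field(default_factory=list)


@dataclass
class Photo:  # stand-in for the library's Photo
    file_name: str
    file_data: bytes
    caption_text: str = ""
    caption_entities: list = field(default_factory=list)


def plan_send(item):
    """Map a result item to a (Bot API method name, keyword arguments) pair."""
    if isinstance(item, Text):
        return "sendMessage", {"text": item.text, "entities": item.entities}
    if isinstance(item, Photo):
        method, key = "sendPhoto", "photo"
    elif isinstance(item, File):
        method, key = "sendDocument", "document"
    else:
        raise TypeError(f"unsupported item: {type(item).__name__}")
    kwargs = {key: (item.file_name, item.file_data)}
    if item.caption_text:  # only attach a caption when there is one
        kwargs["caption"] = item.caption_text
        kwargs["caption_entities"] = item.caption_entities
    return method, kwargs


plans = [plan_send(i) for i in (Text("hi"), File("a.py", b"x", "code"), Photo("d.png", b"y"))]
print([method for method, _ in plans])  # ['sendMessage', 'sendDocument', 'sendPhoto']
```

With the real objects, dispatching on `item.content_type` as in the Quick Start example serves the same purpose as the `isinstance` checks here.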

`utf16_len(text) -> int`

Returns the length of a string in UTF-16 code units (what Telegram uses for offsets).
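
For reference, a UTF-16 length can be computed with the standard library alone. The two functions below are equivalent sketches of the measurement, not the library's own utf16_len(); the names are hypothetical.

```python
def utf16_units(text: str) -> int:
    """Encode to UTF-16 (little-endian, no BOM) and count 2-byte units."""
    return len(text.encode("utf-16-le")) // 2


def utf16_units_by_codepoint(text: str) -> int:
    """Count 2 units for code points above U+FFFF, 1 unit otherwise."""
    return sum(2 if ord(ch) > 0xFFFF else 1 for ch in text)


sample = "h\u00e9llo \U0001F600"  # "héllo" plus an emoji outside the BMP
print(len(sample))          # 7 code points
print(utf16_units(sample))  # 8 UTF-16 code units
```

Only characters outside the Basic Multilingual Plane (most emoji, for example) make the two counts diverge, which is why `len()` works for plain ASCII text but miscounts emoji-heavy messages.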

🔨 Supported Markdown Features

Headings (Levels 1-6: H1-H2 bold+underline, H3-H4 bold, H5-H6 italic; H1-H4 with emoji prefix)
**Bold**, *Italic*, ~~Strikethrough~~
||Spoiler||
[Links](url) and ![Images](url)
Telegram custom emoji ![emoji](tg://emoji?id=...)
Inline code and fenced code blocks
Block quotes > (with expandable citation support)
Tables (rendered as monospace pre blocks)
Ordered and unordered lists
Task lists - [x] / - [ ]
Horizontal rules ---
LaTeX math \(...\) and \[...\] (converted to Unicode)
Mermaid diagrams (rendered as images, requires [mermaid] extra)
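
As a rough picture of what LaTeX-to-Unicode conversion means, the toy sketch below replaces a handful of commands inside `\(...\)` and `\[...\]` spans. It is not the library's converter: the symbol table is a tiny hand-picked sample and `toy_latex_to_unicode` is a hypothetical name.

```python
import re

# A deliberately tiny sample table; a real converter covers far more commands.
TOY_SYMBOLS = {
    r"\alpha": "α",
    r"\beta": "β",
    r"\leq": "≤",
    r"\times": "×",
}

# Matches \( ... \) or \[ ... \], capturing the body in group 1 or 2.
MATH_SPAN = re.compile(r"\\\((.+?)\\\)|\\\[(.+?)\\\]", re.DOTALL)


def toy_latex_to_unicode(text: str) -> str:
    """Replace known commands inside math spans and drop the delimiters."""
    def replace_span(match: re.Match) -> str:
        body = match.group(1) or match.group(2)
        for command, symbol in TOY_SYMBOLS.items():
            body = body.replace(command, symbol)
        return body

    return MATH_SPAN.sub(replace_span, text)


print(toy_latex_to_unicode(r"Angle \(\alpha \leq \beta\) holds"))
# Angle α ≤ β holds
```

Text outside the math delimiters is left untouched, so a literal `\alpha` in ordinary prose is not converted by this sketch.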

🤖 For AI Coding Assistants

Copy this block into your AI assistant's context (e.g. CLAUDE.md, Cursor Rules, etc.) to get accurate code generation for telegramify-markdown:

# telegramify-markdown integration guide

## Install
uv add telegramify-markdown  # or: pip install telegramify-markdown

## API (v1.0.0+) — outputs plain text + MessageEntity, NOT MarkdownV2 strings

### convert() — sync, single message
from telegramify_markdown import convert
text, entities = convert("**bold** and _italic_")
bot.send_message(chat_id, text, entities=[e.to_dict() for e in entities])
# Do NOT set parse_mode — entities replace it entirely.

### telegramify() — async, auto-splits long text, extracts code blocks as files
from telegramify_markdown import telegramify
from telegramify_markdown.content import ContentType
results = await telegramify(md, max_message_length=4090)
for item in results:
    if item.content_type == ContentType.TEXT:
        bot.send_message(chat_id, item.text, entities=[e.to_dict() for e in item.entities])
    elif item.content_type == ContentType.FILE:
        bot.send_document(chat_id, (item.file_name, item.file_data))
    elif item.content_type == ContentType.PHOTO:
        bot.send_photo(chat_id, (item.file_name, item.file_data))

### split_entities() — manual splitting for convert() output
from telegramify_markdown import convert, split_entities
text, entities = convert(long_md)
for chunk_text, chunk_entities in split_entities(text, entities, max_utf16_len=4096):
    bot.send_message(chat_id, chunk_text, entities=[e.to_dict() for e in chunk_entities])

### markdownify() — direct Markdown to MarkdownV2 string
from telegramify_markdown import markdownify
mdv2 = markdownify("**Bold** and `code`")
bot.send_message(chat_id, mdv2, parse_mode="MarkdownV2")
# Use when your middleware only supports parse_mode, not entities parameter.
# standardize() is an alias for markdownify().

### entities_to_markdownv2() — reverse convert() output to MarkdownV2
from telegramify_markdown import convert, entities_to_markdownv2
text, entities = convert("**Bold** and `code`")
mdv2 = entities_to_markdownv2(text, entities)
bot.send_message(chat_id, mdv2, parse_mode="MarkdownV2")

### Configuration
from telegramify_markdown.config import get_runtime_config
cfg = get_runtime_config()
cfg.markdown_symbol.heading_level_1 = "📌"
cfg.cite_expandable = True

## Critical rules
- entities must be passed as list[dict] via [e.to_dict() for e in entities], NEVER as JSON string
- NEVER set parse_mode when sending with entities — they are mutually exclusive
- All entity offsets are UTF-16 code units. Use utf16_len() to measure text length.
- Requires Python 3.10+

🧸 Acknowledgement

This library is inspired by npm:telegramify-markdown.

LaTeX escape is inspired by latex2unicode and @yym68686.

📜 License

This project is licensed under the MIT License — see the LICENSE file for details.

Release history

1.1.3

Apr 23, 2026

1.1.2

Apr 6, 2026

This version

1.1.1

Mar 18, 2026

1.1.0

Mar 14, 2026

1.0.0

Mar 10, 2026

1.0.0rc5 pre-release

Mar 3, 2026

1.0.0rc4 pre-release

Feb 8, 2026

1.0.0rc2 pre-release

Feb 6, 2026

0.5.4

Dec 20, 2025

0.5.3

Dec 9, 2025

0.5.2

Oct 14, 2025

0.5.1

Apr 8, 2025

0.5.0

Mar 11, 2025

0.4.4

Mar 11, 2025

0.4.3

Feb 28, 2025

0.4.2

Feb 14, 2025

0.4.1

Jan 22, 2025

0.4.0

Jan 13, 2025

0.3.2

Jan 3, 2025

0.3.1

Dec 17, 2024

0.3.0

Dec 15, 2024

0.2.3

Dec 14, 2024

0.2.2

Dec 13, 2024

0.2.1

Dec 12, 2024

0.2.0

Dec 11, 2024

0.1.17

Nov 21, 2024

0.1.16

Nov 19, 2024

0.1.15

Oct 21, 2024

0.1.14

Oct 21, 2024

0.1.13

Sep 27, 2024

0.1.12

Sep 11, 2024

0.1.11

Aug 29, 2024

0.1.10

Aug 11, 2024

0.1.9

Jul 26, 2024

0.1.8

Jul 4, 2024

0.1.7

Jun 27, 2024

0.1.6

Jun 2, 2024

0.1.5

Jun 1, 2024

0.1.4

May 25, 2024

0.1.3

May 25, 2024

0.1.2

Mar 17, 2024

0.1.1

Mar 13, 2024

0.1.0

Mar 13, 2024

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

telegramify_markdown-1.1.1.tar.gz (58.6 kB view details)

Uploaded Mar 18, 2026 Source

Built Distribution

telegramify_markdown-1.1.1-py3-none-any.whl (41.5 kB view details)

Uploaded Mar 18, 2026 Python 3

File details

Details for the file telegramify_markdown-1.1.1.tar.gz.

File metadata

Download URL: telegramify_markdown-1.1.1.tar.gz
Upload date: Mar 18, 2026
Size: 58.6 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: pdm/2.26.6 CPython/3.12.3 Linux/6.14.0-1017-azure

File hashes

Hashes for telegramify_markdown-1.1.1.tar.gz
Algorithm	Hash digest
SHA256	`0fc66835b889e156ef47aa1c70a215368f9673f86cac23e539f5d9fa923a4c35`
MD5	`6cb24ec4543777d78d536d372c613e79`
BLAKE2b-256	`b0940b29f09c3c9d28cd7b8790c4bc8aa805b126ece87f2ddc5d0c28dc9cb922`

File details

Details for the file telegramify_markdown-1.1.1-py3-none-any.whl.

File metadata

Download URL: telegramify_markdown-1.1.1-py3-none-any.whl
Upload date: Mar 18, 2026
Size: 41.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: pdm/2.26.6 CPython/3.12.3 Linux/6.14.0-1017-azure

File hashes

Hashes for telegramify_markdown-1.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`860348bae4e4e107b3fc02333ff3f592888b1e6bd9da8291aaf210e094dbd005`
MD5	`e15952667e7a3262216895b92b0f217d`
BLAKE2b-256	`c56d71552ae1f5e959d36c9943ec5a04063ec9db69f9d524643c41a92ba422e7`

telegramify-markdown 1.1.1
