Generate language-learning HTML readers (with sentence-level LLM translations and TTS) from Markdown.

md-llm-lang-reader

Generate language-learning HTML readers from Markdown using an LLM:

- sentence-by-sentence splitting and translation
- one-click TTS playback for the source text (browser Web Speech API)
- fenced code blocks are preserved as code (not sent to the LLM)

The package is published on PyPI as `md-llm-lang-reader` and installs the `langreader` CLI command.

Features

- Markdown → HTML (simple headings + paragraphs)
- LLM-assisted sentence splitting (natural sentence boundaries)
- Sentence-level translations (each source sentence paired with its translation)
- TTS button per source sentence
- Fenced code blocks (`` ``` `` or `~~~`) are emitted as `<pre><code>` and are not sent to the LLM
- Bullet lists are translated (no special handling; they are passed to the LLM as plain text)

Installation

`pip install md-llm-lang-reader`

Quick start

Create input.md:

````markdown
# Example

Bonjour ! Ceci est un court paragraphe.

```python
# Code blocks are not translated.
print("Hello")
```

- Premier point
- Deuxième point
````

Generate `output.html`:

```bash
langreader \
  -i input.md \
  -o output.html \
  --src fr \
  --tgt en \
  --provider YOUR_PROVIDER \
  --model YOUR_MODEL
```

Open the generated HTML in your browser and click the speaker buttons.

CLI usage

`langreader -i INPUT.md -o OUTPUT.html --src SRC --tgt TGT --provider PROVIDER --model MODEL [-v 0|1|2|3]`

Options

- `-i`, `--input` (required): Input Markdown file path.
- `-o`, `--output` (required): Output HTML file path.
- `--src` (default: `fr`): Source language code (e.g. `fr`, `de`, `es`, `ja`).
- `--tgt` (default: `en`): Target language code.
- `--provider` (required): Provider name passed to multiai (depends on your multiai configuration).
- `--model` (required): Model name passed to multiai.
- `-v`, `--verbose` (default: `1`): Controls terminal output:
  - 0: silent
  - 1: headings only
  - 2: paragraph preview (first ~5 words)
  - 3: full original paragraph text
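
The options above map naturally onto an `argparse` parser. The following is a minimal sketch of how such a parser could be declared, not the package's actual source; the function name `build_parser` is hypothetical, and only the flags, defaults, and verbosity levels come from the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented langreader options."""
    parser = argparse.ArgumentParser(prog="langreader")
    parser.add_argument("-i", "--input", required=True, help="Input Markdown file path.")
    parser.add_argument("-o", "--output", required=True, help="Output HTML file path.")
    parser.add_argument("--src", default="fr", help="Source language code.")
    parser.add_argument("--tgt", default="en", help="Target language code.")
    parser.add_argument("--provider", required=True, help="Provider name passed to multiai.")
    parser.add_argument("--model", required=True, help="Model name passed to multiai.")
    parser.add_argument(
        "-v", "--verbose", type=int, choices=[0, 1, 2, 3], default=1,
        help="0 silent, 1 headings only, 2 paragraph preview, 3 full paragraph text.",
    )
    return parser


# Omitting --src, --tgt and -v falls back to the documented defaults.
args = build_parser().parse_args(
    ["-i", "input.md", "-o", "output.html", "--provider", "P", "--model", "M"]
)
print(args.src, args.tgt, args.verbose)
```

Restricting `--verbose` with `choices` makes an out-of-range level fail at parse time rather than later in the run.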

Examples

French → English:

`langreader -i alsace.md -o alsace.html --src fr --tgt en --provider ... --model ...`

German → English:

`langreader -i berlin.md -o berlin.html --src de --tgt en --provider ... --model ...`

Japanese → English:

`langreader -i news.md -o news.html --src ja --tgt en --provider ... --model ...`
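
To convert several files with the same settings, the command can be driven from a short script. This is only a sketch, assuming `langreader` is on `PATH`; `build_command` and `convert_all` are hypothetical helper names, and the provider and model values you pass are whatever your multiai configuration expects.

```python
import subprocess
from pathlib import Path


def build_command(md_path: Path, src: str, tgt: str, provider: str, model: str) -> list[str]:
    """Build the documented langreader invocation for one Markdown file."""
    return [
        "langreader",
        "-i", str(md_path),
        "-o", str(md_path.with_suffix(".html")),  # alsace.md -> alsace.html
        "--src", src,
        "--tgt", tgt,
        "--provider", provider,
        "--model", model,
    ]


def convert_all(paths: list[Path], src: str, tgt: str, provider: str, model: str) -> None:
    """Run langreader once per file; raises CalledProcessError if a run fails."""
    for path in paths:
        subprocess.run(build_command(path, src, tgt, provider, model), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names, and `check=True` stops the batch at the first failed conversion.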

How it works

For each paragraph, the tool asks the LLM to:

- Split the paragraph into natural sentences (avoid splitting on abbreviations).
- Translate each sentence into the target language.
- Return only valid JSON in this schema:

[
  { "src": "…", "tgt": "…" }
]

The tool validates and parses the JSON, then generates HTML that contains, for each sentence:

- the source sentence with a TTS button
- the translated sentence below it
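
The validation and rendering step could look roughly like the following. This is a sketch under assumptions, not the package's code: `parse_pairs` and `render_pair` are hypothetical names, and the CSS class names and button markup are invented for illustration. Only the `src`/`tgt` schema comes from the description above.

```python
import html
import json


def parse_pairs(raw: str) -> list[dict[str, str]]:
    """Parse the LLM reply and check that it matches [{"src": ..., "tgt": ...}]."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on invalid JSON
    if not isinstance(data, list):
        raise ValueError("expected a JSON array")
    pairs = []
    for item in data:
        if not isinstance(item, dict):
            raise ValueError("expected one JSON object per sentence")
        src, tgt = item.get("src"), item.get("tgt")
        if not isinstance(src, str) or not isinstance(tgt, str):
            raise ValueError("each item needs string 'src' and 'tgt' fields")
        pairs.append({"src": src, "tgt": tgt})
    return pairs


def render_pair(pair: dict[str, str]) -> str:
    """Render one sentence: source text with a TTS button, translation below."""
    src = html.escape(pair["src"], quote=True)
    tgt = html.escape(pair["tgt"], quote=True)
    return (
        '<div class="sentence">'
        f'<p class="src">{src} <button class="tts" data-speak="{src}">Play</button></p>'
        f'<p class="tgt">{tgt}</p>'
        "</div>"
    )


reply = '[{"src": "Bonjour !", "tgt": "Hello!"}]'
print("".join(render_pair(p) for p in parse_pairs(reply)))
```

Rejecting a malformed reply early, instead of rendering whatever came back, keeps a single bad LLM response from producing broken HTML.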

Notes on Text-to-Speech (TTS)

- TTS uses the browser's Web Speech API (`speechSynthesis`).
- Voice availability depends on the OS and browser. Some environments have few or no voices for certain languages.
- The tool sets the utterance language to `--src` (e.g. `fr`). If you need a specific locale (e.g. `fr-FR`), edit the generated HTML by hand for now; a future CLI option could expose this.

Markdown support (current)

Supported:

- Headings: `#`, `##`, `###`, `####`
- Paragraphs: consecutive non-empty lines are joined with spaces
- Fenced code blocks: `` ``` `` or `~~~` (any info string is allowed)

Not yet supported (treated as plain text or not specially parsed):

- Blockquotes, tables, images
- Inline formatting (links/emphasis) is not rendered; it is passed as plain text

If you need richer Markdown rendering, consider adding a Markdown parser and preserving a mapping between original text and rendered HTML.
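
The supported subset is small enough to handle with a line scanner. The sketch below is one way to do it, assuming only the rules listed above; `split_blocks` and its tuple output format are hypothetical and are not taken from the package.

```python
import re

HEADING = re.compile(r"^(#{1,4})\s+(.*)$")   # levels 1 to 4 only
FENCE = re.compile(r"^(`{3,}|~{3,})")        # opening fence; info string ignored


def split_blocks(markdown: str) -> list[tuple[str, str]]:
    """Split Markdown into ("h1".."h4" | "p" | "code", text) blocks."""
    blocks: list[tuple[str, str]] = []
    para: list[str] = []   # lines of the paragraph being collected
    code: list[str] = []   # lines of the fenced block being collected
    fence = ""             # opening fence marker while inside a code block

    def flush_para() -> None:
        # Consecutive non-empty lines are joined with spaces.
        if para:
            blocks.append(("p", " ".join(para)))
            para.clear()

    for line in markdown.splitlines():
        stripped = line.strip()
        if fence:
            # A line made only of the fence character, at least as long as the
            # opening marker, closes the block; other lines are kept verbatim.
            if stripped.startswith(fence) and set(stripped) == {fence[0]}:
                blocks.append(("code", "\n".join(code)))
                code.clear()
                fence = ""
            else:
                code.append(line)
            continue
        opener = FENCE.match(stripped)
        if opener:
            flush_para()
            fence = opener.group(1)
            continue
        heading = HEADING.match(stripped)
        if heading:
            flush_para()
            blocks.append((f"h{len(heading.group(1))}", heading.group(2)))
        elif stripped:
            para.append(stripped)
        else:
            flush_para()
    flush_para()
    if fence:  # unterminated fence: keep what was collected as code
        blocks.append(("code", "\n".join(code)))
    return blocks
```

In a pipeline like the one described here, only the `"p"` blocks would be sent to the LLM; `"code"` blocks would go straight to the HTML output.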

Security

This tool escapes text embedded into HTML and does not inline arbitrary text into `onclick` handlers. TTS buttons store their text in `data-speak="..."` attributes and are wired up with JavaScript event listeners, which avoids quoting issues and is safer than inline handlers.

Still, treat generated HTML as untrusted if your input Markdown is untrusted.
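
As an illustration of the data-attribute approach, the sketch below builds a button with Python's `html.escape`. It shows one way such escaping can be done and is not the package's actual code; `tts_button` and the `tts` class name are hypothetical.

```python
import html


def tts_button(sentence: str) -> str:
    """Return a button whose text to speak lives in an escaped data attribute."""
    # quote=True also escapes " and ', so the value cannot close the attribute.
    speak = html.escape(sentence, quote=True)
    return f'<button class="tts" data-speak="{speak}">Play</button>'


hostile = '"><script>alert(1)</script>'
print(tts_button(hostile))
# <button class="tts" data-speak="&quot;&gt;&lt;script&gt;alert(1)&lt;/script&gt;">Play</button>
```

Because the sentence never appears inside JavaScript source, a listener can read it back from the element's `dataset` without any further quoting.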

Development

Clone and install in editable mode:

`pip install -e .`

Run tests:

`pytest`

Build the package:

`python -m build`

License

MIT

Release history

- 0.2.0 (Dec 20, 2025)
- 0.1.0 (Dec 19, 2025, this version)

Download files

Source Distribution

md_llm_lang_reader-0.1.0.tar.gz (8.3 kB), uploaded Dec 19, 2025, Source

Built Distribution

md_llm_lang_reader-0.1.0-py3-none-any.whl (8.7 kB), uploaded Dec 19, 2025, Python 3

File details

Details for the file md_llm_lang_reader-0.1.0.tar.gz.

File metadata

Download URL: md_llm_lang_reader-0.1.0.tar.gz
Upload date: Dec 19, 2025
Size: 8.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for md_llm_lang_reader-0.1.0.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `fdffe43775a03de898eaa5b0c29eba862feb4c3946053c3e8708f37b44c2cd00` |
| MD5 | `fe35267e2fc1c063d5d6acc69c5100d3` |
| BLAKE2b-256 | `ee57a754839f92892e6b7ccea2cc582740449184ae623a0b5dedba514696752a` |

File details

Details for the file md_llm_lang_reader-0.1.0-py3-none-any.whl.

File metadata

Download URL: md_llm_lang_reader-0.1.0-py3-none-any.whl
Upload date: Dec 19, 2025
Size: 8.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for md_llm_lang_reader-0.1.0-py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `0f18c1fc9f773a5d6185c621ded0bd60e3a285f5fd2222975863617c3e370835` |
| MD5 | `9a9564e4c1106ac0ad91c4e0980d9c71` |
| BLAKE2b-256 | `1b9b8b5c4534e6799ceabc32b0188c80a1e62755983df8bac03170cfcdd375e9` |
