
Create and safely rebuild EPUB files from structured text and Markdown.


text2epub

text2epub is a typed Python library and CLI for creating EPUB files from plain-text writing workflows. It is useful when you keep a manuscript as one Markdown file, as a folder of numbered Markdown chapters, or as already-rendered XHTML chapter fragments.

It also includes a conservative rebuild workflow for tools such as booktx that need to apply validated text replacements to an existing EPUB without rewriting unchanged package entries.

Features

  • create new EPUBs from one Markdown file
  • create new EPUBs from a folder of ordered Markdown files
  • discover chapters from filename-based manuscript conventions
  • generate XHTML, OPF, NAV, NCX, CSS, and deterministic ZIP output
  • optionally add a generated title page and reader-visible contents page
  • request automatic TOC page numbers with CSS target-counter() for readers that support paged-media counters
  • package local image assets referenced by Markdown
  • optionally preserve safe inline XHTML in Markdown, such as <em>, <strong>, <span>, and <a>
  • support YAML-like front matter for common EPUB metadata
  • build EPUBs from explicit XHTML chapter bodies
  • safely rebuild existing EPUBs from extraction manifests and replacement plans
  • produce byte-identical output for no-op and identity rebuilds
  • run basic EPUB package validation and unresolved-token checks
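To illustrate what deterministic ZIP output generally involves, here is a minimal standard-library sketch. It is not text2epub's implementation; the function name `write_deterministic_epub` and the fixed timestamp are assumptions made for this example. The idea is that entry order, timestamps, and file modes are pinned so the same input yields the same bytes.

```python
import zipfile
from pathlib import Path

# Assumed fixed timestamp (the earliest a ZIP entry can store), so the
# output does not depend on the clock or on file modification times.
FIXED_DATE = (1980, 1, 1, 0, 0, 0)


def write_deterministic_epub(entries: dict[str, bytes], output: Path) -> None:
    """Write entries to a ZIP with stable order, timestamps, and modes."""
    with zipfile.ZipFile(output, "w") as archive:
        # The EPUB container format expects mimetype first and uncompressed.
        mimetype = zipfile.ZipInfo("mimetype", date_time=FIXED_DATE)
        mimetype.compress_type = zipfile.ZIP_STORED
        archive.writestr(mimetype, b"application/epub+zip")

        # Sort the remaining names so dict insertion order cannot change the bytes.
        for name in sorted(entries):
            if name == "mimetype":
                continue  # already written above
            info = zipfile.ZipInfo(name, date_time=FIXED_DATE)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.external_attr = 0o644 << 16  # fixed file mode
            archive.writestr(info, entries[name])
```

Calling this twice with the same entries, in any dictionary order, should produce identical files on the same platform and Python build.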

Installation

uv pip install text2epub

For development from a checkout:

python -m pip install -e .

Folder-based Markdown workflow

A simple manuscript folder can use filenames to define reading order:

manuscript/
├── 00-front-matter.md
├── 01-introduction.md
├── 02-method.md
└── 03-appendix.md
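The ordering idea can be sketched with the standard library alone. This is an illustration of filename-based ordering, not text2epub's discovery logic, whose exact rules are not described here; `ordered_chapters` is a hypothetical helper.

```python
from pathlib import Path


def ordered_chapters(folder: Path) -> list[Path]:
    """Return the Markdown files in a folder, sorted by filename."""
    # A plain string sort is enough when numeric prefixes are zero-padded.
    return sorted(folder.glob("*.md"), key=lambda path: path.name)
```

Zero-padded prefixes matter with a plain string sort: `10-notes.md` would sort before `2-method.md`, whereas `02-method.md` sorts before `10-notes.md`.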

The first file may contain front matter for book metadata:

---
title: Example Book
language: en
author: Ada Lovelace
publisher: Example Press
date: 2026-06-22
---

# Introduction

This becomes the first EPUB chapter.
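As a rough picture of how such a file separates into metadata and chapter text, the sketch below splits a leading `---` block of `key: value` lines from the body. It is an assumption-laden illustration, not the parser text2epub uses; `split_front_matter` is a hypothetical name and only flat string values are handled.

```python
def split_front_matter(text: str) -> tuple[dict[str, str], str]:
    """Split a leading '---' delimited block of key: value lines from the body."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no front matter at all

    metadata: dict[str, str] = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Everything after the closing delimiter is the chapter body.
            body = "\n".join(lines[index + 1:]).lstrip("\n")
            return metadata, body
        key, separator, value = line.partition(":")
        if separator:
            metadata[key.strip()] = value.strip()

    # No closing delimiter was found: treat the whole text as body.
    return {}, text
```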

Build it from Python with the convenience API:

from pathlib import Path

from text2epub import BuildOptions, create_epub_from_markdown_folder

create_epub_from_markdown_folder(
    Path("manuscript"),
    Path("book.epub"),
    options=BuildOptions(
        include_title_page=True,
        include_toc_page=True,
        toc_page_numbers=True,
    ),
)

Or build the same folder from the CLI:

text2epub markdown manuscript/ -o book.epub --title-page --toc-page --toc-page-numbers

Explicit Python API

Use create_epub_from_markdown_files when your application already controls the chapter list and order:

from pathlib import Path

from text2epub import BuildOptions, EpubMetadata, create_epub_from_markdown_files

create_epub_from_markdown_files(
    [Path("01-introduction.md"), Path("02-body.md")],
    Path("book.epub"),
    metadata=EpubMetadata(title="Example Book", language="en"),
    options=BuildOptions(include_title_page=True, include_toc_page=True),
)

Use the lower-level model API when you need per-chapter ids, hrefs, titles, or custom build options:

from pathlib import Path

from text2epub import (
    BuildOptions,
    EpubMetadata,
    MarkdownBook,
    MarkdownChapter,
    create_epub_from_markdown,
)

book = MarkdownBook(
    metadata=EpubMetadata(title="Example Book", language="en"),
    chapters=[MarkdownChapter(path=Path("chapter-01.md"))],
    options=BuildOptions(deterministic=True),
)
create_epub_from_markdown(book, Path("book.epub"))

Safe inline XHTML in Markdown

Raw HTML is escaped by default. When your source text comes from epub2text structured fragment export or another trusted pipeline, enable safe inline XHTML to preserve phrasing markup inside Markdown paragraphs:

This keeps <em>emphasis</em> and <strong>strength</strong>.

From Python:

from text2epub import BuildOptions

options = BuildOptions(allow_inline_xhtml=True)

From the CLI:

text2epub markdown manuscript/ -o book.epub --allow-inline-xhtml

Only safe inline tags are accepted. Raw block HTML and unsafe attributes such as onclick are rejected.
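The allowlist approach described above can be sketched with the standard library's `html.parser`. This is not text2epub's sanitizer: the tag set comes from the examples in this README, while the attribute set and the names `InlineChecker` and `inline_problems` are assumptions for illustration.

```python
from html.parser import HTMLParser

# Assumed allowlists for illustration; the library's real lists may differ.
SAFE_TAGS = {"em", "strong", "span", "a"}
SAFE_ATTRS = {"href", "class", "title", "lang"}


class InlineChecker(HTMLParser):
    """Collect reasons why a fragment is not safe inline markup."""

    def __init__(self) -> None:
        super().__init__()
        self.problems: list[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag not in SAFE_TAGS:
            self.problems.append(f"tag <{tag}> is not allowed")
        for name, _value in attrs:
            if name not in SAFE_ATTRS:
                self.problems.append(f"attribute {name!r} is not allowed on <{tag}>")


def inline_problems(fragment: str) -> list[str]:
    """Return a list of problems; an empty list means the fragment passed."""
    checker = InlineChecker()
    checker.feed(fragment)
    checker.close()
    return checker.problems
```

A production check would also need to inspect attribute values, for example rejecting `javascript:` URLs in `href`, which this sketch does not do.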

Rebuild API

Pass a ReplacementPlan to rebuild_epub to apply text replacements to an existing EPUB. The call returns a report of what changed:

from pathlib import Path

from text2epub import Replacement, ReplacementPlan, rebuild_epub

report = rebuild_epub(
    ReplacementPlan(
        source_epub=Path("source.epub"),
        extraction_manifest=Path("manifest.json"),
        replacements=[
            Replacement(
                block_id="spine-0001:block-000001",
                text="Translated paragraph.",
            )
        ],
    ),
    Path("rebuilt.epub"),
)
print(report.changed_entries)
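To check independently that a rebuild touched only the entries you expected, you can compare the two archives with the standard library. The sketch below does not depend on the shape of `report.changed_entries`, which is not documented here; `differing_entries` is a hypothetical helper, and it compares decompressed entry content rather than raw archive bytes.

```python
import zipfile
from pathlib import Path


def differing_entries(source: Path, rebuilt: Path) -> list[str]:
    """Return names of entries that differ or exist in only one archive."""
    with zipfile.ZipFile(source) as old, zipfile.ZipFile(rebuilt) as new:
        old_names = set(old.namelist())
        new_names = set(new.namelist())
        # Entries present in only one of the two archives.
        differing = old_names ^ new_names
        # Entries present in both whose decompressed content differs.
        for name in old_names & new_names:
            if old.read(name) != new.read(name):
                differing.add(name)
    return sorted(differing)
```

For example, `differing_entries(Path("source.epub"), Path("rebuilt.epub"))` should list only the content documents that contain replaced blocks.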

CLI

text2epub markdown INPUT.md -o OUTPUT.epub
text2epub markdown CHAPTER_DIR -o OUTPUT.epub --title "Book" --language en
text2epub markdown CHAPTER_DIR -o OUTPUT.epub --title-page --toc-page --toc-page-numbers
text2epub markdown CHAPTER_DIR -o OUTPUT.epub --allow-inline-xhtml
text2epub rebuild SOURCE.epub MANIFEST.json REPLACEMENTS.json -o OUTPUT.epub
text2epub validate OUTPUT.epub
text2epub version
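For a sense of what a basic package check looks at, the sketch below tests two container rules from the EPUB specification: the `mimetype` entry comes first, uncompressed, with the expected content, and `META-INF/container.xml` exists. It is not the implementation behind `text2epub validate`, whose exact checks are not listed here; `basic_container_problems` is a hypothetical name.

```python
import zipfile
from pathlib import Path


def basic_container_problems(epub: Path) -> list[str]:
    """Return a list of container-level problems; empty means none found."""
    problems: list[str] = []
    with zipfile.ZipFile(epub) as archive:
        infos = archive.infolist()
        if not infos or infos[0].filename != "mimetype":
            problems.append("mimetype must be the first entry")
        elif infos[0].compress_type != zipfile.ZIP_STORED:
            problems.append("mimetype must be stored uncompressed")
        elif archive.read("mimetype") != b"application/epub+zip":
            problems.append("mimetype has unexpected content")

        if "META-INF/container.xml" not in archive.namelist():
            problems.append("META-INF/container.xml is missing")
    return problems
```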

Documentation

The Sphinx documentation lives under docs/. Build it locally with:

python -m pip install -e ".[docs]"
python -m sphinx -b html docs docs/_build/html

Start with docs/index.md for the user guide and docs/release-checklist.md for release validation.

Page numbers in EPUB readers

EPUB files do not contain universal static page numbers. When toc_page_numbers=True or --toc-page-numbers is used, text2epub writes CSS using target-counter() on the generated contents page. Reading systems with paged-media counter support can fill those numbers automatically; other readers still display the linked contents entries without page numbers.
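To make the mechanism concrete, the sketch below builds the kind of CSS rule involved. The stylesheet text2epub actually writes is not shown in this README, so the selector `nav.toc a` and the helper `toc_page_number_css` are hypothetical; only the `target-counter()` function itself comes from the CSS paged-media drafts.

```python
# Hypothetical selector; the class names text2epub emits are not documented here.
TOC_LINK_SELECTOR = "nav.toc a"


def toc_page_number_css(selector: str = TOC_LINK_SELECTOR) -> str:
    """Return a CSS rule that appends the link target's page number."""
    return (
        f"{selector}::after {{\n"
        # target-counter() looks up the 'page' counter at the element the
        # link's href points to; readers without support ignore the rule.
        '  content: " " target-counter(attr(href), page);\n'
        "}\n"
    )
```

Because the number is resolved by the reading system at layout time, the same EPUB can show different page numbers on different screens, or none at all.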
