Convert semantic HTML to dyslexia-friendly readable HTML

Decant

Decant is a free, open-source tool that converts web articles into clean, accessible HTML documents styled for readers with dyslexia and related conditions. It strips site chrome, extracts article content, and produces a single portable document with accessibility-focused typography -- printable, offline-readable, and ready to hand to a student or email to a teacher.

Many web pages are technically readable but visually fatiguing for dyslexic readers due to cramped line length, poor spacing, and layouts that prioritize appearance over readability. Browser Reader Mode helps but is ephemeral -- you cannot save it, print it with controlled typography, or share it reliably. Decant produces actual documents.

Try it

Web: decant.cc -- paste a URL or upload an HTML file.

CLI:

pip install decant-cli
decant input.html
decant input.html -o output.html
decant input.html --font opendyslexic

Print to PDF: open the output file in a browser and save it as a PDF from the browser's print dialog.
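The CLI commands shown above can be scripted to convert a whole folder of saved pages. The following is a minimal sketch, not part of Decant itself. It uses only the `-o` and `--font` flags shown above and assumes they can be combined in one invocation. The function names, the `.readable.html` output naming, and the reliance on a non-zero exit code to signal failure are all assumptions made for illustration.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_command(source: Path, out_dir: Path, font: Optional[str] = None) -> List[str]:
    """Build one decant command line from the flags shown in this README."""
    # Hypothetical naming scheme: article.html -> article.readable.html
    target = out_dir / f"{source.stem}.readable.html"
    cmd = ["decant", str(source), "-o", str(target)]
    if font is not None:
        # Assumption: --font may be combined with -o in a single call.
        cmd += ["--font", font]
    return cmd


def convert_folder(in_dir: Path, out_dir: Path, font: Optional[str] = None) -> None:
    """Run decant once per .html file in in_dir (requires decant-cli on PATH)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for source in sorted(in_dir.glob("*.html")):
        # check=True raises CalledProcessError on a non-zero exit status;
        # that decant reports failures this way is an assumption.
        subprocess.run(build_command(source, out_dir, font), check=True)
```

Calling `convert_folder(Path("saved_pages"), Path("readable"), font="opendyslexic")` would then write one converted file per input into `readable/`.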

What it handles

  • Articles, blog posts, and long-form prose with semantic HTML
  • Headings, paragraphs, lists, blockquotes, code blocks, images, tables
  • Inline emphasis, strong, code, and links
  • Automatic content extraction via Trafilatura
  • Strict sanitization of active content

What it does not handle

  • JavaScript-rendered pages or SPAs
  • Login-required content (save the page as HTML and upload it instead)
  • PDF, DOCX, or Markdown input
  • Layout-heavy or table-dominant reference pages

These are documented boundaries, not defects. When Decant encounters content outside its scope, it says so clearly.
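Because PDF, DOCX, and Markdown inputs are listed as out of scope, a batch workflow may want to set such files aside before invoking the tool. The sketch below is a hypothetical user-side helper, not Decant's own scope check. It looks only at the file extension, and the suffix list and messages are assumptions.

```python
from pathlib import Path
from typing import Optional

# Hypothetical mapping from file suffix to the format named as unsupported above.
UNSUPPORTED = {
    ".pdf": "PDF",
    ".docx": "DOCX",
    ".md": "Markdown",
    ".markdown": "Markdown",
}


def unsupported_reason(path: Path) -> Optional[str]:
    """Return a short reason if the suffix marks an unsupported format, else None."""
    kind = UNSUPPORTED.get(path.suffix.lower())
    if kind is None:
        return None
    return f"{path.name}: {kind} input is outside Decant's documented scope"
```

A file that passes this check can still be out of scope (for example, a saved page whose content is rendered by JavaScript), so the helper only filters the obvious cases.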

Output

A single self-contained HTML file:

  • No external CSS, fonts, or scripts
  • Images reference original source URLs
  • OpenDyslexic font embedded when enabled
  • Designed for screen, print, and offline reading
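The properties listed above can be spot-checked on an output file with the Python standard library. This is a rough sketch rather than Decant's own test code. The class and function names are hypothetical, and the check is not exhaustive (it does not inspect `url(...)` or `@import` references inside inline CSS, for instance).

```python
from html.parser import HTMLParser


class ExternalResourceAudit(HTMLParser):
    """Collect tags that would make an HTML file depend on other files."""

    def __init__(self) -> None:
        super().__init__()
        self.scripts = 0                 # number of <script> tags seen
        self.external_stylesheets = []   # href values of <link rel="stylesheet">
        self.image_urls = []             # src values of <img>

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "script":
            self.scripts += 1
        elif tag == "link" and "stylesheet" in (attr.get("rel") or "").lower().split():
            self.external_stylesheets.append(attr.get("href"))
        elif tag == "img" and attr.get("src"):
            self.image_urls.append(attr["src"])


def audit(html_text: str) -> ExternalResourceAudit:
    """Parse html_text and return the populated audit object."""
    parser = ExternalResourceAudit()
    parser.feed(html_text)
    parser.close()
    return parser
```

For a file matching the description above, `audit(text).scripts` should be `0` and `external_stylesheets` should be empty, while `image_urls` should list the original source URLs of the article's images.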

Testing

  • Unit and integration tests (pytest)
  • 47-fixture real-world corpus with metrics-based regression tracking
  • ScrapingHub Article Extraction Benchmark (181 pages)
  • Webis Web Content Extraction Benchmark (3,985 pages)

Status

Pre-release. Live at decant.cc for early testing. Published on PyPI as decant-cli.

License

MIT
