
HTML page splitter (preserves tags), text page splitter (natural breaks), and chapter detector for pure text.

Project description

pagesmith

Splitting HTML into pages while preserving HTML tags and respecting the original document structure. Parsing uses the fast lxml parser.
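To illustrate what "preserving tags" means when a page boundary falls inside nested elements, here is a minimal sketch. It is not pagesmith's API or implementation: it uses the standard library's html.parser instead of lxml so that it runs without dependencies, and the names TagPreservingSplitter, split_html, and page_size are hypothetical. The idea is to track the stack of open tags, close them at the end of each page, and reopen them at the start of the next one.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never take a closing tag, so they are not tracked as "open".
VOID_TAGS = {"area", "br", "col", "embed", "hr", "img", "input", "link", "meta", "source", "wbr"}


class TagPreservingSplitter(HTMLParser):
    """Hypothetical helper: cut HTML into pages of at most `page_size` text characters."""

    def __init__(self, page_size: int) -> None:
        super().__init__(convert_charrefs=True)
        if page_size < 1:
            raise ValueError("page_size must be positive")
        self.page_size = page_size
        self.pages: list[str] = []
        self._buf: list[str] = []               # markup collected for the current page
        self._text_len = 0                      # visible text characters on the current page
        self._open: list[tuple[str, str]] = []  # (tag name, raw start tag) still open

    def _flush(self) -> None:
        # Close every open element, store the page, then reopen them for the next page.
        closing = "".join(f"</{name}>" for name, _ in reversed(self._open))
        self.pages.append("".join(self._buf) + closing)
        self._buf = [raw for _, raw in self._open]
        self._text_len = 0

    def handle_starttag(self, tag, attrs):
        raw = self.get_starttag_text()  # raw text keeps the original attributes
        self._buf.append(raw)
        if tag not in VOID_TAGS:
            self._open.append((tag, raw))

    def handle_startendtag(self, tag, attrs):
        self._buf.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self._buf.append(f"</{tag}>")
        # Pop back to the nearest matching open tag, if there is one.
        for i in range(len(self._open) - 1, -1, -1):
            if self._open[i][0] == tag:
                del self._open[i:]
                break

    def handle_data(self, data):
        while data:
            room = self.page_size - self._text_len
            if room <= 0:
                self._flush()
                continue
            if len(data) <= room:
                cut = len(data)
            else:
                # Break after the last space that fits, or hard-cut if there is none.
                cut = data.rfind(" ", 0, room) + 1 or room
            self._buf.append(escape(data[:cut], quote=False))
            self._text_len += cut
            data = data[cut:]
            if data:
                self._flush()

    def finish(self) -> list[str]:
        self.close()
        if self._text_len:
            self._flush()
        return self.pages


def split_html(html_text: str, page_size: int) -> list[str]:
    splitter = TagPreservingSplitter(page_size)
    splitter.feed(html_text)
    return splitter.finish()
```

Calling split_html("<div><p>Hello <b>bold world</b> again</p></div>", 12) yields two pages, each of which is well-formed on its own. The sketch counts only text characters toward the page size, drops comments and doctype declarations, and does not treat script or style content specially.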

Splitting pure text into pages at natural break points such as paragraphs or sentences.
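A rough sketch of splitting at natural break points follows. It is an illustration under stated assumptions, not pagesmith's implementation: the function name split_text and the parameters target and tolerance are hypothetical. Within a window around the target page length it looks first for a paragraph break, then for a sentence end, then for any whitespace, and cuts at the candidate closest to the target.

```python
import re

# Break patterns in order of preference: paragraph, sentence end, any whitespace.
BREAK_PATTERNS = (r"\n\s*\n", r"[.!?]\s+", r"\s+")


def split_text(text: str, target: int = 200, tolerance: int = 50) -> list[str]:
    """Split text into pages of roughly `target` characters (hypothetical helper)."""
    pages = []
    start = 0
    while start < len(text):
        # The remainder fits on one page: emit it and stop.
        if len(text) - start <= target + tolerance:
            pages.append(text[start:])
            break
        lo = start + max(target - tolerance, 1)
        hi = start + target + tolerance
        window = text[lo:hi]
        cut = start + target  # fallback: hard cut at the target length
        for pattern in BREAK_PATTERNS:
            matches = list(re.finditer(pattern, window))
            if matches:
                # Choose the break whose end lies closest to the target position.
                best = min(matches, key=lambda m: abs(lo + m.end() - (start + target)))
                cut = lo + best.end()
                break
        pages.append(text[start:cut])
        start = cut
    return pages
```

Because each page is a contiguous slice, joining the pages reproduces the input exactly, and no page exceeds target + tolerance characters.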

Detecting chapters in pure text to create a table of contents.
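Chapter detection in plain text is usually heuristic. The sketch below assumes headings of the form "Chapter 1", "CHAPTER II: Title", "Part 3", and so on, at the start of a line; the pattern and the function name detect_chapters are assumptions made for illustration and do not describe pagesmith's actual rules.

```python
import re

# Assumed heading shape: keyword, then an Arabic or Roman numeral, then an optional title.
CHAPTER_RE = re.compile(
    r"^[ \t]*(chapter|part|book)[ \t]+(\d+|[ivxlcdm]+)\b[ \t.:-]*(.*)$",
    re.IGNORECASE | re.MULTILINE,
)


def detect_chapters(text: str) -> list[tuple[int, str]]:
    """Return (character offset, heading line) pairs for likely chapter headings."""
    return [(m.start(), m.group(0).strip()) for m in CHAPTER_RE.finditer(text)]
```

The offsets can be mapped to page numbers after splitting to build a table of contents. A pattern this simple will produce false positives, for example a sentence that begins a line with "Part I", so real text needs additional checks.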

Documentation

Pagesmith

Developers

Do not forget to run . ./activate.sh before you start.

It requires uv to be installed.

Use pre-commit hooks for code quality:

pre-commit install

Allure test report

Scripts

Install invoke, preferably with pipx:

pipx install invoke

For a list of available scripts run:

invoke --list

For more information about a script run:

invoke <script> --help

Coverage report

Created with cookiecutter using template

Download files

Source Distribution

pagesmith-1.6.0.tar.gz (115.5 kB view details)

Uploaded Source

Built Distribution

pagesmith-1.6.0-py3-none-any.whl (17.6 kB view details)

Uploaded Python 3

File details

Details for the file pagesmith-1.6.0.tar.gz.

File metadata

  • Download URL: pagesmith-1.6.0.tar.gz
  • Upload date:
  • Size: 115.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for pagesmith-1.6.0.tar.gz
Algorithm Hash digest
SHA256 c35b159786670ad9d547aed9f78e74b6fef1de8e22b4c25f5c79d2fe8e4fdfef
MD5 182e392fd2aa1c72e4fc5fb358bef64e
BLAKE2b-256 384467d92afd0f5ba1102ea68cb371675d22be37022809403a20ff436cd8438e

File details

Details for the file pagesmith-1.6.0-py3-none-any.whl.

File metadata

  • Download URL: pagesmith-1.6.0-py3-none-any.whl
  • Upload date:
  • Size: 17.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for pagesmith-1.6.0-py3-none-any.whl
Algorithm Hash digest
SHA256 e4a2ac32ce4cd9d1c8c26807c91bf5ed9318d3c6486eadafba849f19faa500f2
MD5 a7046e8c8d223dc68a5263ea93e98251
BLAKE2b-256 00601670d0aeae79d5a832dfdf09a70ee0b658115c1e9f5c28f938ebd6e7eea1
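The SHA256 digests listed above can be checked against a downloaded file before installing it. This is a minimal sketch using only the standard library; the helper names sha256_of and verify are hypothetical, and the expected values are copied from the hash listings for the two 1.6.0 files.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file details above.
EXPECTED_SHA256 = {
    "pagesmith-1.6.0.tar.gz": "c35b159786670ad9d547aed9f78e74b6fef1de8e22b4c25f5c79d2fe8e4fdfef",
    "pagesmith-1.6.0-py3-none-any.whl": "e4a2ac32ce4cd9d1c8c26807c91bf5ed9318d3c6486eadafba849f19faa500f2",
}


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path) -> bool:
    """Return True only if the file name is known and its digest matches."""
    expected = EXPECTED_SHA256.get(path.name)
    return expected is not None and sha256_of(path) == expected
```

For example, verify(Path("pagesmith-1.6.0.tar.gz")) returns False for a file whose contents differ from the published archive, and also for any file name that is not in the table.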
