
Convert PDF files to HTML using pdfminer and BeautifulSoup


pdf2html


Convert PDF files to simple, readable HTML using a command-line tool.

Features

  • Converts single PDFs or entire folders
  • Retains original filenames
  • Simple, semantic HTML output
  • CLI-friendly and pip-installable
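
As a rough illustration of what "simple, semantic HTML output" could look like, here is a minimal sketch that wraps already-extracted text in a small HTML document. It uses only the standard library; the function name `text_to_html` and the one-paragraph-per-blank-line rule are assumptions made for this example and are not taken from pdf2html itself, which builds on pdfminer and BeautifulSoup.

```python
from html import escape


def text_to_html(text: str, title: str) -> str:
    """Wrap plain text in a minimal HTML document, one <p> per paragraph.

    Hypothetical helper for illustration; not part of pdf2html.
    """
    # Assumption: blank lines separate paragraphs; inner whitespace is collapsed.
    paragraphs = [" ".join(block.split()) for block in text.split("\n\n")]
    # Escape the text so characters like < and & stay valid in HTML.
    body = "\n".join(f"<p>{escape(p)}</p>" for p in paragraphs if p)
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        f'<head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        f"<body>\n<main>\n{body}\n</main>\n</body>\n"
        "</html>\n"
    )
```

In a real pipeline the input text might come from pdfminer.six, for example via `pdfminer.high_level.extract_text`, but how pdf2html actually wires this together is not stated here.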

Installation

Option 1: Local install (run from the project root)

pip install .

Option 2: pipx (recommended)

pipx install path/to/pdf2html/

Usage

Convert a single file:

pdf2html path/to/file.pdf -o output_folder

Convert all PDFs in a folder:

pdf2html path/to/folder -o output_folder
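
The two invocations above differ only in whether the input is a file or a folder. The sketch below shows one way a tool like this might map inputs to output files while retaining the original filenames. `plan_outputs` is a hypothetical helper, not part of pdf2html, and the non-recursive folder scan is an assumption.

```python
from pathlib import Path
from typing import List, Tuple


def plan_outputs(input_path: str, output_dir: str) -> List[Tuple[Path, Path]]:
    """Pair each input PDF with the HTML path it would be written to.

    Hypothetical helper for illustration; not part of the pdf2html API.
    """
    src = Path(input_path)
    out = Path(output_dir)

    if src.is_dir():
        # Assumption: only PDFs directly inside the folder, no recursion.
        pdfs = sorted(
            p for p in src.iterdir()
            if p.is_file() and p.suffix.lower() == ".pdf"
        )
    elif src.suffix.lower() == ".pdf":
        pdfs = [src]
    else:
        raise ValueError(f"not a PDF file or folder: {src}")

    # Keep the original filename stem and swap the extension for .html.
    return [(pdf, out / (pdf.stem + ".html")) for pdf in pdfs]
```

Under these assumptions, `plan_outputs("path/to/file.pdf", "output_folder")` pairs the input with `output_folder/file.html`.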

Requirements

  • Python 3.8+
  • pdfminer.six
  • beautifulsoup4
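
Note that both dependencies are imported under a different name than the one used to install them: pdfminer.six is imported as `pdfminer`, and beautifulsoup4 as `bs4`. A small sketch for checking an environment against the list above; the helper names `missing_requirements` and `python_ok` are made up for this example.

```python
import sys
from importlib.util import find_spec
from typing import List, Tuple

# Distribution name -> import name (they differ for both packages).
REQUIRED = {"pdfminer.six": "pdfminer", "beautifulsoup4": "bs4"}


def missing_requirements() -> List[str]:
    """Return the distribution names whose import package cannot be found."""
    return [dist for dist, module in REQUIRED.items() if find_spec(module) is None]


def python_ok(minimum: Tuple[int, int] = (3, 8)) -> bool:
    """Check that the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    print("Python 3.8+:", python_ok())
    print("Missing packages:", missing_requirements() or "none")
```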

License

MIT
