Convert PDF files to HTML using pdfminer and BeautifulSoup
pdf2html
Convert PDF files to simple, readable HTML using a command-line tool.
Features
- Converts single PDFs or entire folders
- Retains original filenames
- Simple, semantic HTML output
- CLI-friendly and pip-installable
Installation
Option 1: Local install
pip install .
Option 2: pipx (recommended)
pipx install path/to/pdf2html/
Usage
Convert a single file:
pdf2html path/to/file.pdf -o output_folder
Convert all PDFs in a folder:
pdf2html path/to/folder -o output_folder
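The two invocations above differ only in whether the input is a file or a folder. The README does not describe how inputs are mapped to outputs, so the following is only a sketch of one plausible approach. It assumes each `name.pdf` becomes `name.html` in the output folder and that folders are scanned non-recursively. The function name `plan_outputs` is hypothetical and not part of pdf2html.

```python
from pathlib import Path


def plan_outputs(source, output_dir):
    """Return (pdf_path, html_path) pairs for a single file or a folder.

    Hypothetical helper: pdf2html's real internals may differ.
    """
    source = Path(source)
    output_dir = Path(output_dir)
    if source.is_dir():
        # Assumption: only top-level *.pdf files are picked up.
        pdfs = sorted(
            p for p in source.iterdir()
            if p.is_file() and p.suffix.lower() == ".pdf"
        )
    else:
        pdfs = [source]
    # Keep the original base name and swap only the extension.
    return [(pdf, output_dir / (pdf.stem + ".html")) for pdf in pdfs]
```

Calling `plan_outputs("path/to/folder", "output_folder")` would yield one pair per PDF, which a converter could then loop over.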
Requirements
- Python 3.8+
- pdfminer.six
- beautifulsoup4
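As a rough illustration of how the two libraries listed above might be combined, the sketch below extracts text with `pdfminer.high_level.extract_text` and builds paragraph markup with BeautifulSoup. This is an assumption about the approach, not pdf2html's actual code. `split_paragraphs` and `pdf_to_html` are hypothetical names.

```python
import re


def split_paragraphs(text):
    """Split extracted text into paragraphs on blank lines."""
    blocks = re.split(r"\n\s*\n", text)
    # Collapse internal whitespace so hard line wraps become single spaces.
    return [" ".join(b.split()) for b in blocks if b.strip()]


def pdf_to_html(pdf_path):
    """Return a simple HTML document for one PDF (hypothetical helper)."""
    # Imported lazily so split_paragraphs works without the dependencies.
    from bs4 import BeautifulSoup
    from pdfminer.high_level import extract_text

    soup = BeautifulSoup("<html><body></body></html>", "html.parser")
    for paragraph in split_paragraphs(extract_text(pdf_path)):
        tag = soup.new_tag("p")
        tag.string = paragraph  # BeautifulSoup escapes &, <, > on output
        soup.body.append(tag)
    return soup.prettify()
```

Splitting on blank lines is a deliberately simple heuristic. PDFs with multi-column layouts or tables would likely need pdfminer's layout parameters tuned, which this sketch does not attempt.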
License
MIT
File details
Details for the file pdf2html-0.1.0.tar.gz.
File metadata
- Download URL: pdf2html-0.1.0.tar.gz
- Upload date:
- Size: 3.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d883ba71bf23b3528fcd945712e3118809a931a14d4eaac4e9f9d8e14601284e |
| MD5 | 07fb1210cb0326154743acfe8e67f641 |
| BLAKE2b-256 | f191d2a03636af135ea260646e1b6ffd089848830d76aa7d5f975ea8c7f708d2 |
File details
Details for the file pdf2html-0.1.0-py3-none-any.whl.
File metadata
- Download URL: pdf2html-0.1.0-py3-none-any.whl
- Upload date:
- Size: 3.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 09ea262912ce73e5fd2105272abd8c2b2312e31fae6fd9a298da7e8ef3accf6c |
| MD5 | 74522a42d923cf318879bc7650a2a8be |
| BLAKE2b-256 | 3dcf83a9ecb93f007846ae8f054def2a4ea6f95140e2d7bd9d624ea01f321dd0 |