A comprehensive toolkit for PDF manipulation

python-pdf-toolkit

A Python package for PDF manipulation: compression, conversion to Excel and Word, encryption and decryption, and merging.

Features

  • PDF Compression: Reduce PDF file size while maintaining quality
  • PDF to Excel Conversion: Extract tables from PDFs and convert to Excel format
  • PDF to Word Conversion: Convert PDFs to editable Word documents
  • PDF Merging: Combine multiple PDFs into a single document
  • PDF Encryption/Decryption: Secure your PDFs with password protection
  • Logging Support: Console logging and Discord webhook integration

Installation

# Basic installation
pip install python-pdf-toolkit

# With Excel conversion support
# (the quotes keep shells such as zsh from interpreting the square brackets)
pip install "python-pdf-toolkit[excel]"

# With Word conversion support
pip install "python-pdf-toolkit[word]"

# With Discord logging support
pip install "python-pdf-toolkit[discord]"

# With all optional dependencies
pip install "python-pdf-toolkit[all]"

Usage

Python API

from python_pdf_toolkit import PDFToolkit

# Initialize the toolkit
toolkit = PDFToolkit()

# Compress a PDF
compressed_pdf = toolkit.compressor.compress(
    "input.pdf", 
    "compressed.pdf",
    compression_level=7
)

# Convert PDF to Excel
excel_data = toolkit.excel_converter.convert(
    "input.pdf", 
    "output.xlsx"
)

# Convert PDF to Word
word_doc = toolkit.word_converter.convert(
    "input.pdf", 
    "output.docx"
)

# Merge PDFs
merged_pdf = toolkit.merger.merge(
    ["file1.pdf", "file2.pdf", "file3.pdf"], 
    "merged.pdf"
)

# Encrypt a PDF
encrypted_pdf = toolkit.encryptor.encrypt(
    "input.pdf", 
    "your_password", 
    "encrypted.pdf"
)

# Decrypt a PDF
decrypted_pdf = toolkit.encryptor.decrypt(
    "encrypted.pdf", 
    "your_password", 
    "decrypted.pdf"
)
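
When the same operation has to run over many files, the calls above can be wrapped in a small batch helper. The following is a minimal sketch and not part of the package: batch_convert and its parameters are hypothetical names, and the only assumption is that the converter is a callable taking an input path and an output path, as toolkit.word_converter.convert does in the example above.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, Union


def batch_convert(
    inputs: Iterable[Union[str, Path]],
    out_dir: Union[str, Path],
    new_suffix: str,
    convert: Callable[[str, str], object],
) -> Dict[str, Union[str, Exception]]:
    """Run convert(src, dst) for every input and collect the outcome per file.

    Hypothetical helper: maps each input path to its output path on success,
    or to the raised exception on failure, so one bad PDF does not stop the batch.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    results: Dict[str, Union[str, Exception]] = {}
    for src in map(Path, inputs):
        # Derive the output name from the input name, e.g. a.pdf -> a.docx
        dst = out_dir / (src.stem + new_suffix)
        try:
            convert(str(src), str(dst))
            results[str(src)] = str(dst)
        except Exception as exc:  # record the failure and continue with the next file
            results[str(src)] = exc
    return results


# Assumed usage with the toolkit from the example above:
# toolkit = PDFToolkit()
# batch_convert(["a.pdf", "b.pdf"], "converted", ".docx", toolkit.word_converter.convert)
```

Passing the converter in as a callable keeps the helper independent of any particular toolkit component, so the same loop could serve the Excel converter or the compressor if their call shapes match.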

Logging

The setup_logger function configures console logging and, optionally, Discord webhook logging:

from python_pdf_toolkit.logger import setup_logger

# Set up a standard console logger
logger = setup_logger(
    name="MyPDFApp",
    level="INFO"  # Available levels: DEBUG, INFO, WARNING, ERROR, CRITICAL
)

# Log messages
logger.info("Starting PDF processing")
logger.warning("Low disk space for output file")
logger.error("Failed to process PDF file")

# Set up Discord webhook logging (requires discord-logger-handler package)
discord_logger = setup_logger(
    name="MyPDFApp",
    level="INFO",
    discord_webhook="https://discord.com/api/webhooks/your_webhook_url"
)

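# Log messages with additional context
# (the standard library's logging.Logger does not accept arbitrary keyword
# arguments such as file_name, so these calls rely on the logger returned
# by setup_logger supporting them)
discord_logger.info("PDF processed successfully", file_name="example.pdf", pages=5)
discord_logger.error("Processing failed", error_code=500, file_path="/path/to/file.pdf")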
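
For comparison, the same "message plus context" calling style can be reproduced with the standard library alone. The sketch below is an illustration, not the package's implementation: ContextLogger is a hypothetical name, and it uses logging.LoggerAdapter to fold extra keyword arguments into the message text.

```python
import logging


class ContextLogger(logging.LoggerAdapter):
    """Hypothetical adapter: append extra keyword arguments to the message."""

    # Keyword arguments that the standard logging methods already understand
    _RESERVED = {"exc_info", "stack_info", "stacklevel", "extra"}

    def process(self, msg, kwargs):
        # Pull out every non-standard keyword argument as context
        context = {k: kwargs.pop(k) for k in list(kwargs) if k not in self._RESERVED}
        if context:
            details = ", ".join(f"{k}={v!r}" for k, v in context.items())
            msg = f"{msg} ({details})"
        return msg, kwargs


logging.basicConfig(level=logging.INFO)
log = ContextLogger(logging.getLogger("MyPDFApp"), {})
log.info("PDF processed successfully", file_name="example.pdf", pages=5)
```

The adapter leaves the underlying logger untouched, so any handlers attached to it receive the already formatted message.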

Command Line Interface

# Compress a PDF
pdftoolkit compress input.pdf compressed.pdf --level 7

# Convert PDF to Excel
pdftoolkit to-excel input.pdf output.xlsx

# Convert PDF to Word
pdftoolkit to-word input.pdf output.docx

# Merge PDFs
pdftoolkit merge file1.pdf file2.pdf file3.pdf merged.pdf

# Encrypt a PDF
pdftoolkit encrypt input.pdf encrypted.pdf --password your_password

# Decrypt a PDF
pdftoolkit decrypt encrypted.pdf decrypted.pdf --password your_password
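
The CLI can also be driven from a script. The following is a minimal sketch using only the standard library; build_merge_command and run_merge are hypothetical helper names, and the argument order simply mirrors the merge command shown above.

```python
import shutil
import subprocess
from typing import List, Sequence


def build_merge_command(inputs: Sequence[str], output: str) -> List[str]:
    """Build the argument list for `pdftoolkit merge`, inputs first, output last."""
    if not inputs:
        raise ValueError("at least one input PDF is required")
    return ["pdftoolkit", "merge", *inputs, output]


def run_merge(inputs: Sequence[str], output: str) -> None:
    """Run the merge command; raises CalledProcessError on a non-zero exit code."""
    if shutil.which("pdftoolkit") is None:
        raise FileNotFoundError("pdftoolkit was not found on PATH")
    # Passing a list (not a shell string) avoids quoting problems with file names
    subprocess.run(build_merge_command(inputs, output), check=True)


print(build_merge_command(["file1.pdf", "file2.pdf", "file3.pdf"], "merged.pdf"))
```

Keeping the command construction separate from its execution makes the argument order easy to check without the CLI installed.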

Optional Dependencies

  • pdfplumber: Required for PDF to Excel conversion
  • pdf2docx: Required for PDF to Word conversion
  • discord-logger-handler: Required for Discord logging integration
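
Before calling a converter, a script can check whether the matching optional dependency is installed. This is a minimal sketch using importlib from the standard library; available_features and OPTIONAL_MODULES are hypothetical names, and the Discord handler is left out because its import name is not stated here.

```python
from importlib.util import find_spec
from typing import Dict

# Feature name -> importable module name, taken from the list above
OPTIONAL_MODULES = {"excel": "pdfplumber", "word": "pdf2docx"}


def available_features(modules: Dict[str, str] = OPTIONAL_MODULES) -> Dict[str, bool]:
    """Report, per feature, whether its optional module can be imported."""
    return {feature: find_spec(module) is not None for feature, module in modules.items()}


for feature, installed in available_features().items():
    print(f"{feature}: {'installed' if installed else 'missing'}")
```

find_spec only looks the module up without importing it, so the check stays cheap even for heavy dependencies.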

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.
