Advanced PDF to PowerPoint converter with high fidelity

PDFToPPT - Advanced PDF to PowerPoint Converter

A Python package for converting PDF documents to PowerPoint presentations. It preserves page layouts, images, text formatting, and basic vector graphics as native PowerPoint objects.

Features

High Fidelity Conversion: Preserves original PDF layouts, fonts, colors, and formatting
Vector Graphics Support: Converts PDF vector elements (lines, rectangles) to PowerPoint shapes
Image Preservation: Extracts and embeds images with transparency support
Text Formatting: Maintains font styles, sizes, colors, bold, and italic formatting
Custom Page Ranges: Convert specific pages or page ranges
Batch Processing: Process multiple pages efficiently
Command Line Interface: Easy-to-use CLI for automation
Python API: Full programmatic access for integration

Installation

From PyPI (Recommended)

pip install pdftoppt

From Source

git clone https://github.com/amitpanda007/pdftoppt.git
cd pdftoppt
pip install -e .

Dependencies

Python 3.7+
PyMuPDF (fitz) >= 1.20.0
python-pptx >= 0.6.18
Pillow >= 8.0.0

Quick Start

Command Line Usage

# Convert entire PDF
pdftoppt input.pdf output.pptx

# Convert specific page range
pdftoppt input.pdf output.pptx --pages 1-5

# Convert with verbose logging
pdftoppt input.pdf output.pptx --verbose

# Get help
pdftoppt --help
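The `--pages 1-5` form above appears to correspond to the `page_range=(1, 5)` tuple accepted by the Python API. For scripts that take the same string form as input, a minimal sketch of a parser is shown here. `parse_page_range` is a hypothetical helper, not part of pdftoppt, and it assumes pages are 1-based and inclusive.

```python
from typing import Tuple


def parse_page_range(spec: str) -> Tuple[int, int]:
    """Parse "START-END" (or a single page "N") into a (start, end) tuple.

    Hypothetical helper for illustration; assumes 1-based, inclusive pages.
    """
    start_text, sep, end_text = spec.strip().partition("-")
    start = int(start_text)                 # raises ValueError on non-numeric input
    end = int(end_text) if sep else start   # "7" means the single page 7
    if start < 1 or end < start:
        raise ValueError(f"Invalid page range: {spec!r}")
    return start, end


print(parse_page_range("1-5"))
```

The resulting tuple can then be passed as `page_range` to `convert()`.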

Python API Usage

from pdftoppt import AdvancedPDFToPowerPointConverter

# Basic conversion
with AdvancedPDFToPowerPointConverter() as converter:
    success = converter.convert("input.pdf", "output.pptx")
    print(f"Conversion successful: {success}")
    print(f"Slides created: {converter.slides_created}")

# Convert specific pages
with AdvancedPDFToPowerPointConverter() as converter:
    success = converter.convert(
        pdf_path="input.pdf",
        output_path="output.pptx",
        page_range=(1, 5)  # Convert pages 1-5
    )

# With error handling (the context manager cleans up temporary files)
try:
    with AdvancedPDFToPowerPointConverter() as converter:
        converter.convert("input.pdf", "output.pptx")
except FileNotFoundError:
    print("PDF file not found")
except ValueError as e:
    print(f"Invalid parameters: {e}")

Advanced Usage

Logging Configuration

import logging
from pdftoppt import AdvancedPDFToPowerPointConverter

# Enable debug logging
logging.basicConfig(level=logging.DEBUG)

converter = AdvancedPDFToPowerPointConverter()
converter.convert("input.pdf", "output.pptx")

Batch Processing

from pathlib import Path
from pdftoppt import AdvancedPDFToPowerPointConverter

def batch_convert_pdfs(input_dir, output_dir):
    """Convert all PDFs in a directory."""
    input_path = Path(input_dir)
    output_path = Path(output_dir)
    # Create the output directory (and any missing parents) if needed
    output_path.mkdir(parents=True, exist_ok=True)

    with AdvancedPDFToPowerPointConverter() as converter:
        for pdf_file in input_path.glob("*.pdf"):
            output_file = output_path / f"{pdf_file.stem}.pptx"
            try:
                success = converter.convert(str(pdf_file), str(output_file))
                print(f"{'✓' if success else '✗'} {pdf_file.name}")
            except Exception as e:
                print(f"✗ {pdf_file.name}: {e}")

# Usage
batch_convert_pdfs("./pdfs", "./presentations")

How It Works

The converter uses a multi-step process to ensure high-fidelity conversion:

PDF Analysis: Extracts text, images, and vector graphics from each PDF page
Element Processing: Processes fonts, colors, positioning, and formatting
PowerPoint Generation: Creates custom-sized presentation matching PDF dimensions
Content Reconstruction: Rebuilds all elements as native PowerPoint objects
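To illustrate the "custom-sized presentation" step: PDF page sizes are measured in points (1/72 inch), while PowerPoint stores lengths in English Metric Units (914,400 EMU per inch), so one point is 12,700 EMU. The sketch below shows that conversion only. The function names are hypothetical, and whether pdftoppt computes slide sizes exactly this way is an assumption.

```python
EMU_PER_POINT = 12700  # 914400 EMU per inch / 72 points per inch


def points_to_emu(points: float) -> int:
    """Convert a length in PDF points to PowerPoint EMU."""
    return round(points * EMU_PER_POINT)


def slide_size_for_page(width_pt: float, height_pt: float):
    """Return (width, height) in EMU for a slide matching a PDF page."""
    return points_to_emu(width_pt), points_to_emu(height_pt)


# US Letter is 612 x 792 points.
width_emu, height_emu = slide_size_for_page(612, 792)
print(width_emu, height_emu)  # 7772400 10058400

# With python-pptx, these values would typically be assigned to
# Presentation().slide_width and Presentation().slide_height.
```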

Supported Elements

✅ Text with formatting (fonts, sizes, colors, bold, italic)
✅ Images (JPEG, PNG) with transparency support
✅ Vector graphics (rectangles, lines)
✅ Colors and fill patterns
✅ Positioning and layouts
⚠️ Complex vector paths (simplified to basic shapes)
❌ Interactive elements (forms, hyperlinks)
❌ Animations and transitions
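As an illustration of how text formatting can be recovered, PyMuPDF's `page.get_text("dict")` output describes each text span with a font name, a size, an sRGB color packed into an integer, and a `flags` bit field in which bit 1 marks italic and bit 4 marks bold. The sketch below decodes such a span dictionary in plain Python. `describe_span` and the sample span are hypothetical, and it is an assumption that pdftoppt reads these fields in the same way.

```python
ITALIC_FLAG = 1 << 1  # bit 1 of the PyMuPDF span flags
BOLD_FLAG = 1 << 4    # bit 4 of the PyMuPDF span flags


def srgb_int_to_rgb(color: int):
    """Split a packed 0xRRGGBB integer into an (r, g, b) tuple."""
    return (color >> 16) & 0xFF, (color >> 8) & 0xFF, color & 0xFF


def describe_span(span: dict) -> dict:
    """Extract the formatting fields needed to rebuild a text run."""
    flags = span.get("flags", 0)
    return {
        "text": span.get("text", ""),
        "font": span.get("font"),
        "size": span.get("size"),
        "rgb": srgb_int_to_rgb(span.get("color", 0)),
        "bold": bool(flags & BOLD_FLAG),
        "italic": bool(flags & ITALIC_FLAG),
    }


# Hand-written sample span shaped like PyMuPDF's output (illustrative values).
sample = {"text": "Title", "font": "Helvetica-Bold", "size": 24.0,
          "color": 0xFF8000, "flags": 16}
print(describe_span(sample))
```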

Performance

Typical conversion speeds:

Simple text documents: ~2-5 pages/second
Image-heavy documents: ~0.5-2 pages/second
Complex mixed content: ~1-3 pages/second

Memory usage scales with document complexity and image content.
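Because throughput depends heavily on the document and the machine, it may be worth measuring it on your own files. The sketch below times an arbitrary callable and reports pages per second. `measure_pages_per_second` is a hypothetical helper, and the page count is supplied by the caller rather than read from the converter.

```python
import time
from typing import Callable


def measure_pages_per_second(convert: Callable[[], object], page_count: int) -> float:
    """Run `convert` once and return page_count divided by elapsed seconds."""
    start = time.perf_counter()
    convert()
    elapsed = time.perf_counter() - start
    return page_count / elapsed if elapsed > 0 else float("inf")


# Stand-in workload; in practice the callable would wrap converter.convert(...).
rate = measure_pages_per_second(lambda: time.sleep(0.01), page_count=5)
print(f"{rate:.1f} pages/second")
```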

Troubleshooting

Common Issues

Import Error for PyMuPDF:

pip install --upgrade PyMuPDF

Memory issues with large PDFs:

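import fitz  # PyMuPDF, used here only to count pages
from pdftoppt import AdvancedPDFToPowerPointConverter
with fitz.open("large.pdf") as doc:
    total_pages = doc.page_count
# Process in smaller page ranges (1-based, inclusive); + 1 keeps the last page
with AdvancedPDFToPowerPointConverter() as converter:
    for start in range(1, total_pages + 1, 10):
        end = min(start + 9, total_pages)
        converter.convert("large.pdf", f"output_part_{start}.pptx",
                          page_range=(start, end))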
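The chunk boundaries for this kind of split can be computed and checked separately from the conversion itself. The sketch below assumes 1-based, inclusive page numbers; `page_chunks` is a hypothetical helper, not part of pdftoppt. Note that the upper bound of the `range` must be `total_pages + 1`, otherwise a trailing page can be skipped (for example, page 21 of a 21-page document).

```python
from typing import Iterator, Tuple


def page_chunks(total_pages: int, chunk_size: int = 10) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) page ranges covering pages 1..total_pages."""
    if total_pages < 0 or chunk_size < 1:
        raise ValueError("total_pages must be >= 0 and chunk_size >= 1")
    for start in range(1, total_pages + 1, chunk_size):
        yield start, min(start + chunk_size - 1, total_pages)


print(list(page_chunks(21)))  # [(1, 10), (11, 20), (21, 21)]
```

Each yielded tuple can be passed as `page_range` to `convert()`.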

Font rendering issues:

Ensure the system has the required fonts installed
Check whether the PDF embeds its fonts

Debug Mode

Enable verbose logging to diagnose issues:

pdftoppt input.pdf output.pptx --verbose

API Reference

AdvancedPDFToPowerPointConverter

Methods

__init__()

Initializes converter with temporary directory for processing

convert(pdf_path, output_path, page_range=None)

Main conversion method
Parameters:
- pdf_path (str): Path to input PDF file
- output_path (str): Path for output PowerPoint file
- page_range (tuple, optional): (start_page, end_page) for partial conversion
Returns: bool - True if successful
Raises: FileNotFoundError, ValueError

Context Manager Support:

with AdvancedPDFToPowerPointConverter() as converter:
    converter.convert("input.pdf", "output.pptx")
# Automatic cleanup

Contributing

We welcome contributions! Please see our Contributing Guide for details.

Development Setup

git clone https://github.com/amitpanda007/pdftoppt.git
cd pdftoppt
pip install -e ".[dev]"

Running Tests

pytest tests/

Code Quality

black pdftoppt/
flake8 pdftoppt/
mypy pdftoppt/

License

This project is licensed under the MIT License - see the LICENSE file for details.

Changelog

v1.0.0

Initial release
High-fidelity PDF to PowerPoint conversion
Support for text, images, and vector graphics
Command-line interface
Python API with context manager support

Support

Acknowledgments

Built with PyMuPDF for PDF processing
Uses python-pptx for PowerPoint generation
Image processing powered by Pillow

Made with ❤️ for the Python community

Release history

This version

1.0.0

Sep 6, 2025

Download files

Source Distribution

pdftoppt-1.0.0.tar.gz (23.2 kB)

Uploaded Sep 6, 2025 Source

Built Distribution

pdftoppt-1.0.0-py3-none-any.whl (12.1 kB)

Uploaded Sep 6, 2025 Python 3

File details

Details for the file pdftoppt-1.0.0.tar.gz.

File metadata

Download URL: pdftoppt-1.0.0.tar.gz
Upload date: Sep 6, 2025
Size: 23.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.9

File hashes

Hashes for pdftoppt-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`53a811c00dd7cf1a323c1c26e533d457ab54c8876e04062f869e9e342add7340`
MD5	`4e944beab5b9db594bd9c28a95241bfc`
BLAKE2b-256	`b379198349649027c7896d700adca3e0eeb9977e5ce85128ee61e67231747150`

File details

Details for the file pdftoppt-1.0.0-py3-none-any.whl.

File metadata

Download URL: pdftoppt-1.0.0-py3-none-any.whl
Upload date: Sep 6, 2025
Size: 12.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.9

File hashes

Hashes for pdftoppt-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d9745da32ba08a8a1be0056f2e57f00d32435561de55a78b25a0f66e95d41816`
MD5	`25e8b69420446439c01455bcaebfcb3c`
BLAKE2b-256	`eb247b9994e5edc5d394fcd4be8c91949c3b990febfe407d345b62be52c9f00a`
