pdf2slides

Convert a PDF file into editable slides in .pptx format

Project Description

pdf2slides is an open-source Python library that converts PDF files into editable slides in the .pptx format, in as few as three lines of code. It supports machine-generated PDF files with selectable text as well as scanned PDF files, although support for scanned files is currently experimental. Because conversion is a single method call, the library is easy to integrate into applications or scripts that automate the creation of PowerPoint presentations from PDF documents.

Installation

To install pdf2slides, use pip:

pip install pdf2slides
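
To confirm that the install succeeded, the installed version can be read from the package metadata. This is a minimal sketch that uses only the Python standard library; the helper name installed_version is illustrative and is not part of pdf2slides.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed version string, or None if pip did not install the package.
print(installed_version("pdf2slides"))
```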

Usage

from pdf2slides import Converter

# Create an instance of Converter
converter = Converter()

# Convert a PDF file to slides
converter.convert('input_file.pdf', 'output_file.pptx')
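
The same convert call can be applied to a whole folder of PDF files. The sketch below is an illustration, not part of the library: convert_folder is a hypothetical helper, and it only assumes that the object passed in has the convert(input_path, output_path) method shown above. Paths are passed as strings because the source only shows string arguments.

```python
from pathlib import Path


def convert_folder(converter, src_dir, dst_dir):
    """Convert every .pdf in src_dir to a .pptx of the same name in dst_dir.

    `converter` is any object with a convert(input_path, output_path) method,
    such as a pdf2slides Converter instance.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    # Sort so that files are processed in a predictable order.
    for pdf in sorted(src.glob("*.pdf")):
        out = dst / (pdf.stem + ".pptx")
        converter.convert(str(pdf), str(out))
        written.append(out)
    return written
```

With pdf2slides installed, a call such as convert_folder(Converter(), 'pdfs', 'slides') would write one .pptx per PDF found in the pdfs folder.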

Advanced Usage

You can customize the Converter instance with various parameters:

Specifying a Default Font: Use this parameter to set a font for OCR'ed text. The font must be installed on your system.

from pdf2slides import Converter

converter = Converter(default_font='Arial')
converter.convert('input_file.pdf', 'output_file_with_arial_font.pptx')

Enabling OCR Mode: Enable OCR for converting scanned PDFs. This feature is experimental and is not recommended for use with PDF files known to have selectable text.

from pdf2slides import Converter

converter = Converter(enable_ocr=True)
converter.convert('scanned_file.pdf', 'output_file.pptx')
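
Since OCR mode is not recommended for files that already have selectable text, it can help to decide per file whether to enable it. The following is a rough heuristic sketch, not something pdf2slides provides: needs_ocr is a hypothetical helper, and the threshold of 20 characters per page is an arbitrary assumption to tune for your documents.

```python
def needs_ocr(page_texts, min_chars_per_page=20):
    """Guess whether a PDF is scanned, given the text extracted from each page.

    Returns True when the average number of non-whitespace characters per
    page is below min_chars_per_page, or when there are no pages at all.
    """
    texts = list(page_texts)
    if not texts:
        return True
    # Count characters after removing all whitespace; treat None as empty.
    total = sum(len("".join((text or "").split())) for text in texts)
    return total / len(texts) < min_chars_per_page
```

The page texts have to come from a separate PDF library. With pypdf, for example, they could be collected as [page.extract_text() for page in PdfReader(path).pages], and the result passed on as Converter(enable_ocr=needs_ocr(texts)).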

Enforcing Default Font: Ensure the output slides use the default font, even when the input file is not scanned.

from pdf2slides import Converter

converter = Converter(default_font='Arial', enforce_default_font=True)
converter.convert('not_scanned_file.pdf', 'output_file_with_arial_font.pptx')

Manually Setting Image Retention Level: Adjust the threshold for keeping pure-text images when OCR is enabled.

from pdf2slides import Converter

# Lower the image retention level if non-editable text remains in the output as images.
converter = Converter(enable_ocr=True, image_retention_level=0.3)
converter.convert('scanned_file.pdf', 'output_file.pptx')
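
Finding a suitable image retention level may take some trial and error. One way to compare settings is to convert the same file once per candidate value and inspect the outputs side by side. In this sketch, sweep_retention_levels and the make_converter callback are hypothetical names, and the candidate levels are assumptions, since the valid range of image_retention_level is not stated here.

```python
from pathlib import Path


def sweep_retention_levels(make_converter, pdf_path, out_dir, levels=(0.1, 0.3, 0.5)):
    """Convert one PDF once per retention level and return {level: output_path}.

    `make_converter(level)` must return an object with a
    convert(input_path, output_path) method.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    results = {}
    for level in levels:
        converter = make_converter(level)
        # Encode the level in the file name so outputs are easy to compare.
        target = out / f"{Path(pdf_path).stem}_retention_{level}.pptx"
        converter.convert(str(pdf_path), str(target))
        results[level] = target
    return results
```

With pdf2slides installed, the callback could be lambda level: Converter(enable_ocr=True, image_retention_level=level).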

Multilingual Support: Specify the language for OCR processing with the lang parameter. The default setting supports English and Chinese. For a list of supported languages, see the PaddleOCR Multi-language Model documentation.

from pdf2slides import Converter

converter = Converter(enable_ocr=True, lang='fr')
converter.convert('scanned_file_in_french.pdf', 'output_file.pptx')
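
When a batch contains scanned files in several languages, one converter per language can be created and reused across files. This is a sketch under assumptions: convert_by_language and make_converter are hypothetical names, and reusing a converter is presumed (not confirmed) to avoid repeated OCR setup work.

```python
def convert_by_language(make_converter, jobs):
    """Run conversion jobs, building one converter per distinct language.

    `jobs` is an iterable of (input_pdf, output_pptx, lang) tuples, and
    `make_converter(lang)` must return an object with a
    convert(input_path, output_path) method.
    Returns the sorted list of language codes that were used.
    """
    converters = {}
    for input_pdf, output_pptx, lang in jobs:
        # Create the converter for a language only the first time it is needed.
        if lang not in converters:
            converters[lang] = make_converter(lang)
        converters[lang].convert(input_pdf, output_pptx)
    return sorted(converters)
```

With pdf2slides installed, the callback could be lambda lang: Converter(enable_ocr=True, lang=lang), and a job list could look like [('a.pdf', 'a.pptx', 'fr'), ('b.pdf', 'b.pptx', 'en')].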

License

This project is licensed under the GNU General Public License v3.0. See the LICENSE file for details.

Release history

This version: 0.1.0, released Aug 17, 2024

Download files

Source Distribution

pdf2slides-0.1.0.tar.gz (22.5 kB)

Uploaded Aug 17, 2024 Source

Built Distribution

pdf2slides-0.1.0-py3-none-any.whl (21.9 kB)

Uploaded Aug 17, 2024 Python 3

File details

Details for the file pdf2slides-0.1.0.tar.gz.

File metadata

Download URL: pdf2slides-0.1.0.tar.gz
Upload date: Aug 17, 2024
Size: 22.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.11.8

File hashes

Hashes for pdf2slides-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`6308ef8017076f73b22028b2da9a820853cf84601d750d1f3b4e843407a0ae0c`
MD5	`23da720fbb5c807823d616e3337c794a`
BLAKE2b-256	`ff988d782a374f8b236f2b89bfece8afbacf5ceb4b4ac1b3c28b32e5527667a9`


File details

Details for the file pdf2slides-0.1.0-py3-none-any.whl.

File metadata

Download URL: pdf2slides-0.1.0-py3-none-any.whl
Upload date: Aug 17, 2024
Size: 21.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.11.8

File hashes

Hashes for pdf2slides-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2bb52778808534eb7e0df3b134e75188b28e4ef1c8cb67f953957326f8462bd0`
MD5	`f6fc38329fef23c68d79c987ddf9452f`
BLAKE2b-256	`fb25b7646ec35ec0dd6c4c3cf2e548073fef8c8139c411af3addee125ecacbb1`

