A package for managing PDF pages, images, and their conversions in NLP, CV, and other tasks, replacing repetitive code blocks with simple function calls.

These details have not been verified by PyPI

Project links

Homepage

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

PdfSnipper

PdfSnipper helps manage PDF pages, images, and their conversions during NLP, CV, or other tasks. It replaces repetitive code blocks with a single function call per operation.

Installation

To install PdfSnipper, use:

`pip install -i https://test.pypi.org/simple/ pdf-snip`

Dependencies

If you face an error involving `poppler-utils`, install it for your platform.

For Google Colab:
```
!apt-get install -y poppler-utils
```
For Ubuntu/Debian:
```
sudo apt install poppler-utils
```
For Windows:
Download the latest Poppler release for Windows. After installing it in /ProgramFiles, add its `bin` directory to the PATH environment variable:
```
import os
os.environ['PATH'] += os.pathsep + r'C:\path\to\poppler\bin'
```
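
Before calling the image-related functions, it can help to confirm that the Poppler command-line tools are reachable. The sketch below is not part of PdfSnipper: `poppler_available` is a hypothetical helper, and it assumes that the `pdftoppm` executable (shipped with poppler-utils) is the binary that must be discoverable on PATH.

```python
import os
import shutil


def poppler_available(extra_dir=None):
    """Return True if the `pdftoppm` executable can be located.

    `extra_dir` is an optional directory (for example a Poppler `bin`
    folder on Windows) that is searched in addition to PATH.
    This helper is illustrative and not part of PdfSnipper.
    """
    search_path = os.environ.get("PATH", "")
    if extra_dir:
        search_path += os.pathsep + extra_dir
    # shutil.which returns the full path of the executable, or None.
    return shutil.which("pdftoppm", path=search_path) is not None


if __name__ == "__main__":
    print("Poppler found:", poppler_available())
```

If this prints `False`, the install commands above (or the PATH adjustment on Windows) are the first thing to check.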

Features

1. Remove First N Pages

Removes the first N pages from all PDFs in a folder.

`remove_first_pages(input_folder: str, output_folder: str, pages_to_remove: int)`

Arguments

- `input_folder`: Path to the folder containing PDFs.
- `output_folder`: Path to save modified PDFs.
- `pages_to_remove`: Number of pages to remove from the start.

Usage

from PDFSNIPPER import remove_first_pages
remove_first_pages('/content/input', '/content/output', 2)

2. Remove Last N Pages

Removes the last N pages from all PDFs in a folder.

`remove_last_pages(input_folder: str, output_folder: str, pages_to_remove: int)`

Arguments

- `input_folder`: Path to the folder containing PDFs.
- `output_folder`: Path to save modified PDFs.
- `pages_to_remove`: Number of pages to remove from the end.

Usage

from PDFSNIPPER import remove_last_pages
remove_last_pages('/content/input', '/content/output', 3)
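
To make the effect of `pages_to_remove` concrete for `remove_first_pages` and `remove_last_pages`, the following sketch computes which 0-indexed pages would remain after trimming from either end. It is not PdfSnipper's implementation: the helper names are hypothetical, and clamping N to the page count (so an oversized N leaves no pages) is an assumption, since that case is not documented here.

```python
def pages_after_removing_first(total_pages, pages_to_remove):
    """0-indexed pages left after dropping the first N pages (illustrative)."""
    # Clamp N into [0, total_pages] so odd inputs cannot produce bad indices.
    n = max(0, min(pages_to_remove, total_pages))
    return list(range(n, total_pages))


def pages_after_removing_last(total_pages, pages_to_remove):
    """0-indexed pages left after dropping the last N pages (illustrative)."""
    n = max(0, min(pages_to_remove, total_pages))
    return list(range(total_pages - n))


if __name__ == "__main__":
    print(pages_after_removing_first(5, 2))  # [2, 3, 4]
    print(pages_after_removing_last(5, 3))   # [0, 1]
```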

3. Remove Pages Outside a Specified Range

Keeps only the pages within a specified range [start_page, end_page] inclusive, removing all others.

`remove_pages_outside_range(input_folder: str, output_folder: str, start_page: int, end_page: int)`

Arguments

- `input_folder`: Path to the folder containing PDFs.
- `output_folder`: Path to save modified PDFs.
- `start_page`: First page to keep (0-indexed).
- `end_page`: Last page to keep (0-indexed).

Usage

from PDFSNIPPER import remove_pages_outside_range
remove_pages_outside_range('/content/input', '/content/output', 2, 5)
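
Because the range is inclusive and 0-indexed, the call above keeps four pages (indices 2 through 5). The sketch below spells that out. `pages_in_range` is a hypothetical helper, not PdfSnipper code, and truncating the range at the end of a shorter document is an assumption rather than documented behaviour.

```python
def pages_in_range(total_pages, start_page, end_page):
    """0-indexed pages kept for an inclusive [start_page, end_page] range."""
    first = max(start_page, 0)
    # Assumption: a range running past the last page is simply truncated.
    last = min(end_page, total_pages - 1)
    return list(range(first, last + 1))


if __name__ == "__main__":
    print(pages_in_range(10, 2, 5))  # [2, 3, 4, 5]
    print(pages_in_range(4, 2, 5))   # [2, 3]
```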

4. Save Specific Pages

Saves only specific pages from PDFs into a new folder.

`save_specific_pages(input_folder: str, output_folder: str, pages_to_save: list)`

Arguments

- `input_folder`: Path to the folder containing PDFs.
- `output_folder`: Path to save modified PDFs.
- `pages_to_save`: List of page numbers (0-indexed) to keep.

Usage

from PDFSNIPPER import save_specific_pages
save_specific_pages('/content/input', '/content/output', [0, 2, 3])
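
Since `pages_to_save` is 0-indexed, page numbers read from a PDF viewer (which usually start at 1) need shifting by one. The helper below is a hypothetical convenience, not part of PdfSnipper. Dropping duplicates and out-of-range values is an assumption about what a caller would want, and `total_pages` must be supplied by the caller.

```python
def to_zero_indexed(human_pages, total_pages):
    """Convert 1-indexed page numbers to sorted, unique 0-indexed ones.

    Values outside 1..total_pages are silently dropped (an assumption).
    """
    return sorted({p - 1 for p in human_pages if 1 <= p <= total_pages})


if __name__ == "__main__":
    print(to_zero_indexed([1, 3, 4], 10))      # [0, 2, 3]
    print(to_zero_indexed([3, 1, 3, 12], 10))  # [0, 2]
```

The resulting list can be passed as `pages_to_save`.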

5. Save Pages as Images

Saves specific pages as PNG images in a new folder.

`save_pages_as_images(input_folder: str, output_folder: str, pages_to_save: list)`

Arguments

- `input_folder`: Path to the folder containing PDFs.
- `output_folder`: Path to save PNG images.
- `pages_to_save`: List of page numbers (0-indexed) to save as images.

Usage

from PDFSNIPPER import save_pages_as_images
save_pages_as_images('/content/input', '/content/output', [0, 2, 4])

6. Split PDF

Splits each PDF in a folder into individual single-page PDF files.

`split_pdf(input_folder: str, output_folder: str)`

Arguments

- `input_folder`: Path to the folder containing PDFs.
- `output_folder`: Path to save split PDFs.

Usage

from PDFSNIPPER import split_pdf
split_pdf('/content/input', '/content/output')

Release history

- 0.0.3 (this version): Feb 3, 2025
- 0.0.2: Jan 15, 2025

Download files

Source Distribution

pdf_snip-0.0.3.tar.gz (5.1 kB), uploaded Feb 3, 2025, Source

Built Distribution

pdf_snip-0.0.3-py3-none-any.whl (5.3 kB), uploaded Feb 3, 2025, Python 3

File details

Details for the file pdf_snip-0.0.3.tar.gz.

File metadata

Download URL: pdf_snip-0.0.3.tar.gz
Upload date: Feb 3, 2025
Size: 5.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.1

File hashes

Hashes for pdf_snip-0.0.3.tar.gz
Algorithm	Hash digest
SHA256	`f3ba665b7b10c50196060bf104614fad0f9f46eb70f021434b5b7efe64187e9c`
MD5	`0b710782dc671d404b1b668614467bdb`
BLAKE2b-256	`87bba6210234efd6b6bd285c0be95b05fe7eea6d559ad565d3228fbc36266b54`

File details

Details for the file pdf_snip-0.0.3-py3-none-any.whl.

File metadata

Download URL: pdf_snip-0.0.3-py3-none-any.whl
Upload date: Feb 3, 2025
Size: 5.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.1

File hashes

Hashes for pdf_snip-0.0.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`474f6cd44574b9369c2893c1c02720d2faf23d94368fc8f72de66a0599f58b38`
MD5	`dee389d370f80b010a7345b583af60cd`
BLAKE2b-256	`f1b494f79e7983902579b1e8e2ea8403d850080526ed0cff862cdc2ebecb63ba`
