A lightweight GUI tool to visualize and extract images from PDF files
PDF Image Extractor
A lightweight Python application with a GUI to visualize and extract images from PDF files. The app displays PDF pages with red bounding boxes around detected images, allowing you to click on any box to save that image.
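The click-to-save behaviour comes down to mapping image rectangles from PDF coordinates to screen pixels and checking which rectangle contains the click. Below is a minimal sketch of that idea, not the app's actual code. The `Box`, `to_screen`, and `hit_test` names are hypothetical, and it assumes the page is drawn at a uniform zoom factor with its top-left corner at the origin.

```python
from typing import NamedTuple, Optional, Sequence


class Box(NamedTuple):
    """Axis-aligned rectangle: (x0, y0) top-left, (x1, y1) bottom-right."""
    x0: float
    y0: float
    x1: float
    y1: float


def to_screen(box: Box, zoom: float) -> Box:
    """Scale a box from PDF points to on-screen pixels at the given zoom."""
    return Box(box.x0 * zoom, box.y0 * zoom, box.x1 * zoom, box.y1 * zoom)


def hit_test(boxes: Sequence[Box], zoom: float, x: float, y: float) -> Optional[int]:
    """Return the index of the first box containing the click, or None."""
    for index, box in enumerate(boxes):
        scaled = to_screen(box, zoom)
        if scaled.x0 <= x <= scaled.x1 and scaled.y0 <= y <= scaled.y1:
            return index
    return None


# Two hypothetical image rectangles on one page, in PDF points.
page_boxes = [Box(72, 72, 272, 222), Box(300, 400, 500, 600)]
print(hit_test(page_boxes, 1.5, 200, 200))  # click inside the first box
print(hit_test(page_boxes, 1.5, 10, 10))    # click on the page margin
```

Keeping the rectangles in PDF points and scaling them only when drawing or hit-testing means the same list stays valid at any zoom level.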
Features
- 📄 Visual PDF Navigation: Browse through PDF pages with Previous/Next buttons or jump to specific pages
- 🎯 Popup Image Preview: Hover over image descriptions to see a popup with the actual extracted image
- 🖱️ Click to Save: Click on image descriptions in the list to extract and save
- 📑 Smart Navigation:
  - Outline Tab: PDF table of contents with hierarchical navigation
  - Thumbnails Tab: Visual page browser with clickable thumbnails
- 🔍 Zoom Controls: Zoom in/out with +/- buttons or fit page to window
- 📊 Image Info Panel: Interactive list showing all images on the current page
- 💾 Batch Extract: Extract all images from the current page at once
- 🚀 Pure Python: No external system dependencies required
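The hierarchical Outline tab needs a tree, while a PDF table of contents is commonly available as a flat list. PyMuPDF's `Document.get_toc()`, for instance, returns entries of the form `[level, title, page]`. The sketch below shows one way to nest such entries. Whether the app does it this way is an assumption, and `OutlineNode` and `build_outline` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple


@dataclass
class OutlineNode:
    title: str
    page: int
    children: List["OutlineNode"] = field(default_factory=list)


def build_outline(toc: Sequence[Tuple[int, str, int]]) -> List[OutlineNode]:
    """Nest flat (level, title, page) entries; level 1 is the top level."""
    roots: List[OutlineNode] = []
    stack: List[Tuple[int, OutlineNode]] = []  # open ancestors, shallowest first
    for level, title, page in toc:
        node = OutlineNode(title, page)
        # Close every open entry that cannot be an ancestor of this one.
        while stack and stack[-1][0] >= level:
            stack.pop()
        if stack:
            stack[-1][1].children.append(node)
        else:
            roots.append(node)
        stack.append((level, node))
    return roots


# Made-up entries in the (level, title, page) shape described above.
sample_toc = [
    (1, "Introduction", 1),
    (2, "Background", 2),
    (2, "Scope", 4),
    (1, "Results", 7),
]
outline = build_outline(sample_toc)
for top in outline:
    print(top.title, [child.title for child in top.children])
```

The stack holds only the current chain of ancestors, so the list is processed in a single pass.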
Installation
From Source
Clone or download this repository, then install:
pip install -e .
From PyPI
pip install extract-pdf-images
That's it! No system dependencies needed - PyMuPDF includes everything required.
Usage
After installation, run from anywhere using the command:
pdf-image-extractor
Or open a PDF directly from the command line:
pdf-image-extractor /path/to/document.pdf
Or run directly from the source directory:
python -m pdf_image_extractor.app [optional-pdf-file]
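An optional PDF argument like the one above can be handled with the standard library's `argparse`. This is a sketch of one possible entry point rather than the package's actual code, and `parse_args` is a hypothetical helper.

```python
import argparse
from pathlib import Path
from typing import Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the command line; the PDF path is optional."""
    parser = argparse.ArgumentParser(
        prog="pdf-image-extractor",
        description="Visualize and extract images from PDF files.",
    )
    parser.add_argument(
        "pdf",
        nargs="?",      # zero or one positional argument
        type=Path,
        default=None,   # no file given: start with an empty window
        help="PDF file to open on startup",
    )
    return parser.parse_args(argv)


print(parse_args([]).pdf)
print(parse_args(["/path/to/document.pdf"]).pdf)
```

With `nargs="?"` the same parser covers both invocations: the attribute is `None` when no file is given and a `Path` otherwise.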
Quick Start Guide
- Open a PDF: Click the "Open PDF" button or pass a file path as an argument
- Navigate:
  - Use the Previous/Next (◀ ▶) buttons or type a page number
  - Click on entries in the Outline tab for TOC navigation
  - Switch to the Thumbnails tab for visual page browsing
- View Images: Check the right panel for a list of all images on the current page
- Preview: Hover your mouse over any image in the list to see a popup preview
- Extract Image: Click on an image in the list to save it
- Zoom: Use +/- buttons to zoom, or click "Fit" to fit page to window (auto-fits on open)
- Batch Extract: Click "Extract All" to save all images from the current page
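For readers curious how a batch extract of one page might look with PyMuPDF, here is a minimal sketch. It uses the library's `Page.get_images()` and `Document.extract_image()` calls. The function names and the file naming scheme are hypothetical, and the app's real implementation may differ.

```python
from pathlib import Path
from typing import List


def image_filename(page_number: int, index: int, ext: str) -> str:
    """Build a name such as 'page003_img01.png' (a hypothetical scheme)."""
    return f"page{page_number:03d}_img{index:02d}.{ext}"


def extract_page_images(pdf_path: str, page_index: int, out_dir: str) -> List[Path]:
    """Save every image referenced by one page and return the written paths."""
    import fitz  # PyMuPDF; imported here so image_filename works without it

    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    with fitz.open(pdf_path) as doc:
        page = doc[page_index]  # zero-based page index
        for index, info in enumerate(page.get_images(full=True), start=1):
            xref = info[0]  # first field is the image's cross-reference number
            data = doc.extract_image(xref)  # dict with "image" bytes and "ext"
            path = target / image_filename(page_index + 1, index, data["ext"])
            path.write_bytes(data["image"])
            written.append(path)
    return written


print(image_filename(3, 1, "png"))
```

A call such as `extract_page_images("document.pdf", 0, "extracted")` would write the first page's images into an `extracted` folder, using whatever file extension PyMuPDF reports for each image.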
Development
Install in Development Mode
pip install -e ".[dev]"
License
MIT License - Feel free to use and modify as needed.