Use SAM2 to extract and warp a page from a photo.

Project description

Page Extractor

Extracts a page from a photo, and warps it to a rectangular image using skimage.

Features

  • Removes margin clutter from photos of pages, which usually improves downstream document processing.
  • GroundingDINO detection model integration.
  • SAM 2.1
  • Customizable text prompt for the detector, e.g. "receipt." or "invoice."

Getting Started

Prerequisites

  • Python 3.10 or higher

Installation

Installing PyTorch Dependencies

Before installing pageextractor, install PyTorch:

pip install torch==2.4.1 torchvision==0.19.1 --extra-index-url https://download.pytorch.org/whl/cu124
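
Before moving on, it can help to confirm that the PyTorch build you installed actually sees your GPU. The following is a minimal sketch, not part of pageextractor; the helper name describe_torch is hypothetical, and it only uses standard PyTorch attributes.

```python
import importlib.util


def describe_torch() -> str:
    """Return a one-line summary of the local PyTorch install (hypothetical helper)."""
    # Avoid an ImportError if the install step above has not been run yet.
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    # torch.version.cuda is None for CPU-only builds.
    return (
        f"torch {torch.__version__}, built for CUDA {torch.version.cuda}, "
        f"GPU available: {torch.cuda.is_available()}"
    )


print(describe_torch())
```

If the summary reports that no GPU is available, the device='cuda' setting used in the usage example further down will most likely fail.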

Installation options

pip install -U git+https://github.com/UG-Team-Data-Science/pageextractor.git

Or:

git clone https://github.com/UG-Team-Data-Science/pageextractor && cd pageextractor
pip install -e .

Usage

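from PIL import Image
from matplotlib import pyplot as plt

from pageextractor import PageExtractor

# Load the photo and the model (weights run on the GPU here).
img = Image.open('example.png')
model = PageExtractor(sam_type='sam2.1_hiera_tiny', device='cuda')

# Returns the page mask, the page's four corner points, and the warped crop.
mask, polygon, cropped = model.extract_page(img)

# Left: input photo. Middle: mask and outline overlay. Right: extracted page.
_, (ax0, ax1, ax2) = plt.subplots(1, 3, figsize=(30, 15))
ax0.imshow(img)

ax1.imshow(img)
# Repeat the first corner at the end to draw a closed outline.
ax1.plot(*polygon[[0,1,2,3,0]].T, 'r:')
# Tint everything outside the mask blue; the page itself stays clear.
ax1.imshow(1-mask, cmap='Blues', alpha=0.8 - 0.8*mask)

ax2.imshow(cropped)
plt.show()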
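
To illustrate the warping step mentioned in the description, here is a minimal sketch of mapping a four-corner polygon onto an upright rectangle with skimage. It is not necessarily how pageextractor does it internally: the function name warp_quad_to_rect, the corner ordering, and the explicit output size are all assumptions made for this example.

```python
import numpy as np
from skimage import transform


def warp_quad_to_rect(image, corners, out_width, out_height):
    """Warp the quadrilateral `corners` of `image` onto an upright rectangle.

    `corners` is a (4, 2) array of (x, y) points ordered top-left, top-right,
    bottom-right, bottom-left. That ordering is an assumption of this sketch.
    """
    src = np.asarray(corners, dtype=float)
    dst = np.array(
        [
            [0, 0],
            [out_width - 1, 0],
            [out_width - 1, out_height - 1],
            [0, out_height - 1],
        ],
        dtype=float,
    )
    # skimage.transform.warp needs a map from OUTPUT coordinates to INPUT
    # coordinates, so the transform is estimated from rectangle to quadrilateral.
    tform = transform.estimate_transform("projective", dst, src)
    # output_shape is (rows, cols); the result is a float image.
    return transform.warp(image, tform, output_shape=(out_height, out_width))
```

With the polygon from the usage example, a call could look like warp_quad_to_rect(np.asarray(img), polygon, 800, 1100), provided the corners are in the order assumed above.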

Acknowledgments

This project is based on, or uses, the following repositories:

Download files

Source Distribution

pageextractor-0.1.3.tar.gz (8.6 kB view details)

Uploaded Source

Built Distribution

pageextractor-0.1.3-py3-none-any.whl (9.3 kB view details)

Uploaded Python 3

File details

Details for the file pageextractor-0.1.3.tar.gz.

File metadata

  • Download URL: pageextractor-0.1.3.tar.gz
  • Upload date:
  • Size: 8.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.7.13

File hashes

Hashes for pageextractor-0.1.3.tar.gz

Algorithm    Hash digest
SHA256       60e6882edfa54794b77a6272099d149c2d3946e5eb41272f8d6290e65c3ab11c
MD5          072e3b211406aa00d7926642bc164aa7
BLAKE2b-256  022fb8d5ee9c8515f7111a0e45a81a18530b48d3d53c2860c0b10a5bd5ecc878
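
A downloaded file can be checked against the SHA256 digest listed above using only the Python standard library. This is a minimal sketch: the helper names sha256_of and matches_digest are hypothetical, and the example assumes the archive was saved to the current directory under its original file name.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()


# Digest copied from the listing above; the local path is an assumption.
EXPECTED = "60e6882edfa54794b77a6272099d149c2d3946e5eb41272f8d6290e65c3ab11c"
archive = Path("pageextractor-0.1.3.tar.gz")
if archive.exists():
    print("digest matches:", matches_digest(archive, EXPECTED))
```

A mismatch means the file differs from the one that was uploaded, so it should be downloaded again rather than installed.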

File details

Details for the file pageextractor-0.1.3-py3-none-any.whl.

File hashes

Hashes for pageextractor-0.1.3-py3-none-any.whl

Algorithm    Hash digest
SHA256       724ed46aa28c2ca91fbd652c2a7bf6a9b5987b3a749407a156c55acb695eb191
MD5          6b016f11217a7d6330b1d5f98938731e
BLAKE2b-256  06ee2a1d9c85639dbe564090c23963bae60fa0ee15a3eb4e6d753a1b8efdd862
