Page dewarping and thresholding using a cubic sheet model.

page-dewarp

Document image dewarping library using a cubic sheet model.

Python 3 library for page dewarping and thresholding, available on PyPI.

A managed web version, Page Dewarp Web, is also available for use from any device with a browser, with an API for programmatic web access.

Installation

To install from PyPI, optionally using uv (recommended), run:

uv pip install page-dewarp

JAX

To install with JAX autodiff, which makes optimisation ~11x faster on single images and ~33x faster on batches (CPU only), add the jax extra:

uv pip install "page-dewarp[jax]"

GPU support

To install with support for GPU execution instead of only CPU, choose one of:

uv pip install "page-dewarp[jax-cuda12]" # CUDA 12
uv pip install "page-dewarp[jax-cuda13]" # CUDA 13 (requires Python 3.11+)

Note: CPU execution is the default DEVICE and can be faster than GPU for this workload, though this depends on the relative capability of your CPU and GPU (cores, RAM, VRAM, etc.).
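
Before choosing an extra, it can help to check what JAX actually sees on your machine. The following is a minimal sketch and not part of page-dewarp: describe_jax_backend is a hypothetical helper name, and it relies only on the standard library plus JAX's own jax.devices() call.

```python
import importlib.util


def describe_jax_backend() -> str:
    """Report whether JAX is importable and which platforms it exposes."""
    # find_spec returns None when the package is not installed
    if importlib.util.find_spec("jax") is None:
        return "jax not installed"

    import jax  # imported lazily so the check also works without JAX

    # Each device object carries a platform name such as "cpu" or "gpu"
    platforms = sorted({device.platform for device in jax.devices()})
    return "jax devices: " + ", ".join(platforms)


if __name__ == "__main__":
    print(describe_jax_backend())
```

If only cpu is reported after installing a CUDA extra, the GPU build of JAX is probably not being picked up in the active environment.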

Serial vs Batch

When the JAX backend is available, multiple input images are processed in batch mode by default. Benchmark on 40 images (via #139):

Device   Serial   Batch   Speedup
CPU      36s      8.7s    4.1x
GPU      53s      11.2s   4.7x
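
The speedup column is the serial time divided by the batch time. As a quick check of the figures above, this short sketch restates the table timings and recomputes the ratios; it does not call page-dewarp.

```python
# Timings in seconds as (serial, batch), copied from the benchmark table
timings = {"CPU": (36.0, 8.7), "GPU": (53.0, 11.2)}


def speedup(serial_s: float, batch_s: float) -> float:
    """Speedup of batch processing over serial processing."""
    return serial_s / batch_s


for device, (serial_s, batch_s) in timings.items():
    print(f"{device}: {speedup(serial_s, batch_s):.1f}x")
```

In this benchmark the CPU batch run (8.7s) also beats the GPU batch run (11.2s) in absolute terms, even though the GPU shows the larger relative speedup.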

Dependencies

Python 3.10+ and NumPy, SciPy, SymPy, Matplotlib, OpenCV, and msgspec are required to run page-dewarp.

Documentation

See the docs site for full details, including a How It Works walkthrough of the algorithm and a configuration reference.

Usage

usage: page-dewarp [-h] [-d {0,1,2,3}] [-dd {file,screen,both}]
                   [-o OUTPUT_DIR] [-f OUTPUT_FORMAT] [-j OUTPUT_JSON]
                   [-it OPT_MAX_ITER] [-m OPT_METHOD] [-dev DEVICE]
                   [-b USE_BATCH] [-vw SCREEN_MAX_W] [-vh SCREEN_MAX_H]
                   [-x PAGE_MARGIN_X] [-y PAGE_MARGIN_Y] [-tw TEXT_MIN_WIDTH]
                   [-th TEXT_MIN_HEIGHT] [-ta TEXT_MIN_ASPECT]
                   [-tk TEXT_MAX_THICKNESS] [-tm TEXT_MORPH_OPS]
                   [-lm LINE_MORPH_OPS] [-wz ADAPTIVE_WINSZ] [-ri RVEC_IDX]
                   [-ti TVEC_IDX] [-ci CUBIC_IDX] [-sw SPAN_MIN_WIDTH]
                   [-sp SPAN_PX_PER_STEP] [-eo EDGE_MAX_OVERLAP]
                   [-el EDGE_MAX_LENGTH] [-ec EDGE_ANGLE_COST]
                   [-ea EDGE_MAX_ANGLE] [-fl FOCAL_LENGTH] [-z OUTPUT_ZOOM]
                   [-dpi OUTPUT_DPI] [-nb NO_BINARY] [-sh SHEAR_COST]
                   [-mc MAX_CORR] [-s REMAP_DECIMATE]
                   IMAGE_FILE_OR_FILES [IMAGE_FILE_OR_FILES ...]

positional arguments:
  IMAGE_FILE_OR_FILES   One or more images to process

options:
  -h, --help            show this help message and exit
  -d, --debug-level {0,1,2,3}
                        (type: int, default: 0)
  -dd, --debug-dest {file,screen,both}
                        (type: str, default: file)
  -o, --output-dir OUTPUT_DIR
                        Directory for output and debug images (type: str,
                        default: .)
  -f, --output-format OUTPUT_FORMAT
                        Output image format (e.g. png, tiff, bmp, jpeg) (type:
                        str, default: png)
  -j, --json OUTPUT_JSON
                        Write JSON sidecar with dewarp parameters (type: int,
                        default: 0)
  -it, --max-iter OPT_MAX_ITER
                        Maximum optimisation iterations (type: int, default:
                        600000)
  -m, --method OPT_METHOD
                        Name of the JAX/SciPy optimisation method to use.
                        (type: str, default: auto)
  -dev, --device DEVICE
                        Compute device to select for optimisation. (type: str,
                        default: auto)
  -b, --batch USE_BATCH
                        Whether to batch process images (JAX backend only).
                        (type: str, default: auto)
  -vw, --max-screen-width SCREEN_MAX_W
                        Viewing screen max width (for resizing to screen)
                        (type: int, default: 1280)
  -vh, --max-screen-height SCREEN_MAX_H
                        Viewing screen max height (for resizing to screen)
                        (type: int, default: 700)
  -x, --x-margin PAGE_MARGIN_X
                        Reduced px to ignore near L/R edge (type: int,
                        default: 50)
  -y, --y-margin PAGE_MARGIN_Y
                        Reduced px to ignore near T/B edge (type: int,
                        default: 20)
  -tw, --min-text-width TEXT_MIN_WIDTH
                        Min reduced px width of detected text contour (type:
                        int, default: 15)
  -th, --min-text-height TEXT_MIN_HEIGHT
                        Min reduced px height of detected text contour (type:
                        int, default: 2)
  -ta, --min-text-aspect TEXT_MIN_ASPECT
                        Filter out text contours below this w/h ratio (type:
                        float, default: 1.5)
  -tk, --max-text-thickness TEXT_MAX_THICKNESS
                        Max reduced px thickness of detected text contour
                        (type: int, default: 10)
  -tm, --text-morph TEXT_MORPH_OPS
                        Morphological ops for text mask (e.g. d_9_1,e_1_3)
                        (type: str, default: d_9_1,e_1_3)
  -lm, --line-morph LINE_MORPH_OPS
                        Morphological ops for line mask (e.g. e_3_1_3,d_8_2)
                        (type: str, default: e_3_1_3,d_8_2)
  -wz, --adaptive-winsz ADAPTIVE_WINSZ
                        Window size for adaptive threshold in reduced px
                        (type: int, default: 55)
  -ri, --rotation-vec-param-idx RVEC_IDX
                        Index of rvec in params vector (slice: pair of values)
                        (type: tuple[int, int], default: (0, 3))
  -ti, --translation-vec-param-idx TVEC_IDX
                        Index of tvec in params vector (slice: pair of values)
                        (type: tuple[int, int], default: (3, 6))
  -ci, --cubic-slope-param-idx CUBIC_IDX
                        Index of cubic slopes in params vector (slice: pair of
                        values) (type: tuple[int, int], default: (6, 8))
  -sw, --min-span-width SPAN_MIN_WIDTH
                        Minimum reduced px width for span (type: int, default:
                        30)
  -sp, --span-spacing SPAN_PX_PER_STEP
                        Reduced px spacing for sampling along spans (type:
                        int, default: 20)
  -eo, --max-edge-overlap EDGE_MAX_OVERLAP
                        Max reduced px horiz. overlap of contours in span
                        (type: float, default: 1.0)
  -el, --max-edge-length EDGE_MAX_LENGTH
                        Max reduced px length of edge connecting contours
                        (type: float, default: 100.0)
  -ec, --edge-angle-cost EDGE_ANGLE_COST
                        Cost of angles in edges (tradeoff vs. length) (type:
                        float, default: 10.0)
  -ea, --max-edge-angle EDGE_MAX_ANGLE
                        Maximum change in angle allowed between contours
                        (type: float, default: 7.5)
  -fl, --focal-length FOCAL_LENGTH
                        Normalized focal length of camera (type: float,
                        default: 1.2)
  -z, --output-zoom OUTPUT_ZOOM
                        How much to zoom output relative to *original* image
                        (type: float, default: 1.0)
  -dpi, --output-dpi OUTPUT_DPI
                        Just affects stated DPI of PNG, not appearance (type:
                        int, default: 300)
  -nb, --no-binary NO_BINARY
                        Disable output conversion to binary thresholded image
                        (type: int, default: 0)
  -sh, --shear-cost SHEAR_COST
                        Penalty against camera tilt (shear distortion). (type:
                        float, default: 0.0)
  -mc, --max-corrections MAX_CORR
                        Maximum corrections used to approximate the inverse
                        Hessian. (type: int, default: 100)
  -s, --shrink REMAP_DECIMATE
                        Downscaling factor for remapping image (type: int,
                        default: 16)
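
The -ri, -ti and -ci options describe how one flat parameter vector is partitioned: each default is a (start, stop) pair used as a slice. Here is a minimal sketch of that layout, assuming the vector is a plain Python list and using made-up values. The constant names mirror the option metavars, and split_params is a hypothetical helper, not a page-dewarp function.

```python
# Default (start, stop) pairs taken from the help text
RVEC_IDX = (0, 3)   # rotation vector
TVEC_IDX = (3, 6)   # translation vector
CUBIC_IDX = (6, 8)  # the two cubic slopes


def split_params(params):
    """Split a flat parameter list into its named parts."""
    rvec = params[slice(*RVEC_IDX)]
    tvec = params[slice(*TVEC_IDX)]
    cubic = params[slice(*CUBIC_IDX)]
    return rvec, tvec, cubic


# Illustrative values only; real parameters come from the optimiser
example = [0.1, 0.2, 0.3, 1.0, 2.0, 3.0, 0.05, -0.05]
rvec, tvec, cubic = split_params(example)
print(rvec, tvec, cubic)
```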

To try out an example image:

git clone https://github.com/lmmx/page-dewarp
cd page-dewarp
mkdir results && cd results
page-dewarp ../example_input/boston_cooking_a.jpg

See page-dewarp --help for the full list of options.
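
To drive the command-line tool from a script, one option is to assemble the argument list in Python and hand it to subprocess.run. This sketch uses only flags listed in the help output (-o, -f, -z). build_command and run_command are hypothetical helpers, and actually running the command requires page-dewarp to be installed and the input file to exist.

```python
import subprocess


def build_command(images, output_dir=".", output_format="png", zoom=1.0):
    """Assemble a page-dewarp argument list from documented flags."""
    if not images:
        raise ValueError("at least one image path is required")
    return [
        "page-dewarp",
        "-o", str(output_dir),   # directory for output and debug images
        "-f", output_format,     # output image format
        "-z", str(zoom),         # zoom relative to the original image
        *map(str, images),
    ]


def run_command(cmd):
    """Run the assembled command, raising if it exits with an error."""
    return subprocess.run(cmd, check=True)


cmd = build_command(["../example_input/boston_cooking_a.jpg"])
print(" ".join(cmd))
# run_command(cmd)  # uncomment once page-dewarp is installed
```

Passing the arguments as a list avoids shell quoting issues with file names that contain spaces.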

Background

This library was renovated from the original (2016) Python 2 script by Matt Zucker. The curved page of a book lying on a flat surface can be modelled as a cubic curve fixed to zero at its endpoints (see Matt's original writeup and derive_cubic.py for the derivation).
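
For intuition, a cubic that is zero at both ends of the unit interval is fully determined by its two end slopes. Writing the slopes as alpha at x = 0 and beta at x = 1 (symbol names chosen here for illustration), the conditions f(0) = f(1) = 0, f'(0) = alpha and f'(1) = beta give f(x) = (alpha + beta) x^3 - (2 alpha + beta) x^2 + alpha x. The sketch below evaluates that polynomial in plain Python. It illustrates the model only and is not the library's code; the function names are hypothetical.

```python
def cubic_coeffs(alpha, beta):
    """Coefficients (a, b, c, d) of a*x^3 + b*x^2 + c*x + d.

    Chosen so that f(0) = f(1) = 0, f'(0) = alpha and f'(1) = beta.
    """
    return (alpha + beta, -(2 * alpha + beta), alpha, 0.0)


def cubic_height(x, alpha, beta):
    """Height of the curve at x, evaluated with Horner's rule."""
    a, b, c, d = cubic_coeffs(alpha, beta)
    return ((a * x + b) * x + c) * x + d


def cubic_slope(x, alpha, beta):
    """First derivative of the curve at x."""
    a, b, c, _ = cubic_coeffs(alpha, beta)
    return (3 * a * x + 2 * b) * x + c


# Illustrative slopes: rising at the left edge, falling at the right
alpha, beta = 0.2, -0.1
for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"x={x:.2f}  height={cubic_height(x, alpha, beta):+.4f}")
```

The two slopes correspond to the pair of values that the -ci option indexes in the parameter vector.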
