
A robust Python wrapper for high-performance PDF rendering and text extraction using a C++ core.


WinnerZ Python Library Documentation

Overview

The winnerz library is a robust Python wrapper designed for processing, rendering, and manipulating PDF documents. It relies on a high-performance C++ core extension (winnerz_core) for intensive operations while providing seamless fallback mechanisms and caching strategies in Python.

The architecture emphasizes reliability and fault-tolerance, specifically in handling binary dependencies, dynamic core library loading, and flexible preview rendering backends (PDFium and Playwright).

Architecture

The system is divided into several conceptual layers:

  1. Core Loader & Diagnostics: Handles the dynamic importing of the C++ binary (winnerz_core), including binary size verification, truncation repair, and Windows DLL directory management.
  2. Document Object Model: Provides Pythonic abstractions (Document, Page) to interact with PDF files, managing resources and state safely.
  3. Rendering Pipeline: Integrates the C++ rendering engine with fallback Python-based preview engines using pypdfium2 or playwright.
  4. Geometry & Data Structures: Implements domain-specific types (Rect, Matrix, Pixmap) to standardize data flow between the C++ layer and Python runtime.

Core Loading Mechanism

The library initializes the C++ binary through _load_core(). This system provides the following safety guarantees:

  • Thread Safety: Uses threading.Lock() to ensure the core is initialized exactly once.
  • Retry Logic: Implements a retry loop (_CORE_IMPORT_RETRIES = 3) to mitigate transient filesystem or OS-level loading issues.
  • Self-Healing: If a truncated binary is detected (e.g., due to an interrupted build or copy), _try_repair_truncated_core_binary() attempts to restore it from other valid candidate binaries in the directory.
  • Diagnostic Reporting: Generates detailed error messages specifying binary ABI mismatches (e.g., GLIBC mismatches) or binary sizes to accelerate debugging.
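
The thread-safe, retrying import described above can be sketched as follows. This is a minimal illustration, not the library's actual _load_core(): the CoreLoader class, its parameters, and the injectable importer are hypothetical names introduced here, and the truncation repair step is left out.

```python
import importlib
import threading
import time


class CoreLoader:
    """Load a module at most once, retrying transient import failures.

    Hypothetical helper for illustration; not part of winnerz.
    """

    def __init__(self, module_name="winnerz_core", retries=3, delay=0.05,
                 importer=importlib.import_module):
        self._module_name = module_name
        self._retries = retries          # mirrors _CORE_IMPORT_RETRIES = 3
        self._delay = delay              # pause between attempts, in seconds
        self._importer = importer        # injectable so the loop can be tested
        self._lock = threading.Lock()
        self._module = None

    def load(self):
        # The lock serializes callers, so only one thread runs the import loop.
        with self._lock:
            if self._module is not None:
                return self._module
            last_error = None
            for attempt in range(1, self._retries + 1):
                try:
                    self._module = self._importer(self._module_name)
                    return self._module
                except (ImportError, OSError) as exc:
                    last_error = exc
                    if attempt < self._retries:
                        time.sleep(self._delay)
            raise ImportError(
                f"could not load {self._module_name!r} after "
                f"{self._retries} attempts: {last_error}"
            ) from last_error
```

Holding the lock for the whole loop keeps the sketch simple: a second thread waits for the first attempt to finish and then reuses the cached module instead of starting its own import.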

Environment Variables

  • WINNERZ_PREVIEW_BACKEND: Controls the backend used for rendering preview data when the C++ core returns placeholder data.
    • Valid values: auto (default), pdfium, playwright.
    • Resolution order for auto: Tries PDFium first and falls back to Playwright if PDFium is unavailable.
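
One way to picture the resolution logic is the sketch below. It assumes that "availability" means the backend's Python package can be imported and that an unknown value is rejected; both are assumptions, and resolve_preview_backend is a hypothetical name, not a function exported by the library.

```python
import importlib.util
import os

# Assumed mapping from backend name to the package that provides it.
_BACKEND_MODULES = {"pdfium": "pypdfium2", "playwright": "playwright"}


def _installed(module_name):
    """Return True if the top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


def resolve_preview_backend(environ=None, is_available=_installed):
    """Pick a preview backend from WINNERZ_PREVIEW_BACKEND (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    choice = (environ.get("WINNERZ_PREVIEW_BACKEND") or "auto").strip().lower()
    if choice == "auto":
        # PDFium is tried first, Playwright second.
        for backend in ("pdfium", "playwright"):
            if is_available(_BACKEND_MODULES[backend]):
                return backend
        return None  # no preview backend is installed
    if choice not in _BACKEND_MODULES:
        raise ValueError(f"unsupported preview backend: {choice!r}")
    return choice
```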

Class Reference

Document

Represents a PDF document instance and manages the lifecycle of the underlying file. If the document is encrypted, it is decrypted automatically with PDFium into a temporary file.

Constructor:

  • Document(path): Resolves the path, checks for encryption, and determines the page count via the C++ core.

Methods:

  • __getitem__(index): Retrieves a Page object at the specified 0-based index. Supports negative indexing.
  • __len__(): Returns the total number of pages in the document.
  • close(): Cleans up temporary resources, such as decrypted temporary files.
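
The indexing behavior listed above (0-based, with negative indices counting from the end) can be illustrated with a small stand-in class. PageSequence is hypothetical and returns plain page numbers instead of Page objects; it only models how an index might be normalized.

```python
class PageSequence:
    """Stand-in for a document that models only __len__ and __getitem__."""

    def __init__(self, page_count):
        self._page_count = page_count

    def __len__(self):
        return self._page_count

    def __getitem__(self, index):
        if not isinstance(index, int):
            raise TypeError("page index must be an integer")
        if index < 0:
            index += self._page_count  # -1 refers to the last page
        if not 0 <= index < self._page_count:
            raise IndexError("page index out of range")
        return index  # a real implementation would return a Page here
```

With a three-page sequence, index -1 resolves to page 2 and index 3 raises IndexError.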

Page

Represents a single page within a Document.

Methods:

  • get_text(mode="dict", sort=False): Extracts text content.
    • mode: Can be dict, rawdict, blocks, or text.
  • get_drawings(): Extracts vector drawings and graphics, mapping them to structured dictionaries containing rect, fill, and stroke properties.
  • get_pixmap(matrix=None, clip=None): Renders the page to a bitmap image (Pixmap). It attempts to render using the C++ core; if that fails or returns a placeholder, it falls back to the configured Python preview backend (PDFium or Playwright).
  • redact_text(rects, output_path, min_overlap_ratio=0.0): Applies redaction to the specified rectangles and saves the output to a new PDF file.
  • rect (Property): Retrieves the bounding box of the page as a Rect.
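
The fallback order described for get_pixmap can be sketched with plain callables. How the library detects placeholder data is not documented here, so the check is passed in as a predicate; render_with_fallback and its parameters are hypothetical names.

```python
def render_with_fallback(render_core, render_preview, is_placeholder):
    """Try the core renderer first and use the preview backend otherwise.

    render_core and render_preview are zero-argument callables returning an
    image object; is_placeholder decides whether the core result is usable.
    """
    try:
        result = render_core()
    except Exception:
        result = None  # a failed core render is treated like a missing result
    if result is None or is_placeholder(result):
        return render_preview()
    return result
```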

Pixmap

Represents an uncompressed image buffer containing pixel data.

Properties:

  • width, height: Dimensions in pixels.
  • n: Number of channels (e.g., 4 for RGBA).
  • stride: Number of bytes per row.
  • samples: Raw byte array of pixel data.

Methods:

  • pixel(x, y): Returns a tuple representing the pixel color at the specified coordinates.
  • tobytes(fmt="raw"): Encodes the pixmap to the requested format. Supported formats include raw, rgba, png, jpg, and jpeg. Output formats other than raw require the Pillow library.
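
The relationship between stride, n, and samples can be shown with a small stand-in. The sketch assumes a row-major layout with interleaved channels, which the description does not state explicitly; SimplePixmap is a hypothetical class, not the library's Pixmap.

```python
from dataclasses import dataclass


@dataclass
class SimplePixmap:
    """Minimal pixel buffer: rows of `stride` bytes, `n` bytes per pixel."""

    width: int
    height: int
    n: int
    stride: int
    samples: bytes

    def pixel(self, x, y):
        if not (0 <= x < self.width and 0 <= y < self.height):
            raise IndexError("pixel coordinates out of range")
        # Skip y full rows, then x pixels of n bytes each within the row.
        offset = y * self.stride + x * self.n
        return tuple(self.samples[offset:offset + self.n])
```

Using stride rather than width * n for the row offset matters when rows carry padding bytes, in which case stride is larger than width * n.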

Geometry Classes

  • Rect(x0, y0, x1, y1): Represents a rectangle. Provides properties for width, height, and is_empty. Overloads the & operator to compute the intersection of two rectangles.
  • Matrix(sx=1.0, sy=1.0): Represents a 2D scaling matrix.
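
Rectangle intersection through the & operator can be sketched as below. SketchRect is a hypothetical stand-in; in particular, returning an empty rectangle for disjoint inputs is an assumption about how such a type could behave, not a statement about the library's Rect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchRect:
    """Axis-aligned rectangle with corners (x0, y0) and (x1, y1)."""

    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def width(self):
        return self.x1 - self.x0

    @property
    def height(self):
        return self.y1 - self.y0

    @property
    def is_empty(self):
        return self.x1 <= self.x0 or self.y1 <= self.y0

    def __and__(self, other):
        if not isinstance(other, SketchRect):
            return NotImplemented
        # The overlap starts at the larger lower bound and ends at the
        # smaller upper bound on each axis.
        return SketchRect(
            max(self.x0, other.x0),
            max(self.y0, other.y0),
            min(self.x1, other.x1),
            min(self.y1, other.y1),
        )
```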

Caching Strategy

The module implements file-based caching for document instances to minimize redundant initialization and file I/O operations.

  • Global Document Cache: Managed via open(path). Validates cache hits using file signature metrics (file size and modification time in nanoseconds).
  • Preview Document Cache: A separate caching layer (_open_preview_pdfium_doc) strictly for the pypdfium2 rendering backend to keep the preview document context alive across multiple page renders.
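
Signature-based cache validation can be sketched as follows. The code keys the cache on the absolute path and compares file size and modification time in nanoseconds, as described above; open_cached, file_signature, and the factory argument are hypothetical names, and the sketch does not handle eviction or closing replaced entries.

```python
import os

# Maps absolute path -> (signature, cached object).
_document_cache = {}


def file_signature(path):
    """Return (size in bytes, modification time in nanoseconds) for a file."""
    stat = os.stat(path)
    return (stat.st_size, stat.st_mtime_ns)


def open_cached(path, factory):
    """Return a cached object for path, rebuilding it if the file changed."""
    key = os.path.abspath(path)
    signature = file_signature(key)
    entry = _document_cache.get(key)
    if entry is not None and entry[0] == signature:
        return entry[1]  # cache hit: the file looks unchanged
    obj = factory(key)
    _document_cache[key] = (signature, obj)
    return obj
```

Comparing size and modification time is cheap but not a content check: a file rewritten with the same size within the timestamp resolution of the filesystem would still count as a hit.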

Dependencies

  • pypdfium2: Optional. Used for decryption and as the primary preview rendering backend.
  • Pillow (PIL): Optional. Required for encoding Pixmap instances to PNG/JPEG and manipulating preview images.
  • playwright: Optional. Used as a secondary headless browser rendering backend if PDFium is unavailable.

