A robust Python wrapper for high-performance PDF rendering and text extraction, built on a C++ core.

WinnerZ Python Library Documentation

Overview

The winnerz library is a robust Python wrapper designed for processing, rendering, and manipulating PDF documents. It relies on a high-performance C++ core extension (winnerz_core) for intensive operations while providing seamless fallback mechanisms and caching strategies in Python.

The architecture emphasizes reliability and fault-tolerance, specifically in handling binary dependencies, dynamic core library loading, and flexible preview rendering backends (PDFium and Playwright).

Architecture

The system is divided into several conceptual layers:

  1. Core Loader & Diagnostics: Handles the dynamic importing of the C++ binary (winnerz_core), including binary size verification, truncation repair, and Windows DLL directory management.
  2. Document Object Model: Provides Pythonic abstractions (Document, Page) to interact with PDF files, managing resources and state safely.
  3. Rendering Pipeline: Integrates the C++ rendering engine with fallback Python-based preview engines using pypdfium2 or playwright.
  4. Geometry & Data Structures: Implements domain-specific types (Rect, Matrix, Pixmap) to standardize data flow between the C++ layer and Python runtime.

Core Loading Mechanism

The library initializes the C++ binary through _load_core(). This system provides the following safety guarantees:

  • Thread Safety: Uses threading.Lock() to ensure the core is initialized exactly once.
  • Retry Logic: Implements a retry loop (_CORE_IMPORT_RETRIES = 3) to mitigate transient filesystem or OS-level loading issues.
  • Self-Healing: If a truncated binary is detected (e.g., due to an interrupted build or copy), _try_repair_truncated_core_binary() attempts to restore it from other valid candidate binaries in the directory.
  • Diagnostic Reporting: Generates detailed error messages specifying binary ABI mismatches (e.g., GLIBC mismatches) or binary sizes to accelerate debugging.
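
The lock-plus-retry pattern described above can be sketched as follows. This is a minimal illustration, not the library's actual _load_core(): the function name load_core, the retry_delay parameter, and the error message wording are assumptions; only _CORE_IMPORT_RETRIES = 3 and the use of threading.Lock() come from the text.

```python
import importlib
import threading
import time

_CORE_IMPORT_RETRIES = 3        # value stated in the documentation
_core_lock = threading.Lock()   # guards one-time initialization
_core_module = None             # cached module after the first success


def load_core(module_name="winnerz_core", retry_delay=0.05):
    """Import the core module once, retrying on ImportError.

    Hypothetical helper: the name and the retry_delay parameter are
    illustrative and not part of the documented API.
    """
    global _core_module
    with _core_lock:
        # A second caller sees the cached module and returns at once.
        if _core_module is not None:
            return _core_module
        last_error = None
        for attempt in range(1, _CORE_IMPORT_RETRIES + 1):
            try:
                _core_module = importlib.import_module(module_name)
                return _core_module
            except ImportError as exc:
                last_error = exc
                if attempt < _CORE_IMPORT_RETRIES:
                    time.sleep(retry_delay)
        # Report how many attempts were made to speed up debugging.
        raise ImportError(
            f"could not import {module_name!r} after "
            f"{_CORE_IMPORT_RETRIES} attempts: {last_error}"
        ) from last_error
```

Holding the lock for the whole retry loop keeps concurrent callers from racing to import the binary; they wait and then reuse the cached result.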

Environment Variables

  • WINNERZ_PREVIEW_BACKEND: Controls the backend used for rendering preview data when the C++ core returns placeholder data.
    • Valid values: auto (default), pdfium, playwright.
    • Resolution order for auto: Tries PDFium first, then falls back to Playwright if PDFium is unavailable.
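
One way the backend selection could work is sketched below. The function resolve_preview_backend, its parameters, and the handling of invalid values (raising ValueError) are assumptions made for illustration; the documentation only specifies the variable name, its valid values, and that PDFium is preferred over Playwright.

```python
import importlib.util
import os

VALID_BACKENDS = ("auto", "pdfium", "playwright")


def _is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


def resolve_preview_backend(environ=os.environ, is_installed=_is_installed):
    """Pick a preview backend name, or None if no backend is available.

    Hypothetical helper: not part of the documented winnerz API.
    """
    requested = environ.get("WINNERZ_PREVIEW_BACKEND", "auto").strip().lower()
    if requested not in VALID_BACKENDS:
        # Assumption: reject unknown values instead of silently ignoring them.
        raise ValueError(f"unsupported preview backend: {requested!r}")
    if requested != "auto":
        # An explicit choice is returned as-is, without an availability check.
        return requested
    # "auto": prefer PDFium, then Playwright.
    if is_installed("pypdfium2"):
        return "pdfium"
    if is_installed("playwright"):
        return "playwright"
    return None
```

Passing environ and is_installed as parameters keeps the selection logic testable without touching the real environment.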

Class Reference

Document

Represents a PDF document instance. It manages the lifecycle of the underlying file; if the document is encrypted, it is decrypted automatically with PDFium into a temporary file.

Constructor:

  • Document(path): Resolves the path, checks for encryption, and determines the page count via the C++ core.

Methods:

  • __getitem__(index): Retrieves a Page object at the specified 0-based index. Supports negative indexing.
  • __len__(): Returns the total number of pages in the document.
  • close(): Cleans up temporary resources, such as decrypted temporary files.
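
Negative indexing of the kind __getitem__ supports is usually implemented by normalizing the index against the page count. The helper below is a sketch of that idea; the name normalize_page_index is hypothetical, and raising IndexError for out-of-range values is an assumption modeled on Python lists, since the documentation does not say how winnerz reports such errors.

```python
def normalize_page_index(index, page_count):
    """Map a possibly negative page index to a 0-based position.

    Hypothetical helper, mirroring how Python sequences treat indices.
    """
    if not isinstance(index, int):
        raise TypeError("page index must be an integer")
    if index < 0:
        # -1 refers to the last page, -2 to the one before it, and so on.
        index += page_count
    if not 0 <= index < page_count:
        raise IndexError("page index out of range")
    return index
```

With a five-page document, an index of -1 resolves to page 4 and an index of 5 is rejected.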

Page

Represents a single page within a Document.

Methods:

  • get_text(mode="dict", sort=False): Extracts text content.
    • mode: Can be dict, rawdict, blocks, or text.
  • get_drawings(): Extracts vector drawings and graphics, mapping them to structured dictionaries containing rect, fill, and stroke properties.
  • get_pixmap(matrix=None, clip=None): Renders the page to a bitmap image (Pixmap). It attempts to render using the C++ core; if that fails or returns a placeholder, it falls back to the configured Python preview backend (PDFium or Playwright).
  • redact_text(rects, output_path, min_overlap_ratio=0.0): Applies redaction to the specified rectangles and saves the output to a new PDF file.
  • rect (Property): Retrieves the bounding box of the page as a Rect.
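
A typical use of the Document and Page methods above is to collect the plain text of every page. The sketch below relies only on the documented __len__, __getitem__, get_text(mode=...), and close(); the helper name extract_page_texts is hypothetical, and it assumes that mode="text" yields one value per page.

```python
from contextlib import closing


def extract_page_texts(document):
    """Return the text of every page, closing the document afterwards.

    `document` is any object exposing the documented Document interface:
    len(), 0-based indexing that yields pages, and close().
    """
    # closing() calls document.close() even if extraction raises.
    with closing(document):
        return [
            document[i].get_text(mode="text")
            for i in range(len(document))
        ]
```

Assuming the package exposes Document at the top level, a call might look like extract_page_texts(winnerz.Document("report.pdf")).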

Pixmap

Represents an uncompressed image buffer containing pixel data.

Properties:

  • width, height: Dimensions in pixels.
  • n: Number of channels (e.g., 4 for RGBA).
  • stride: Number of bytes per row.
  • samples: Raw byte array of pixel data.

Methods:

  • pixel(x, y): Returns a tuple representing the pixel color at the specified coordinates.
  • tobytes(fmt="raw"): Encodes the pixmap to the requested format. Supported formats include raw, rgba, png, jpg, and jpeg. Output formats other than raw require the Pillow library.
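
The relationship between width, n, stride, and samples can be made concrete with a small lookup function. This is a sketch of how a pixel(x, y) accessor is commonly implemented for a row-major buffer, not the library's code; the function name pixel_at and the bounds-check behavior are assumptions.

```python
def pixel_at(samples, width, height, n, stride, x, y):
    """Return the n channel values of pixel (x, y) as a tuple of ints.

    Assumes a row-major buffer in which each row occupies `stride` bytes
    and each pixel occupies `n` consecutive bytes.
    """
    if not (0 <= x < width and 0 <= y < height):
        raise IndexError("pixel coordinates out of range")
    # stride may exceed width * n when rows are padded, so it must be
    # used for the row offset instead of width * n.
    offset = y * stride + x * n
    return tuple(samples[offset:offset + n])
```

For a 2x2 RGBA buffer with stride 8, pixel (1, 1) starts at byte offset 12.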

Geometry Classes

  • Rect(x0, y0, x1, y1): Represents a rectangle. Provides properties for width, height, and is_empty. Overloads the & operator to compute the intersection of two rectangles.
  • Matrix(sx=1.0, sy=1.0): Represents a 2D scaling matrix.
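
The intersection behavior of the & operator can be illustrated with a standalone class. SketchRect below is a hypothetical stand-in, not the library's Rect; in particular, treating a rectangle with non-positive width or height as empty is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchRect:
    """Illustrative rectangle with corners (x0, y0) and (x1, y1)."""
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def width(self):
        return self.x1 - self.x0

    @property
    def height(self):
        return self.y1 - self.y0

    @property
    def is_empty(self):
        # Assumption: zero or negative extent means "no area".
        return self.width <= 0 or self.height <= 0

    def __and__(self, other):
        if not isinstance(other, SketchRect):
            return NotImplemented
        # The overlap starts at the larger of the two minimum corners
        # and ends at the smaller of the two maximum corners.
        return SketchRect(
            max(self.x0, other.x0),
            max(self.y0, other.y0),
            min(self.x1, other.x1),
            min(self.y1, other.y1),
        )
```

Two rectangles that do not overlap produce a result whose is_empty is true, which gives callers a simple test for disjoint regions.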

Caching Strategy

The module implements file-based caching for document instances to minimize redundant initialization and file I/O operations.

  • Global Document Cache: Managed via open(path). Validates cache hits using file signature metrics (file size and modification time in nanoseconds).
  • Preview Document Cache: A separate caching layer (_open_preview_pdfium_doc) strictly for the pypdfium2 rendering backend to keep the preview document context alive across multiple page renders.
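
Cache validation by file signature can be sketched as follows. The names file_signature and open_cached and the factory parameter are hypothetical; only the signature components (file size and modification time in nanoseconds) come from the description above.

```python
import os

_cache = {}  # absolute path -> (signature, cached object)


def file_signature(path):
    """Return (size in bytes, modification time in nanoseconds)."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


def open_cached(path, factory):
    """Return a cached object for `path`, rebuilding it if the file changed.

    `factory` is a hypothetical callable that builds the object from a path.
    """
    key = os.path.abspath(path)
    signature = file_signature(key)
    entry = _cache.get(key)
    if entry is not None and entry[0] == signature:
        return entry[1]            # cache hit: file unchanged
    obj = factory(key)             # miss or stale entry: rebuild
    _cache[key] = (signature, obj)
    return obj
```

Comparing size and mtime is much cheaper than hashing file contents, at the cost of missing edits that preserve both values.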

Dependencies

  • pypdfium2: Optional. Used for decryption and as the primary preview rendering backend.
  • Pillow (PIL): Optional. Required for encoding Pixmap instances to PNG/JPEG and manipulating preview images.
  • playwright: Optional. Used as a secondary headless browser rendering backend if PDFium is unavailable.

Download files

Source Distribution

  • winnerz-1.0.5.tar.gz (37.1 MB): Source

Built Distributions

  • winnerz-1.0.5-cp312-cp312-win_amd64.whl (14.3 kB): CPython 3.12, Windows x86-64
  • winnerz-1.0.5-cp312-cp312-manylinux_2_28_x86_64.whl (7.3 MB): CPython 3.12, manylinux (glibc 2.28+) x86-64
  • winnerz-1.0.5-cp312-cp312-macosx_11_0_arm64.whl (1.3 MB): CPython 3.12, macOS 11.0+ ARM64
  • winnerz-1.0.5-cp312-cp312-macosx_10_9_x86_64.whl (1.2 MB): CPython 3.12, macOS 10.9+ x86-64
  • winnerz-1.0.5-cp311-cp311-win_amd64.whl (14.3 kB): CPython 3.11, Windows x86-64
  • winnerz-1.0.5-cp311-cp311-manylinux_2_28_x86_64.whl (6.3 MB): CPython 3.11, manylinux (glibc 2.28+) x86-64
  • winnerz-1.0.5-cp311-cp311-macosx_11_0_arm64.whl (1.3 MB): CPython 3.11, macOS 11.0+ ARM64
  • winnerz-1.0.5-cp311-cp311-macosx_10_9_x86_64.whl (1.2 MB): CPython 3.11, macOS 10.9+ x86-64
  • winnerz-1.0.5-cp310-cp310-win_amd64.whl (14.3 kB): CPython 3.10, Windows x86-64
  • winnerz-1.0.5-cp310-cp310-manylinux_2_28_x86_64.whl (5.4 MB): CPython 3.10, manylinux (glibc 2.28+) x86-64
  • winnerz-1.0.5-cp310-cp310-macosx_11_0_arm64.whl (1.3 MB): CPython 3.10, macOS 11.0+ ARM64
  • winnerz-1.0.5-cp310-cp310-macosx_10_9_x86_64.whl (1.2 MB): CPython 3.10, macOS 10.9+ x86-64
  • winnerz-1.0.5-cp39-cp39-win_amd64.whl (14.3 kB): CPython 3.9, Windows x86-64
  • winnerz-1.0.5-cp39-cp39-manylinux_2_28_x86_64.whl (4.4 MB): CPython 3.9, manylinux (glibc 2.28+) x86-64
  • winnerz-1.0.5-cp39-cp39-macosx_11_0_arm64.whl (1.3 MB): CPython 3.9, macOS 11.0+ ARM64
  • winnerz-1.0.5-cp39-cp39-macosx_10_9_x86_64.whl (1.2 MB): CPython 3.9, macOS 10.9+ x86-64