A robust Python wrapper for high-performance PDF rendering and text extraction using a C++ core.
WinnerZ Python Library Documentation
Overview
The winnerz library is a robust Python wrapper designed for processing, rendering, and manipulating PDF documents. It relies on a high-performance C++ core extension (winnerz_core) for intensive operations while providing seamless fallback mechanisms and caching strategies in Python.
The architecture emphasizes reliability and fault-tolerance, specifically in handling binary dependencies, dynamic core library loading, and flexible preview rendering backends (PDFium and Playwright).
Architecture
The system is divided into several conceptual layers:
- Core Loader & Diagnostics: Handles the dynamic importing of the C++ binary (winnerz_core), including binary size verification, truncation repair, and Windows DLL directory management.
- Document Object Model: Provides Pythonic abstractions (Document, Page) to interact with PDF files, managing resources and state safely.
- Rendering Pipeline: Integrates the C++ rendering engine with fallback Python-based preview engines using pypdfium2 or playwright.
- Geometry & Data Structures: Implements domain-specific types (Rect, Matrix, Pixmap) to standardize data flow between the C++ layer and Python runtime.
Core Loading Mechanism
The library initializes the C++ binary through _load_core(). This system provides the following safety guarantees:
- Thread Safety: Uses threading.Lock() to ensure the core is initialized exactly once.
- Retry Logic: Implements a retry loop (_CORE_IMPORT_RETRIES = 3) to mitigate transient filesystem or OS-level loading issues.
- Self-Healing: If a truncated binary is detected (e.g., due to an interrupted build or copy), _try_repair_truncated_core_binary() attempts to restore it from other valid candidate binaries in the directory.
- Diagnostic Reporting: Generates detailed error messages specifying binary ABI mismatches (e.g., GLIBC mismatches) or binary sizes to accelerate debugging.
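The internals of _load_core() are not shown here, so the following is only a minimal sketch of the lock-plus-retry pattern described above. The names load_core, _core_lock, _core_module, importer, and delay are hypothetical, and the truncation-repair step is left out.

```python
import importlib
import threading
import time

_CORE_IMPORT_RETRIES = 3        # value quoted in the list above
_core_lock = threading.Lock()   # hypothetical name
_core_module = None             # hypothetical cache slot for the loaded core


def load_core(module_name="winnerz_core", importer=importlib.import_module, delay=0.05):
    """Import the core module once, retrying on ImportError/OSError."""
    global _core_module
    with _core_lock:
        # A second caller that waited on the lock sees the cached module.
        if _core_module is not None:
            return _core_module
        last_error = None
        for attempt in range(1, _CORE_IMPORT_RETRIES + 1):
            try:
                _core_module = importer(module_name)
                return _core_module
            except (ImportError, OSError) as exc:
                last_error = exc
                if attempt < _CORE_IMPORT_RETRIES:
                    time.sleep(delay)  # brief pause before the next attempt
        raise ImportError(
            f"could not load {module_name!r} after {_CORE_IMPORT_RETRIES} attempts"
        ) from last_error
```

Passing the importer as a parameter is a choice made for this sketch so the retry path can be exercised without a real binary.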
Environment Variables
- WINNERZ_PREVIEW_BACKEND: Controls the backend used for rendering preview data when the C++ core returns placeholder data.
  - Valid values: auto (default), pdfium, playwright.
  - Resolution order for auto: Falls back from PDFium to Playwright based on availability.
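As an illustration of the resolution order, here is a hedged sketch of how such a variable could be interpreted. The function resolve_preview_backend and its parameters are hypothetical, and raising ValueError on an unknown value is an assumption, not documented behavior.

```python
import importlib.util
import os

VALID_BACKENDS = ("auto", "pdfium", "playwright")
# Import names of the optional packages named in this document.
_BACKEND_MODULES = {"pdfium": "pypdfium2", "playwright": "playwright"}


def _is_installed(module_name):
    # find_spec returns None when a top-level module cannot be found.
    return importlib.util.find_spec(module_name) is not None


def resolve_preview_backend(environ=os.environ, is_installed=_is_installed):
    """Return 'pdfium', 'playwright', or None if no usable backend exists."""
    raw = environ.get("WINNERZ_PREVIEW_BACKEND", "auto").strip().lower() or "auto"
    if raw not in VALID_BACKENDS:
        # Assumption: the real library may instead ignore or warn on bad values.
        raise ValueError(f"unknown preview backend: {raw!r}")
    # 'auto' tries PDFium first, then Playwright.
    candidates = ["pdfium", "playwright"] if raw == "auto" else [raw]
    for name in candidates:
        if is_installed(_BACKEND_MODULES[name]):
            return name
    return None
```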
Class Reference
Document
Represents a PDF document instance. It manages the lifecycle of the underlying file, including automatic decryption via a temporary file using PDFium if the document is encrypted.
Constructor:
Document(path): Resolves the path, checks for encryption, and determines the page count via the C++ core.
Methods:
- __getitem__(index): Retrieves a Page object at the specified 0-based index. Supports negative indexing.
- __len__(): Returns the total number of pages in the document.
- close(): Cleans up temporary resources, such as decrypted temporary files.
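To make the indexing contract concrete, the stand-in class below mimics 0-based access with negative indices. SketchDocument is a hypothetical illustration that stores plain labels instead of Page objects; it is not the library's Document, and raising IndexError for out-of-range values is an assumption.

```python
class SketchDocument:
    """Hypothetical stand-in: holds page labels rather than real pages."""

    def __init__(self, pages):
        self._pages = list(pages)

    def __len__(self):
        return len(self._pages)

    def __getitem__(self, index):
        count = len(self)
        if index < 0:
            index += count  # -1 maps to the last page
        if not 0 <= index < count:
            raise IndexError("page index out of range")
        return self._pages[index]


doc = SketchDocument(["cover", "body", "appendix"])
first, last = doc[0], doc[-1]
```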
Page
Represents a single page within a Document.
Methods:
- get_text(mode="dict", sort=False): Extracts text content.
  - mode: Can be dict, rawdict, blocks, or text.
- get_drawings(): Extracts vector drawings and graphics, mapping them to structured dictionaries containing rect, fill, and stroke properties.
- get_pixmap(matrix=None, clip=None): Renders the page to a bitmap image (Pixmap). It attempts to render using the C++ core; if that fails or returns a placeholder, it falls back to the configured Python preview backend (PDFium or Playwright).
- redact_text(rects, output_path, min_overlap_ratio=0.0): Applies redaction to the specified rectangles and saves the output to a new PDF file.
- rect (Property): Retrieves the bounding box of the page as a Rect.
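The fallback behavior described for get_pixmap can be pictured with a small control-flow sketch. The function render_with_fallback and its three callable parameters are hypothetical; how the library actually detects a placeholder is not specified in this document.

```python
def render_with_fallback(render_core, render_preview, is_placeholder):
    """Try the core renderer first; use the preview backend on failure or placeholder."""
    try:
        pixmap = render_core()
    except Exception:
        # Any core failure is treated as "no result" in this sketch.
        pixmap = None
    if pixmap is None or is_placeholder(pixmap):
        return render_preview()
    return pixmap
```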
Pixmap
Represents an uncompressed image buffer containing pixel data.
Properties:
- width, height: Dimensions in pixels.
- n: Number of channels (e.g., 4 for RGBA).
- stride: Number of bytes per row.
- samples: Raw byte array of pixel data.
Methods:
- pixel(x, y): Returns a tuple representing the pixel color at the specified coordinates.
- tobytes(fmt="raw"): Encodes the pixmap to the requested format. Supported formats include raw, rgba, png, jpg, and jpeg. Output formats other than raw require the Pillow library.
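The properties above are enough to locate one pixel in the buffer. The sketch below assumes a row-major layout with interleaved channels, which is the usual reading of stride and n but is not confirmed by this document; pixel_at is a hypothetical helper, not the library's pixel() method.

```python
def pixel_at(samples, width, height, n, stride, x, y):
    """Return the n channel values of pixel (x, y) as a tuple."""
    if not (0 <= x < width and 0 <= y < height):
        raise IndexError("pixel coordinates out of range")
    # Skip y full rows (stride bytes each), then x pixels of n bytes each.
    offset = y * stride + x * n
    return tuple(samples[offset:offset + n])


# 2x2 RGBA image whose rows are padded to 10 bytes (stride > width * n).
padded = bytes(range(20))
bottom_right = pixel_at(padded, width=2, height=2, n=4, stride=10, x=1, y=1)
```

Using stride rather than width * n for the row step keeps the lookup correct when rows carry padding bytes.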
Geometry Classes
- Rect(x0, y0, x1, y1): Represents a rectangle. Provides properties for width, height, and is_empty. Overloads the & operator to compute the intersection of two rectangles.
- Matrix(sx=1.0, sy=1.0): Represents a 2D scaling matrix.
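A rectangle type with these properties might look like the sketch below. SketchRect is a hypothetical stand-in; in particular, representing a disjoint intersection as an inverted rectangle whose is_empty is True is an assumption about behavior the document does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchRect:
    """Hypothetical stand-in for Rect(x0, y0, x1, y1)."""
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def width(self):
        return self.x1 - self.x0

    @property
    def height(self):
        return self.y1 - self.y0

    @property
    def is_empty(self):
        # Zero or negative extent on either axis counts as empty.
        return self.x1 <= self.x0 or self.y1 <= self.y0

    def __and__(self, other):
        # Intersection: the larger of the minima and the smaller of the maxima.
        return SketchRect(
            max(self.x0, other.x0), max(self.y0, other.y0),
            min(self.x1, other.x1), min(self.y1, other.y1),
        )


overlap = SketchRect(0, 0, 10, 10) & SketchRect(5, 5, 20, 20)
```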
Caching Strategy
The module implements file-based caching for document instances to minimize redundant initialization and file I/O operations.
- Global Document Cache: Managed via open(path). Validates cache hits using file signature metrics (file size and modification time in nanoseconds).
- Preview Document Cache: A separate caching layer (_open_preview_pdfium_doc) strictly for the pypdfium2 rendering backend to keep the preview document context alive across multiple page renders.
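The signature check can be sketched with os.stat, which exposes both the size and the nanosecond modification time. The names open_cached, file_signature, _cache, and the factory parameter are hypothetical; the library's own open(path) may be structured differently.

```python
import os

_cache = {}  # hypothetical: absolute path -> (signature, document)


def file_signature(path):
    """Return (size in bytes, mtime in nanoseconds) for a file."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


def open_cached(path, factory):
    """Return a cached document unless the file on disk has changed."""
    key = os.path.abspath(path)
    signature = file_signature(key)
    hit = _cache.get(key)
    if hit is not None and hit[0] == signature:
        return hit[1]          # cache hit: same size and mtime
    document = factory(key)    # miss or stale entry: rebuild
    _cache[key] = (signature, document)
    return document
```

Comparing both size and modification time catches most in-place edits without reading file contents, at the cost of missing a rewrite that preserves both values.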
Dependencies
- pypdfium2: Optional. Used for decryption and as the primary preview rendering backend.
- Pillow (PIL): Optional. Required for encoding Pixmap instances to PNG/JPEG and manipulating preview images.
- playwright: Optional. Used as a secondary headless browser rendering backend if PDFium is unavailable.
File details
Details for the file winnerz-1.0.5.tar.gz.
File metadata
- Download URL: winnerz-1.0.5.tar.gz
- Upload date:
- Size: 37.1 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d951678b35bb7c1fdf95e9388f8693ccdecf2acb3f4875a5c4bad1117759ae74 |
| MD5 | c1a056b057a4b0447ee8b6ad8354237f |
| BLAKE2b-256 | 1889c03fd43beb56a980e8dcda3564068fc3252d1dd050ea08136e328b95cdc4 |
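A downloaded file can be checked against its listed SHA256 digest with the standard library. The helper names sha256_of_file and matches_published_digest are hypothetical, and the path passed in is whatever local path the file was saved to.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare the file's SHA256 with a hex digest copied from the table."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())
```

For example, matches_published_digest("winnerz-1.0.5.tar.gz", "d951678b35bb7c1fdf95e9388f8693ccdecf2acb3f4875a5c4bad1117759ae74") would return True only if the local archive is byte-identical to the published one.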
File details
Details for the file winnerz-1.0.5-cp312-cp312-win_amd64.whl.
File metadata
- Download URL: winnerz-1.0.5-cp312-cp312-win_amd64.whl
- Upload date:
- Size: 14.3 kB
- Tags: CPython 3.12, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 94c2c45bc1836e9b9e44139f7d46544b8369c861e9ce4d0edde8f51d37cb862c |
| MD5 | bb31c1fef249e95c7a64f4f52a07b02f |
| BLAKE2b-256 | a71c392bbd14c6ef8bd78964a11ee02a1641a96e29ab3442f040a3bea2a441b3 |
File details
Details for the file winnerz-1.0.5-cp312-cp312-manylinux_2_28_x86_64.whl.
File metadata
- Download URL: winnerz-1.0.5-cp312-cp312-manylinux_2_28_x86_64.whl
- Upload date:
- Size: 7.3 MB
- Tags: CPython 3.12, manylinux: glibc 2.28+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3ca966c2dddb5da99bb20c43de0dad36a15d42755f8ef3c7eb9399b7a43ee2ea |
| MD5 | 35caca0a7fcdb3e043e889421a1f7f30 |
| BLAKE2b-256 | 1ea4b23eaac64f67094cba75733787416acdcc1878b62d900cb4e530833ce7a5 |
File details
Details for the file winnerz-1.0.5-cp312-cp312-macosx_11_0_arm64.whl.
File metadata
- Download URL: winnerz-1.0.5-cp312-cp312-macosx_11_0_arm64.whl
- Upload date:
- Size: 1.3 MB
- Tags: CPython 3.12, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e52993cf2f43bd99197570c4d4dd1227a7576827f29ee739cd082baeada6d0b7 |
| MD5 | 3163f2c02b20fabd8674fd1e5fce523f |
| BLAKE2b-256 | ed6d65e6c318264c83efefe07cb05801a72898608ee577707c8c17704a4cb963 |
File details
Details for the file winnerz-1.0.5-cp312-cp312-macosx_10_9_x86_64.whl.
File metadata
- Download URL: winnerz-1.0.5-cp312-cp312-macosx_10_9_x86_64.whl
- Upload date:
- Size: 1.2 MB
- Tags: CPython 3.12, macOS 10.9+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f057cba1648d25a819b109e5ff68968682f65e14a1ab2e5a2218b25d4b2af5d3 |
| MD5 | 22920ce7210cd0a35630587ab2fd51e4 |
| BLAKE2b-256 | c22dfd5bec46fed8c8530e60e9b9e3bccbb0d533373bf22094321d43755694f0 |
File details
Details for the file winnerz-1.0.5-cp311-cp311-win_amd64.whl.
File metadata
- Download URL: winnerz-1.0.5-cp311-cp311-win_amd64.whl
- Upload date:
- Size: 14.3 kB
- Tags: CPython 3.11, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 109496422044b630f25733c102175cb8a1de2d728d4f7cab03216127c0b7b713 |
| MD5 | c9a5708c75ba4345cd0055fce92cb1ca |
| BLAKE2b-256 | 095dd67c662fda4d2eb9964f67cbd4a0b6bab0d6edbe5edb0193e796a8cc850b |
File details
Details for the file winnerz-1.0.5-cp311-cp311-manylinux_2_28_x86_64.whl.
File metadata
- Download URL: winnerz-1.0.5-cp311-cp311-manylinux_2_28_x86_64.whl
- Upload date:
- Size: 6.3 MB
- Tags: CPython 3.11, manylinux: glibc 2.28+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 61e25e888761fdb784acffc4d86cefa3725fa60d71f634d70a652b3ce13a91b0 |
| MD5 | 8ece379a554b92ceb1a313c231c3def3 |
| BLAKE2b-256 | e55ec499ef701425532c325b99dc70b1b9d237c67bce4976516378f797c53e1e |
File details
Details for the file winnerz-1.0.5-cp311-cp311-macosx_11_0_arm64.whl.
File metadata
- Download URL: winnerz-1.0.5-cp311-cp311-macosx_11_0_arm64.whl
- Upload date:
- Size: 1.3 MB
- Tags: CPython 3.11, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | acbb373ca8aa4478bbc7eba5351248c8e3b49b7650d3b41d59186be058284a75 |
| MD5 | f24e8bd2237b1ad0454a3710c9c00fc7 |
| BLAKE2b-256 | 40872880e6aab5bc22849139c1e3f890271c5e2557feeef52346ee166a74bd54 |
File details
Details for the file winnerz-1.0.5-cp311-cp311-macosx_10_9_x86_64.whl.
File metadata
- Download URL: winnerz-1.0.5-cp311-cp311-macosx_10_9_x86_64.whl
- Upload date:
- Size: 1.2 MB
- Tags: CPython 3.11, macOS 10.9+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4daeabe4f57a905d2ecc116dabd33210e1c6e848344e471e61c36dddfbaba50a |
| MD5 | cdbb9ac0864b43721ecaa505a9b93b53 |
| BLAKE2b-256 | dbb6c0f6e84a1be3e8888853bd2453cee2467ae5b94f1b28468a9a55b687a507 |
File details
Details for the file winnerz-1.0.5-cp310-cp310-win_amd64.whl.
File metadata
- Download URL: winnerz-1.0.5-cp310-cp310-win_amd64.whl
- Upload date:
- Size: 14.3 kB
- Tags: CPython 3.10, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6b229bd00aa2b35fdefa5767b5693f3ddc5bd4bc27055ca33efba3b364675ee7 |
| MD5 | 8e5a43b8c2aa77ec47e7dc8b344640fd |
| BLAKE2b-256 | 2231a11af07e82b57df82d48faf58933f0d0d4f63d97965b8c85bf22ec3dafe4 |
File details
Details for the file winnerz-1.0.5-cp310-cp310-manylinux_2_28_x86_64.whl.
File metadata
- Download URL: winnerz-1.0.5-cp310-cp310-manylinux_2_28_x86_64.whl
- Upload date:
- Size: 5.4 MB
- Tags: CPython 3.10, manylinux: glibc 2.28+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f3ab2035ac2b167ce5bc4026dab597daba29efbadb6a1faa3e8a2bc999720307 |
| MD5 | 42cd1de97f7fecc90ea362aafb5c4674 |
| BLAKE2b-256 | 61a7fc48c0fa47ae5d9bf3d4ea725b625c60db2196a5366b4e44312f9d1a5cfe |
File details
Details for the file winnerz-1.0.5-cp310-cp310-macosx_11_0_arm64.whl.
File metadata
- Download URL: winnerz-1.0.5-cp310-cp310-macosx_11_0_arm64.whl
- Upload date:
- Size: 1.3 MB
- Tags: CPython 3.10, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 47864de01943289f4b404c232b099939100afcc709e1edecfb6d3876f6da493e |
| MD5 | e5cb0c474e760821ba7f2f35f162059d |
| BLAKE2b-256 | d81b8c6c90caca1c6ae93023aed32207fc05888d0611b094d0a62633f4adda28 |
File details
Details for the file winnerz-1.0.5-cp310-cp310-macosx_10_9_x86_64.whl.
File metadata
- Download URL: winnerz-1.0.5-cp310-cp310-macosx_10_9_x86_64.whl
- Upload date:
- Size: 1.2 MB
- Tags: CPython 3.10, macOS 10.9+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7e55720e766a88d8fe519813dbe5f7dbdc01931f1d408fc25d76c354fb2eb0d2 |
| MD5 | 3e585ca4866ff35af9fe23b774825341 |
| BLAKE2b-256 | 48ecc06a750e53647794db8a48f51fc974afc80e8a3a35420b9fe8cc2359f5fc |
File details
Details for the file winnerz-1.0.5-cp39-cp39-win_amd64.whl.
File metadata
- Download URL: winnerz-1.0.5-cp39-cp39-win_amd64.whl
- Upload date:
- Size: 14.3 kB
- Tags: CPython 3.9, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f55ade03e9785a0eb93ddf94be29f8b707f17843716e458b7e6777979653438a |
| MD5 | 094b9c4713969ea87aa557f34cecb709 |
| BLAKE2b-256 | 6647ee5441881eebfce1c9ad398e0ca03e945315d4016bc7bff6bc67f879e31c |
File details
Details for the file winnerz-1.0.5-cp39-cp39-manylinux_2_28_x86_64.whl.
File metadata
- Download URL: winnerz-1.0.5-cp39-cp39-manylinux_2_28_x86_64.whl
- Upload date:
- Size: 4.4 MB
- Tags: CPython 3.9, manylinux: glibc 2.28+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | be690e810cbc026d532915c8d622cb0af244fdbea9251b0885b202bb3bd898f5 |
| MD5 | 4621bda49a1b14a30268be030025ad7d |
| BLAKE2b-256 | e9a47fc474510d6f2dba5550a32bde4c39cec1cee537d71c8fee3c82284d6bd0 |
File details
Details for the file winnerz-1.0.5-cp39-cp39-macosx_11_0_arm64.whl.
File metadata
- Download URL: winnerz-1.0.5-cp39-cp39-macosx_11_0_arm64.whl
- Upload date:
- Size: 1.3 MB
- Tags: CPython 3.9, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 59bb70d8546405e3d447a90a49b592c54e0135b5bf685175806dab6992b0065f |
| MD5 | 3caaa44697b5abe73cf8b51e4dbb002c |
| BLAKE2b-256 | a8a0bb888eec5e5e25aeaec8f1724588482d80835bb8a1c71a68c478488f38b8 |
File details
Details for the file winnerz-1.0.5-cp39-cp39-macosx_10_9_x86_64.whl.
File metadata
- Download URL: winnerz-1.0.5-cp39-cp39-macosx_10_9_x86_64.whl
- Upload date:
- Size: 1.2 MB
- Tags: CPython 3.9, macOS 10.9+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | afb2af269485c369d42b376ffb4de81e6aeb1580b0011427fa1781f3e5e783a9 |
| MD5 | 065d899c7bf5f839b642e3b46fd23973 |
| BLAKE2b-256 | 70a2b1074e603e50f767fc3b6d50620b53a07c08393e69df69573e575dee597d |