End-to-end document widget detection pipeline using YOLO11 on the CommonForms dataset
Widget Detector
End-to-end document form widget detection using YOLO11m, automatically downloading fine-tuned weights from Hugging Face.
Detects 3 classes of form fields from scanned PDFs and document images:
| Class ID | Name | Description |
|---|---|---|
| 0 | text_input | Text boxes, input lines |
| 1 | choice_button | Checkboxes + radio buttons |
| 2 | signature | Signature fields |
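As a minimal sketch of how downstream code might refer to these classes, the table can be mirrored in an enum. `WidgetClass` and `class_name` are hypothetical names used for illustration only; they are not part of the package's documented API.

```python
from enum import IntEnum


class WidgetClass(IntEnum):
    """Hypothetical enum mirroring the class table above."""

    TEXT_INPUT = 0
    CHOICE_BUTTON = 1
    SIGNATURE = 2


def class_name(class_id: int) -> str:
    """Return the lowercase class name from the table for a class ID."""
    return WidgetClass(class_id).name.lower()


print(class_name(1))  # choice_button
```

An unknown ID raises `ValueError`, which surfaces mismatches between the model output and the table early.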
Installation
You can install the package directly from PyPI:
pip install psynx-widget-detector
Requires Python 3.11+
Quickstart
The package will automatically download the fine-tuned YOLO11m weights from Hugging Face (PSynx/widget-detector-yolo) the first time you run it.
from widget_detector import WidgetDetector

# 1. Initialize the detector (downloads weights automatically if not found)
detector = WidgetDetector()

# 2. Run inference on a PDF (auto-renders pages to images)
result = detector.detect_path("sample_form.pdf")

# 3. Print the results
print(f"Detected {result.total_widgets} widgets across {result.total_pages} pages.")
for page in result.pages:
    print(f"\nPage {page.page}:")
    for widget in page.widgets:
        print(f"  - {widget.class_name} ({widget.confidence:.2f}) at "
              f"[{widget.bbox.x1:.1f}, {widget.bbox.y1:.1f}, {widget.bbox.x2:.1f}, {widget.bbox.y2:.1f}]")

# 4. Save results to JSON
result.save("output.json")
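Once results are saved, a common follow-up is to keep only confident detections. The sketch below works on plain dictionaries in the JSON layout described under Output Format, so it does not need the detector itself. `filter_widgets`, the 0.5 threshold, and the inline sample values are illustrative assumptions, not part of the package.

```python
import json
from collections import Counter


def filter_widgets(result: dict, min_confidence: float = 0.5) -> list[dict]:
    """Collect widgets at or above min_confidence from every page."""
    return [
        widget
        for page in result["pages"]
        for widget in page["widgets"]
        if widget["confidence"] >= min_confidence
    ]


# Made-up sample in the documented shape. In practice the dictionary would
# come from json.load() on the file written by result.save("output.json").
sample = json.loads("""
{
  "pages": [
    {"page": 1, "widgets": [
      {"class_name": "text_input", "confidence": 0.913},
      {"class_name": "signature", "confidence": 0.31}
    ]},
    {"page": 2, "widgets": [
      {"class_name": "choice_button", "confidence": 0.77}
    ]}
  ]
}
""")

kept = filter_widgets(sample, min_confidence=0.5)
counts = Counter(widget["class_name"] for widget in kept)
print(dict(counts))
```

Working on the serialized form keeps post-processing independent of the model and its weights.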
Output Format
The detector returns a structured Pydantic object that cleanly serializes to JSON:
{
  "source": "form.pdf",
  "total_pages": 3,
  "total_widgets": 24,
  "pages": [
    {
      "source": "form.pdf",
      "page": 1,
      "image_width": 1654,
      "image_height": 2339,
      "processing_time_ms": 142.3,
      "widgets": [
        {
          "class_id": 0,
          "class_name": "text_input",
          "confidence": 0.913,
          "bbox": {
            "x1": 120.0, "y1": 340.0, "x2": 480.0, "y2": 380.0,
            "x1_norm": 0.073, "y1_norm": 0.145,
            "x2_norm": 0.290, "y2_norm": 0.163
          },
          "page": 1
        }
      ]
    }
  ]
}
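The `*_norm` fields appear to be the pixel coordinates divided by the page image size; the sample values above agree with that reading to about three decimals. Under that assumption, this sketch shows how normalized boxes could be computed and mapped onto a page rendered at a different resolution. `normalize_bbox` and `scale_bbox` are hypothetical helpers, not functions provided by the package.

```python
def normalize_bbox(x1, y1, x2, y2, image_width, image_height):
    """Divide pixel coordinates by the image size (assumed convention)."""
    return {
        "x1_norm": x1 / image_width,
        "y1_norm": y1 / image_height,
        "x2_norm": x2 / image_width,
        "y2_norm": y2 / image_height,
    }


def scale_bbox(norm, target_width, target_height):
    """Map normalized coordinates onto an image of another size."""
    return (
        norm["x1_norm"] * target_width,
        norm["y1_norm"] * target_height,
        norm["x2_norm"] * target_width,
        norm["y2_norm"] * target_height,
    )


# Pixel box and page size taken from the sample output above.
norm = normalize_bbox(120.0, 340.0, 480.0, 380.0, 1654, 2339)
print({key: round(value, 3) for key, value in norm.items()})

# The same box on a page rendered at half the width.
print(scale_bbox(norm, 827, 1170))
```

Normalized coordinates are convenient when the PDF is later re-rendered at a different DPI, since the pixel values only hold for the original render.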
Notes
- CommonForms: `choice_button` includes both checkboxes and radio buttons as one class (the dataset does not distinguish them).
- Inference Speed: If you have a CUDA-enabled GPU, the `WidgetDetector` will automatically use it for accelerated inference.
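To check ahead of time whether a GPU is likely to be used, a typical PyTorch availability test looks like the sketch below. How `WidgetDetector` selects its device internally is not documented here, so this is an assumption about a common pattern rather than the package's actual logic; `pick_device` is a hypothetical helper.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # optional here; the sketch falls back to CPU without it
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```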
File details
Details for the file psynx_widget_detector-0.1.2.tar.gz.
File metadata
- Download URL: psynx_widget_detector-0.1.2.tar.gz
- Upload date:
- Size: 13.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 20a8320107dfebf37adb5539e07079e4900777664ea022b50624660c9862d016 |
| MD5 | 482af385165bbdf218923b491c47e273 |
| BLAKE2b-256 | 3577d1a4a23e61e2eae259945fd16db810810073a93a1aa7498b89df44ad61c2 |
File details
Details for the file psynx_widget_detector-0.1.2-py3-none-any.whl.
File metadata
- Download URL: psynx_widget_detector-0.1.2-py3-none-any.whl
- Upload date:
- Size: 16.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4cfbd8e3c70bcb5a25818e930ac777624ff0bc946d7b63f34de048cf59a47767 |
| MD5 | fd624473fa56af5e2fd4f90ad7188678 |
| BLAKE2b-256 | 913a188b816430916ca8e5ba8a28bf5fff91c64a72d9f48f526dfbcc1e26bf0b |
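A downloaded file can be checked against the SHA256 digests listed in the file details. This is a minimal sketch using the standard library; `sha256_of` and `matches` are hypothetical helper names, and the local file path is whatever you saved the download as.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large downloads are not read at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Example call, assuming the sdist was saved in the current directory:
# matches(
#     Path("psynx_widget_detector-0.1.2.tar.gz"),
#     "20a8320107dfebf37adb5539e07079e4900777664ea022b50624660c9862d016",
# )
```

pip can perform the same comparison at install time when a requirements file pins hashes and `--require-hashes` is used.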