End-to-end document widget detection pipeline using YOLO11 on the CommonForms dataset

Widget Detector

End-to-end detection of form widgets in documents using YOLO11m. Fine-tuned weights are downloaded automatically from Hugging Face.

Detects three classes of form fields in scanned PDFs and document images:

| Class ID | Name          | Description                  |
|----------|---------------|------------------------------|
| 0        | text_input    | Text boxes, input lines      |
| 1        | choice_button | Checkboxes and radio buttons |
| 2        | signature     | Signature fields             |
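
When post-processing detections outside the package, it can help to keep the class table above as a small lookup. The sketch below is illustrative only: `CLASS_NAMES` and `class_name` are hypothetical names and are not part of the package API.

```python
# Class IDs and names copied from the class table above.
# CLASS_NAMES and class_name are hypothetical helpers, not package API.
CLASS_NAMES = {0: "text_input", 1: "choice_button", 2: "signature"}


def class_name(class_id: int) -> str:
    """Return the class name for a class ID, or raise on an unknown ID."""
    try:
        return CLASS_NAMES[class_id]
    except KeyError:
        raise ValueError(f"unknown class_id: {class_id}") from None


print(class_name(1))
```

Raising on an unknown ID, rather than returning a default, makes a mismatch between the weights and the expected classes visible early.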

Installation

You can install the package directly from PyPI:

pip install psynx-widget-detector

Requires Python 3.11+


Quickstart

The package will automatically download the fine-tuned YOLO11m weights from Hugging Face (PSynx/widget-detector-yolo) the first time you run it.

from widget_detector import WidgetDetector

# 1. Initialize the detector (downloads weights automatically if not found)
detector = WidgetDetector()

# 2. Run inference on a PDF (auto-renders pages to images)
result = detector.detect_path("sample_form.pdf")

# 3. Print the results
print(f"Detected {result.total_widgets} widgets across {result.total_pages} pages.")

for page in result.pages:
    print(f"\nPage {page.page}:")
    for widget in page.widgets:
        print(f" - {widget.class_name} ({widget.confidence:.2f}) at "
              f"[{widget.bbox.x1:.1f}, {widget.bbox.y1:.1f}, {widget.bbox.x2:.1f}, {widget.bbox.y2:.1f}]")

# 4. Save results to JSON
result.save("output.json")

Output Format

The detector returns a structured Pydantic object that cleanly serializes to JSON:

{
  "source": "form.pdf",
  "total_pages": 3,
  "total_widgets": 24,
  "pages": [
    {
      "source": "form.pdf",
      "page": 1,
      "image_width": 1654,
      "image_height": 2339,
      "processing_time_ms": 142.3,
      "widgets": [
        {
          "class_id": 0,
          "class_name": "text_input",
          "confidence": 0.913,
          "bbox": {
            "x1": 120.0, "y1": 340.0, "x2": 480.0, "y2": 380.0,
            "x1_norm": 0.073, "y1_norm": 0.145,
            "x2_norm": 0.290, "y2_norm": 0.162
          },
          "page": 1
        }
      ]
    }
  ]
}
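
Because the saved output is plain JSON, it can be post-processed with the standard library alone. The sketch below counts widgets per class above a confidence threshold and maps normalized box coordinates back to pixels for a given render size. It assumes the normalized fields are pixel coordinates divided by the image width and height, which the sample values suggest but the package does not state. `count_by_class`, `norm_to_pixels`, and the sample values are hypothetical and exist only for illustration.

```python
import json
from collections import Counter

# A trimmed, made-up sample in the same shape as the output shown above.
SAMPLE = json.loads("""
{
  "source": "form.pdf",
  "pages": [
    {
      "page": 1,
      "widgets": [
        {"class_name": "text_input", "confidence": 0.913},
        {"class_name": "choice_button", "confidence": 0.42}
      ]
    }
  ]
}
""")


def count_by_class(result: dict, min_confidence: float = 0.5) -> Counter:
    """Count widgets per class name, keeping only confident detections."""
    counts = Counter()
    for page in result["pages"]:
        for widget in page["widgets"]:
            if widget["confidence"] >= min_confidence:
                counts[widget["class_name"]] += 1
    return counts


def norm_to_pixels(bbox: dict, width: int, height: int) -> tuple:
    """Scale normalized corners to pixels (assumes norm = pixel / image size)."""
    return (
        bbox["x1_norm"] * width,
        bbox["y1_norm"] * height,
        bbox["x2_norm"] * width,
        bbox["y2_norm"] * height,
    )


print(count_by_class(SAMPLE))
```

To process a real run, replace `SAMPLE` with `json.load(open("output.json"))`.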

Notes

  • choice_button covers both checkboxes and radio buttons as a single class, because the CommonForms dataset does not distinguish them.
  • Inference speed: if a CUDA-enabled GPU is available, WidgetDetector uses it automatically for faster inference.
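
To see ahead of time whether GPU inference is likely to be used, the check below reports what PyTorch can see. This is a sketch under the assumption that the detector runs on PyTorch; the package's own device selection logic is not documented here. `pick_device` is a hypothetical helper, not part of the package.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable CUDA GPU, else "cpu".

    Hypothetical helper: it only mirrors the behaviour described in the
    note above and does not call into the widget detector package.
    """
    try:
        import torch  # assumed backend; falls back to CPU if not installed
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```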
