End-to-end document widget detection pipeline using YOLO11 on CommonForms dataset

Widget Detector

End-to-end detection of form widgets in documents using YOLO11m. The fine-tuned weights are downloaded automatically from Hugging Face.

Detects three classes of form fields in scanned PDFs and document images:

Class ID   Name            Description
0          text_input      Text boxes, input lines
1          choice_button   Checkboxes and radio buttons
2          signature       Signature fields
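
If you need the class IDs in your own code, the table above can be mirrored as a small enum. This is only a sketch: the names WidgetClass and class_name_for are hypothetical and are not part of the package's API.

```python
from enum import IntEnum


class WidgetClass(IntEnum):
    """Hypothetical mirror of the class table; not exported by the package."""
    TEXT_INPUT = 0
    CHOICE_BUTTON = 1
    SIGNATURE = 2


def class_name_for(class_id: int) -> str:
    """Map a numeric class ID to the lowercase name used in the table."""
    return WidgetClass(class_id).name.lower()


if __name__ == "__main__":
    for member in WidgetClass:
        print(int(member), class_name_for(member))
```

An unknown ID raises ValueError, which makes unexpected model output visible early.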

Installation

You can install the package directly from PyPI:

pip install psynx-widget-detector

Requires Python 3.11+


Quickstart

The package will automatically download the fine-tuned YOLO11m weights from Hugging Face (PSynx/widget-detector-yolo) the first time you run it.

from widget_detector import WidgetDetector

# 1. Initialize the detector (downloads weights automatically if not found)
detector = WidgetDetector()

# 2. Run inference on a PDF (auto-renders pages to images)
result = detector.detect_path("sample_form.pdf")

# 3. Print the results
print(f"Detected {result.total_widgets} widgets across {result.total_pages} pages.")

for page in result.pages:
    print(f"\nPage {page.page}:")
    for widget in page.widgets:
        print(f" - {widget.class_name} ({widget.confidence:.2f}) at "
              f"[{widget.bbox.x1:.1f}, {widget.bbox.y1:.1f}, {widget.bbox.x2:.1f}, {widget.bbox.y2:.1f}]")

# 4. Save results to JSON
result.save("output.json")

Output Format

The detector returns a structured Pydantic object that cleanly serializes to JSON:

{
  "source": "form.pdf",
  "total_pages": 3,
  "total_widgets": 24,
  "pages": [
    {
      "source": "form.pdf",
      "page": 1,
      "image_width": 1654,
      "image_height": 2339,
      "processing_time_ms": 142.3,
      "widgets": [
        {
          "class_id": 0,
          "class_name": "text_input",
          "confidence": 0.913,
          "bbox": {
            "x1": 120.0, "y1": 340.0, "x2": 480.0, "y2": 380.0,
            "x1_norm": 0.073, "y1_norm": 0.145,
            "x2_norm": 0.290, "y2_norm": 0.163
          },
          "page": 1
        }
      ]
    }
  ]
}
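
Because the output is plain JSON, it can be post-processed without the package installed. The sketch below uses only the standard library and a trimmed copy of the sample above (one page, one widget); the helper names count_by_class and bbox_size are hypothetical.

```python
import json
from collections import Counter

# Trimmed copy of the sample output: one page with one widget.
SAMPLE = """
{
  "source": "form.pdf",
  "pages": [
    {
      "page": 1,
      "image_width": 1654,
      "image_height": 2339,
      "widgets": [
        {
          "class_id": 0,
          "class_name": "text_input",
          "confidence": 0.913,
          "bbox": {"x1": 120.0, "y1": 340.0, "x2": 480.0, "y2": 380.0}
        }
      ]
    }
  ]
}
"""


def count_by_class(doc: dict) -> Counter:
    """Tally widgets per class_name across all pages."""
    counts = Counter()
    for page in doc["pages"]:
        for widget in page["widgets"]:
            counts[widget["class_name"]] += 1
    return counts


def bbox_size(bbox: dict) -> tuple[float, float]:
    """Return (width, height) in pixels from the corner coordinates."""
    return bbox["x2"] - bbox["x1"], bbox["y2"] - bbox["y1"]


doc = json.loads(SAMPLE)
counts = count_by_class(doc)
width, height = bbox_size(doc["pages"][0]["widgets"][0]["bbox"])
print(dict(counts), width, height)
```

To work on a real run, replace json.loads(SAMPLE) with json.load on the file written by result.save("output.json").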

Notes

  • The CommonForms choice_button class covers both checkboxes and radio buttons (the dataset does not distinguish them).
  • Inference speed: if a CUDA-enabled GPU is available, WidgetDetector uses it automatically for faster inference.
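
How WidgetDetector chooses its device is not documented here. As a rough sketch, assuming the package runs on PyTorch, you can check beforehand whether a CUDA device is visible to your environment. The helper name pick_device is hypothetical and not part of the package.

```python
def pick_device() -> str:
    """Return "cuda" if PyTorch can see a CUDA GPU, otherwise "cpu".

    Assumption: the detector runs on PyTorch. If PyTorch is not
    installed, this falls back to reporting "cpu".
    """
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Inference would run on: {pick_device()}")
```

If this prints "cpu" on a machine with a GPU, the usual cause is a CPU-only PyTorch build or a driver mismatch.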

Download files

Source distribution: psynx_widget_detector-0.1.2.tar.gz (13.5 kB)

Built distribution: psynx_widget_detector-0.1.2-py3-none-any.whl (16.6 kB)

File details

Details for the file psynx_widget_detector-0.1.2.tar.gz.

File metadata

  • Size: 13.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for psynx_widget_detector-0.1.2.tar.gz
Algorithm     Hash digest
SHA256        20a8320107dfebf37adb5539e07079e4900777664ea022b50624660c9862d016
MD5           482af385165bbdf218923b491c47e273
BLAKE2b-256   3577d1a4a23e61e2eae259945fd16db810810073a93a1aa7498b89df44ad61c2
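
A downloaded archive can be checked against the SHA256 digest listed above with the standard library. This is a minimal sketch: the function names sha256_of and matches_published are hypothetical, and the local file path in the usage line is an assumption about where you saved the download.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for psynx_widget_detector-0.1.2.tar.gz.
PUBLISHED_SHA256 = "20a8320107dfebf37adb5539e07079e4900777664ea022b50624660c9862d016"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: str, expected_hex: str) -> bool:
    """Compare the file's digest to a published hex digest, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the archive was saved.
    archive = "psynx_widget_detector-0.1.2.tar.gz"
    if Path(archive).exists():
        print("match" if matches_published(archive, PUBLISHED_SHA256) else "MISMATCH")
```

pip can perform the same check at install time when a requirements file pins the hash and --require-hashes is used.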

File details

Details for the file psynx_widget_detector-0.1.2-py3-none-any.whl.

File hashes

Hashes for psynx_widget_detector-0.1.2-py3-none-any.whl
Algorithm     Hash digest
SHA256        4cfbd8e3c70bcb5a25818e930ac777624ff0bc946d7b63f34de048cf59a47767
MD5           fd624473fa56af5e2fd4f90ad7188678
BLAKE2b-256   913a188b816430916ca8e5ba8a28bf5fff91c64a72d9f48f526dfbcc1e26bf0b
