CommonForms
🪄 Automatically convert a PDF into a fillable form.
💻 Hosted Models (detect.semanticdocs.org) | 📄 CommonForms Paper | 🤗 Dataset | 🦾 Models
This repo contains three things:
- the pip-installable commonforms package, which has a CLI and API for converting PDFs into fillable forms
- the FFDNet-S and FFDNet-L models from the paper CommonForms: A Large, Diverse Dataset for Form Field Detection
- the preprocessing code for the CommonForms dataset, which is hosted on HuggingFace: https://huggingface.co/datasets/jbarrow/CommonForms
Installation
CommonForms can be installed with either uv or pip; use whichever package manager you prefer:
uv pip install commonforms
Once it's installed, you should be able to run the CLI on almost any PDF.
CommonForms CLI
The simplest usage will run inference on your CPU using the default suggested settings:
commonforms <input.pdf> <output.pdf>
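To apply the same command to a whole folder, the CLI can be driven from a short script. The sketch below uses only the positional form shown above; the helper names, the `_fillable` suffix, and the folder layout are illustrative assumptions, not part of the package.

```python
import subprocess
from pathlib import Path


def build_command(input_pdf: Path, output_pdf: Path) -> list[str]:
    """Return the argv list for one `commonforms <input.pdf> <output.pdf>` call."""
    return ["commonforms", str(input_pdf), str(output_pdf)]


def convert_folder(src_dir: Path, dst_dir: Path) -> list[Path]:
    """Run the CLI on every PDF in src_dir and write the results to dst_dir."""
    dst_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for input_pdf in sorted(src_dir.glob("*.pdf")):
        # The "_fillable" suffix is an arbitrary naming choice for this sketch.
        output_pdf = dst_dir / f"{input_pdf.stem}_fillable.pdf"
        # check=True raises CalledProcessError if the CLI exits with an error.
        subprocess.run(build_command(input_pdf, output_pdf), check=True)
        written.append(output_pdf)
    return written
```

Calling `convert_folder(Path("forms"), Path("fillable"))` would process the PDFs one at a time and stop at the first failure; both folder names are placeholders.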
Command Line Arguments
| Argument | Type | Default | Description |
|---|---|---|---|
| input | Path | Required | Path to the input PDF file |
| output | Path | Required | Path to save the output PDF file |
| --model | str | FFDNet-L | Model name (FFDNet-L/FFDNet-S) or path to custom .pt file |
| --keep-existing-fields | flag | False | Keep existing form fields in the PDF |
| --use-signature-fields | flag | False | Use signature fields instead of text fields for detected signatures |
| --device | str | cpu | Device for inference (e.g., cpu, cuda, 0) |
| --image-size | int | 1600 | Image size for inference |
| --confidence | float | 0.3 | Confidence threshold for detection |
| --fast | flag | False | If running on a CPU, you can trade off accuracy for speed and run in about half the time |
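To make the types and defaults concrete, here is a stand-in argparse parser that mirrors the table row by row. It is not the package's actual CLI source, and `build_parser` is a hypothetical name; it only shows how each row would behave if parsed this way.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose arguments mirror the documented CLI table."""
    parser = argparse.ArgumentParser(prog="commonforms")
    parser.add_argument("input", type=Path, help="Path to the input PDF file")
    parser.add_argument("output", type=Path, help="Path to save the output PDF file")
    parser.add_argument("--model", type=str, default="FFDNet-L",
                        help="Model name (FFDNet-L/FFDNet-S) or path to custom .pt file")
    # "flag" rows are booleans that default to False and flip to True when present.
    parser.add_argument("--keep-existing-fields", action="store_true",
                        help="Keep existing form fields in the PDF")
    parser.add_argument("--use-signature-fields", action="store_true",
                        help="Use signature fields for detected signatures")
    parser.add_argument("--device", type=str, default="cpu",
                        help="Device for inference (e.g., cpu, cuda, 0)")
    parser.add_argument("--image-size", type=int, default=1600,
                        help="Image size for inference")
    parser.add_argument("--confidence", type=float, default=0.3,
                        help="Confidence threshold for detection")
    parser.add_argument("--fast", action="store_true",
                        help="Trade accuracy for speed on CPU")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["in.pdf", "out.pdf", "--fast"])
    print(args)
```

With only the two required paths given, every option falls back to the default listed in the table.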
CommonForms API
In addition to the CLI, you can use the Python API:
from commonforms import prepare_form
prepare_form(
"path/to/input.pdf",
"path/to/output.pdf"
)
All of the CLI options listed under Command Line Arguments are also accepted as keyword arguments by the prepare_form function.
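The text does not spell out the keyword names, so the sketch below assumes they are the snake_case form of each CLI flag (for example `--image-size` becoming `image_size`). That mapping is an assumption, and the helper names are hypothetical; `unsupported_kwargs` checks the guess against the real signature before anything is called.

```python
import inspect


def flag_to_kwarg(flag: str) -> str:
    """Turn a CLI flag such as '--image-size' into 'image_size' (assumed naming)."""
    return flag.lstrip("-").replace("-", "_")


def build_kwargs(options: dict) -> dict:
    """Map {'--flag': value} pairs to {'keyword': value} pairs."""
    return {flag_to_kwarg(flag): value for flag, value in options.items()}


def unsupported_kwargs(func, kwargs: dict) -> list:
    """Return the keys in kwargs that func's signature does not accept."""
    params = inspect.signature(func).parameters
    # A **kwargs parameter accepts any name, so nothing can be ruled out.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return [name for name in kwargs if name not in params]


def convert(input_pdf: str, output_pdf: str, options: dict) -> None:
    """Call prepare_form with options given as CLI-style flags."""
    # Imported lazily so the helpers above work without commonforms installed.
    from commonforms import prepare_form

    kwargs = build_kwargs(options)
    unknown = unsupported_kwargs(prepare_form, kwargs)
    if unknown:
        # The snake_case guess was wrong for these names; fail before inference.
        raise TypeError(f"prepare_form does not accept: {unknown}")
    prepare_form(input_pdf, output_pdf, **kwargs)
```

A call such as `convert("path/to/input.pdf", "path/to/output.pdf", {"--device": "cpu", "--fast": True})` either runs with the translated keywords or fails early with the names that did not match.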
Dataset Prep
🚧 Code for dataset prep exists in the dataset folder.
Citation
If you use the tool, models, or code in an academic paper, please cite the CommonForms paper:
@misc{barrow2025commonforms,
title = {CommonForms: A Large, Diverse Dataset for Form Field Detection},
author = {Barrow, Joe},
year = {2025},
eprint = {2509.16506},
archivePrefix= {arXiv},
primaryClass = {cs.CV},
doi = {10.48550/arXiv.2509.16506},
url = {https://arxiv.org/abs/2509.16506}
}
If you use it in a non-academic setting, please reach out to the author (joseph.d.barrow [at] gmail.com)! I love to hear when people are using my work!
File details
Details for the file commonforms-0.1.5.tar.gz.
File metadata
- Download URL: commonforms-0.1.5.tar.gz
- Upload date:
- Size: 9.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.16
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5396f5a3b69056638ad716fd7dd478091b03bd1768524d95d78c4d88de27415e |
| MD5 | c7b01ac585986e3ed46b5366e2e6f3c0 |
| BLAKE2b-256 | 5faa8c6a213ae9df3dc90d9346a10539ba3c7ccc6630a0d213ed19abca59cd06 |
File details
Details for the file commonforms-0.1.5-py3-none-any.whl.
File metadata
- Download URL: commonforms-0.1.5-py3-none-any.whl
- Upload date:
- Size: 8.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.16
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8127d4ab1301c263a90a72819b274e0568bd2d8e6918e724a87b2b47e3ee0a43 |
| MD5 | 937e35486c57ab99fd80696e17172782 |
| BLAKE2b-256 | c1b869648ae43ae27666af1352d2adfcd123c67a7af7144dd39cfaff845f544d |
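A downloaded file can be checked against the SHA256 digests listed above. The following is a minimal sketch using Python's standard hashlib module; the function names and the local file path are placeholders.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest to an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the wheel, that would be `matches(Path("commonforms-0.1.5-py3-none-any.whl"), "8127d4ab1301c263a90a72819b274e0568bd2d8e6918e724a87b2b47e3ee0a43")`, assuming the file sits in the current directory.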