Skip to main content

Populate fillable pdf forms from csv data file

Project description

pdfforms is a small utility for populating fillable pdf forms from a spreadsheet data source. It was created with the intent of filling US tax forms using tax data prepared with a spreadsheet, but should be equally applicable to other forms.

Features

  • Assigns numeric id for each field

  • Generates test pdf showing ids of text fields

  • Merges spreadsheet data into final filled pdf

  • Works with multiple spreadsheet formats

  • Can process multiple pdfs at a time

  • Can be used as a library or CLI

  • Optional rounding and number formatting

Requirements

pdfforms requires Python 3.5 or higher, pyexcel for data loading, and pdftk, which does all the real work.

Installation

To install: pip install pdfforms

Documentation

For complete documentation, see https://pdfforms.readthedocs.io/

Example

Let’s say you have a spreadsheet with your tax calculations. You want to populate your tax forms with the data from the spreadsheet. pdfforms allows you to do so with the following steps:

  1. First pdfforms must inspect the forms to be filled. pdfforms will extract a list of fields in each of the specified documents. Each field is assigned a numeric id, and test documents are generated with filled forms, showing the id of each text field:

    $ pdfforms inspect f1040*.pdf
    f1040sse.pdf
    f1040sce.pdf
    f1040.pdf

    The filled test pdfs are stored by default in the test/ subdirectory.

  2. Browse the test pdf files and add the field numbers of the fields you need to fill to your spreadsheet. pdfforms only reads the first and third columns of the datafile. The first column should contain the name of the pdf file with the form to fill and the field numbers. The third column should contain the data to be written into the field. The rest of the sheet is ignored, so you can use it for notes, calculations, etc.

    pdfforms is case sensitive! The file name in the spreadsheet must match exactly the name of the pdf to be filled.

    Below is an example spreadsheet for a (fictional) 2016 tax return.

    f1040.pdf

    Form 1040

    2016

    3

    First Name and initial

    John Q

    4

    Last Name

    Public

    5

    SSN

    321546789

    321-54-6789

    6

    Spouse’s Name

    Susie

    7

    Spouse’s Last Name

    Public

    8

    SSN

    132458697

    132-45-8697

    9

    Address

    5776 Winding Ln

    11

    Springfield, MA

    18

    Filing status

    MJ

    24

    Exemption - self

    1

    25

    Exemption - spouse

    1

    27

    Dependent name

    Timothy Public

    28

    Dependent ssn

    531248680

    531-24-8680

    29

    Dependent relationship

    Son

    30

    Dependent under 17

    1

    31

    Dependent name

    Abigail Public

    32

    Dependent ssn

    428775031

    428-77-5031

    33

    Dependent relationship

    Daughter

    34

    Dependent under 17

    1

    45

    Line 6a

    2

    46

    Line 6c

    2

    49

    Line 6d

    4

    50

    Line 7

    60,000

    salaries

    52

    Line 8a

    124

    taxable interest

    64

    Line 12

    15,000

    business income

    92

    Line 22

    75,124

    total income

    102

    Line 27

    1,060

    half SE tax

    121

    Line 36

    1,060

    123

    Line 37

    74,064

    Adjusted Gross Income

    125

    Line 38

    74,064

    133

    Line 40

    12,600

    Standard Deduction

    135

    Line 41

    61,464

    137

    Line 42

    16,200

    Exemptions

    $ 4,050

    139

    Line 43

    45,264

    Taxable income

    145

    Line 44

    4,528

    Tax

    151

    Line 47

    4,528

    161

    Line 52

    2,000

    Child Tax Credit

    171

    Line 55

    2,000

    Total Credits

    173

    Line 56

    2,528

    175

    Line 57

    2,119

    Self-employment tax

    196

    Line 63

    4,647

    Total Tax

    198

    Line 64

    8,688

    Tax withheld

    225

    Line 74

    8,688

    Total Payments

    227

    Line 75

    4,041

    Amount you overpaid

    230

    Line 76a

    4,041

    Amount you want refunded

    232

    Line 76b

    123654789

    Routing Number

    234

    Line 76c

    Savings

    Account Type

    235

    Line 76d

    135724

    Account Number

    247

    Occupation

    Salesman

    248

    Daytime phone number

    413-555-1212

    249

    Spouse’s Occupation

    Artist

    f1040sce.pdf

    Schedule C-EZ

    0

    Name

    Susie Public

    1

    SSN

    132-45-8697

    9

    Line F

    2

    No

    2

    Line A

    Artist

    Principle business or profession

    3

    Line B

    711510

    Business Code

    13

    Line 1

    22,000

    gross receipts

    15

    Line 2

    7,000

    total expenses

    17

    Line 3

    15,000

    net profit

    f1040sse.pdf

    Form SE - Section A Short Schedule SE

    0

    Name

    Susie Public

    1

    SSN

    132-45-8697

    6

    Line 2

    15,000

    8

    Line 3

    15,000

    92.35%

    10

    Line 4

    13,853

    15.30%

    12

    Line 5

    2,119

    50.00%

    14

    Line 6

    1,060

    The test pdfs do not show field numbers for checkboxes. Currently the only way to fill checkboxes is to examine the fields.json file and find the field number and allowed values of the checkbox.

  3. Once the file name and field numbers have been added to your spreadsheet, save the spreadsheet as a csv file and fill the forms:

    $ pdfforms fill mydata.csv
    f1040sse.pdf
    f1040sce.pdf
    f1040.pdf

    The final, populated pdf files are saved by default to the filled/ subdirectory.

Changelog

2.0.0

date:

15 Aug, 2021

  • Use pyexcel to load spreadsheet data, supports xlsx, ods, csv, and more

  • Add options to round values, add thousands separators

  • Split codebase up and publish an API

  • Make .pdf suffix recognition case-insensitive

  • Better handling of invalid input

  • Expanded documentation

  • General code clean-up, refactoring, linting, and reformatting

1.2.1

date:

3 July, 2020

  • Don’t crash when subcommand not supplied (Thanks @PiDelport for the PR)

1.2.0

date:

24 September, 2019

  • Added --no-flatten option to keep form fillable

  • inspect doesn’t crash if passed a pdf without fillable form

1.1.0

date:

4 July, 2018

  • Fixed handling of whitespace (Thanks @rohitkhirapate for the bug report)

  • Added python 3.4 compatibility (Thanks @oneyb for the PR)

1.0.0

date:

1 May, 2017

  • Initial release

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pdfforms-2.0.0.tar.gz (35.7 kB view details)

Uploaded Source

Built Distribution

pdfforms-2.0.0-py2.py3-none-any.whl (12.1 kB view details)

Uploaded Python 2 Python 3

File details

Details for the file pdfforms-2.0.0.tar.gz.

File metadata

  • Download URL: pdfforms-2.0.0.tar.gz
  • Upload date:
  • Size: 35.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.6.4 pkginfo/1.7.1 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.62.1 CPython/3.9.2

File hashes

Hashes for pdfforms-2.0.0.tar.gz
Algorithm Hash digest
SHA256 715d8ceb5a62d525eca569a8f667e7c1d92d0da4455f9803bed551cfa93876c9
MD5 54960238e2a902ad75694dba2edc5c83
BLAKE2b-256 cbf8816bb9a0c682be0406afc33956345950d3b7c8fe074d72d6b9f55413dc6c

See more details on using hashes here.

File details

Details for the file pdfforms-2.0.0-py2.py3-none-any.whl.

File metadata

  • Download URL: pdfforms-2.0.0-py2.py3-none-any.whl
  • Upload date:
  • Size: 12.1 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.6.4 pkginfo/1.7.1 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.62.1 CPython/3.9.2

File hashes

Hashes for pdfforms-2.0.0-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 b917ea1d493422584e719cac45309d847c3f89edc86e85eba4a59e752bbaa69e
MD5 524c88892bf896e5c22f7a3ea81ef369
BLAKE2b-256 6c627cba41a3ef1174576884e7ed69ea6f6f3238189b3b8ba69ef9a5378614be

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page