fecfile

A Python parser for the .fec file format

This is a library for converting campaign finance filings stored in the .fec format into native Python objects. It maps the comma- or ASCII 28-delimited fields to canonical names based on the version the filing uses, then converts the values that are dates and numbers into the appropriate int, float, or datetime objects.
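To make that description concrete, here is a minimal sketch of the split, name, and convert steps. It is not the library's implementation: the field names, the converters, the YYYYMMDD date format, and the sample values are all assumptions made for illustration, and the real mappings depend on the form type and format version.

```python
from datetime import datetime

FIELD_SEPARATOR = chr(28)  # ASCII 28, the "file separator" control character

# Hypothetical field names and converters, for illustration only.
FIELD_NAMES = [
    "form_type",
    "filer_committee_id_number",
    "contribution_date",
    "contribution_amount",
]
CONVERTERS = {
    "contribution_date": lambda s: datetime.strptime(s, "%Y%m%d"),  # assumed format
    "contribution_amount": float,
}


def parse_fields(line):
    """Split one delimited line, name each field, and convert known types."""
    values = line.rstrip("\r\n").split(FIELD_SEPARATOR)
    record = {}
    for name, raw in zip(FIELD_NAMES, values):
        convert = CONVERTERS.get(name)
        # Empty values and fields without a converter stay as plain strings.
        record[name] = convert(raw) if convert and raw else raw
    return record


# A made-up line, not taken from a real filing.
sample = FIELD_SEPARATOR.join(["SA11AI", "C00000000", "20180930", "250.00"])
record = parse_fields(sample)
```
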

This library is in relatively early testing. I've used it on a couple of projects, but I wouldn't trust it to work on all filings. That said, if you do try using it, I'd love to hear about it!

Why?

The FEC makes a ton of data available via the "export" links on the main site and the developer API. When those data sources are sufficient, they are almost certainly the easiest way to go. A few cases where you might need to dig into raw filings are:

Getting information from individual itemizations including addresses
Getting data as soon as it has been filed, instead of waiting for it to be coded. (The FEC generally codes all filings received by 7pm Eastern by 7am the next day. However, that means that a filing received at 11:59pm on Monday wouldn't be available until 7am on Wednesday, for example.)
Getting more data than the rate limit on the developer API would allow
Maintaining one's own database with all relevant campaign finance data, perhaps synced with another data source
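The timing in the parenthetical above amounts to a small date calculation. A rough sketch, assuming a naive datetime that is already in US Eastern time and ignoring weekends and holidays (coded_by is a hypothetical helper, not part of fecfile, and the FEC's actual schedule may differ):

```python
from datetime import datetime, time, timedelta

CUTOFF = time(19, 0)    # filings received by 7pm Eastern...
AVAILABLE = time(7, 0)  # ...are generally coded by 7am the next day


def coded_by(received):
    """Estimate when a filing received at `received` would be coded."""
    # Anything after the cutoff slips to the following day's batch.
    days = 1 if received.time() <= CUTOFF else 2
    return datetime.combine(received.date() + timedelta(days=days), AVAILABLE)


# October 1, 2018 was a Monday.
late_monday = coded_by(datetime(2018, 10, 1, 23, 59))
early_monday = coded_by(datetime(2018, 10, 1, 18, 0))
```
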

Raw filings can be obtained either by downloading the bulk data zip files or by requesting individual filings over HTTP. This library includes helper methods for both.

Installation

To get started, install from PyPI by running the following command in your preferred terminal:


pip install fecfile

Usage

For the vast majority of filings, the easiest way to use this library will be to load filings all at once by using the from_http(file_number), from_file(file_path), or loads(input) methods.

These methods will return a Python dictionary, with keys for header, filing, itemizations, and text. The itemizations dictionary contains lists of itemizations grouped by type (Schedule A, Schedule B, etc.).
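As a sketch of how that structure might be navigated, the snippet below counts itemizations per type. The dictionary here is hand-built for illustration: the top-level keys follow the description above, but the exact schedule key strings, the nested field names, and the shape of text are assumptions rather than the library's guaranteed output.

```python
# A hand-built stand-in shaped like the dictionary described above.
parsed = {
    "header": {"fec_version": "8.2"},  # hypothetical field
    "filing": {"form_type": "F3X"},    # hypothetical field
    "itemizations": {
        "Schedule A": [{"contribution_amount": 250.0}, {"contribution_amount": 100.0}],
        "Schedule B": [{"expenditure_amount": 75.5}],
    },
    "text": [],
}


def count_itemizations(parsed):
    """Return the number of itemizations for each itemization type."""
    return {kind: len(items) for kind, items in parsed["itemizations"].items()}


counts = count_itemizations(parsed)
```
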

Examples:


import json

import requests

import fecfile

# Parse a filing that is already on disk.
filing1 = fecfile.from_file('1229017.fec')
print(json.dumps(filing1, sort_keys=True, indent=2, default=str))

# Download and parse a filing by its file number.
filing2 = fecfile.from_http(1146148)
print(json.dumps(filing2, sort_keys=True, indent=2, default=str))

# Parse the contents of a filing you have read yourself.
with open('1229017.fec') as file:
    parsed = fecfile.loads(file.read())
    print(json.dumps(parsed, sort_keys=True, indent=2, default=str))

# Fetch the raw filing with requests, then parse the response text.
url = 'http://docquery.fec.gov/dcdev/posted/1229017.fec'
r = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
parsed = fecfile.loads(r.text)
print(json.dumps(parsed, sort_keys=True, indent=2, default=str))

Note #1: the default=str parameter lets json.dumps serialize dictionaries that contain datetime objects, like the ones returned by the fecfile library, by falling back to str() for any value it cannot encode natively.
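A small standalone illustration of Note #1, using a made-up record rather than real fecfile output (the field names are placeholders):

```python
import json
from datetime import datetime

# A made-up record containing a datetime, similar in spirit to parsed filings.
record = {"coverage_through_date": datetime(2018, 9, 30), "amount": 250.0}

# Without default=str, json.dumps refuses to encode the datetime.
try:
    json.dumps(record)
    plain_ok = True
except TypeError:
    plain_ok = False

# With default=str, the datetime is written as its str() representation.
as_json = json.dumps(record, sort_keys=True, default=str)
```
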

Note #2: the docquery.fec.gov urls cause problems with the requests library when a user-agent is not supplied. There may be a cleaner fix to that though.

Advanced Usage

For some large filings, loading the entire filing into memory like the above examples do would not be a good idea. For those cases, the fecfile library provides methods for parsing filings one line at a time.


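import fecfile


def save_to_db(parsed):
    # Placeholder: replace with your own persistence logic.
    print(parsed)


version = None

with open('1263179.fec') as file:
    for line in file:
        if version is None:
            # The first line is the header, which tells us the format version.
            # parse_header also returns the header's line count (see the API Reference); it is unused here.
            header, version, _ = fecfile.parse_header(line)
        else:
            parsed = fecfile.parse_line(line, version)
            save_to_db(parsed)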

API Reference

loads


loads(input)

Deserialize input (a str instance containing an FEC document) to a Python object.

parse_header


parse_header(hdr)

Deserialize a str or a list of str instances containing header information for an FEC document. Returns a Python object, the version str used in the document, and the number of lines used by the header.

The third return value, the number of lines used by the header, is only useful for versions 1 and 2 of the FEC file format, when the header was a multiline string beginning and ending with /*. It lets us pass in the entire contents of the file as a list of lines and know where to start parsing the non-header lines.

parse_line


parse_line(line, version, line_num=None)

Deserialize a line (a str instance containing a line from an FEC document) to a Python object.

version is a str instance for the version of the FEC file format to be used, and is required.

line_num is optional and is used for debugging. If an error or warning is encountered, whatever is passed in to line_num will be included in the error/warning message.
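One way the two functions might be combined for a file held as a list of lines is sketched below, assuming parse_header returns the three values described above. The helper parse_lines is hypothetical and not part of fecfile; it takes the two parsing functions as arguments so the sketch stays independent of any import.

```python
def parse_lines(lines, parse_header, parse_line):
    """Yield one parsed object per non-header line.

    `parse_header` and `parse_line` are passed in; in practice they would be
    fecfile.parse_header and fecfile.parse_line.
    """
    header, version, header_length = parse_header(lines)
    # Skip the header lines and pass a 1-based line number, so that any
    # error or warning message points at the right place in the file.
    for offset, line in enumerate(lines[header_length:]):
        yield parse_line(line, version, line_num=header_length + offset + 1)
```

With the real library this could be called as parse_lines(lines, fecfile.parse_header, fecfile.parse_line), where lines comes from file.readlines(). Because it is a generator, parsed lines can be handled one at a time instead of being collected in a single large object.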

from_http


from_http(file_number)

Utility method for getting a parsed Python representation of an FEC filing when you don't already have it on your computer. This method takes either a str or int as a file_number and requests it from the docquery.fec.gov server, then parses the response.
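For readers who want to fetch the raw text themselves, the sketch below builds a request from a file number. It assumes the URL pattern shown in the Usage examples and sets a User-Agent, as Note #2 suggests; filing_url and build_request are hypothetical helpers, and this is not necessarily how from_http works internally.

```python
import urllib.request


def filing_url(file_number):
    """Build a docquery URL from a str or int file number (assumed pattern)."""
    return "http://docquery.fec.gov/dcdev/posted/{}.fec".format(file_number)


def build_request(file_number):
    """Prepare a request with a User-Agent header; nothing is sent yet."""
    return urllib.request.Request(
        filing_url(file_number),
        headers={"User-Agent": "Mozilla/5.0"},
    )


request = build_request(1229017)
```

Passing the prepared request to urllib.request.urlopen would perform the download; the response body could then be decoded and handed to fecfile.loads.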

from_file


from_file(file_path)

Utility method for getting a parsed Python representation of an FEC filing when you have the .fec file on your computer. This method takes a str of the path to the file, and returns the parsed Python object.

print_example


print_example(parsed)

Utility method for debugging - prints out a representative subset of the Python object returned by one of the deserialization methods. For filings with itemizations, it only prints the first of each type of itemization included in the object.
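The idea of a "representative subset" can be sketched as follows. This is an illustration only, not the library's code: first_of_each is a hypothetical helper, and the sample dictionary and its keys are made up.

```python
def first_of_each(parsed):
    """Copy `parsed`, keeping only the first itemization of each type."""
    subset = {key: value for key, value in parsed.items() if key != "itemizations"}
    subset["itemizations"] = {
        kind: items[:1] for kind, items in parsed.get("itemizations", {}).items()
    }
    return subset


# Made-up data in the shape described in the Usage section.
example = {
    "header": {},
    "filing": {},
    "itemizations": {"Schedule A": [{"n": 1}, {"n": 2}], "Schedule B": [{"n": 3}]},
    "text": [],
}
subset = first_of_each(example)
```
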

Developing locally

Assuming you already have Python 3 and the ability to create virtual environments installed, first clone this repository from GitHub and cd into it:


git clone https://github.com/esonderegger/fecfile.git
cd fecfile

Then create a virtual environment for this project (I use the following commands, but there are several ways to get the desired result):


python3 -m venv ~/.virtualenvs/fecfile
source ~/.virtualenvs/fecfile/bin/activate

Next, install the dependencies:


pip install -e .

Finally, make some changes, and run:


python tests.py

Thanks

This project would be impossible without the work done by the kind folks at The New York Times Newsdev team. In particular, this project relies heavily on fech although it actually uses a transformation of this fork.

Contributing

I would love some help with this, particularly with the mapping from strings to int, float, and datetime types. Please create an issue or make a pull request. Or reach out privately via email - that works too.

To do:

Almost too much to list:

~~Handle files from before v6 when they were comma-delimited~~
create a dumps method for writing .fec files for round-trip tests
add more types to the types.json file
elegantly handle errors

Changes

0.4.0 (October 2, 2018)

Updated documentation
add paper versions for schedule F

0.3.9 (October 1, 2018)

add paper versions for H1, H2, H3, and H4

0.3.8 (September 28, 2018)

add paper versions for the F1S

0.3.7 (September 27, 2018)

add paper versions of F1M
add paper versions for F3X
add F3P paper filing mappings

0.3.6 (September 27, 2018)

add F6 paper mappings and fix missing commas

0.3.5 (September 26, 2018)

add all paper versions of form F1

0.3.4 (September 18, 2018)

expose parse_header and parse_line to consumers of this library

0.3.3 (September 18, 2018)

add version 8.3 to mappings

0.3.2 (August 29, 2018)

versions 1 and 2 of schedule H1 and H2

0.3.1 (August 29, 2018)

added more mappings
add a method to determine which mappings are missing

0.3.0 (August 27, 2018)

Rework warnings and errors for cases where mappings are missing
add mappings

0.2.3 (August 24, 2018)

fix for filings that use both quotes and the field separator

0.2.2 (August 23, 2018)

add support for F13, F132, and F133

0.2.1 (August 21, 2018)

Fix regression that broke paper filings

0.2.0 (August 2, 2018)

Add parsing for versions 1 and 2 of the .fec format

0.1.9 (July 18, 2018)

add parsing for senate paper filings

0.1.8 (June 26, 2018)

interest rate should never have been a float field

0.1.7 (June 26, 2018)

handle n/a in number fields

0.1.6 (June 25, 2018)

more types
update documentation
handle percent signs in interest rates

0.1.5 (June 21, 2018)

Initial published version


Release history

0.9.1 (May 3, 2025)
0.9.0 (Apr 24, 2025)
0.8.0 (Feb 25, 2023)
0.7.2 (Jul 23, 2022)
0.7.1 (Jul 16, 2022)
0.7.0 (May 17, 2022)
0.6.4 (Sep 3, 2020)
0.6.3 (Jun 3, 2019)
0.6.2 (Apr 24, 2019)
0.6.1 (Apr 11, 2019)
0.6.0 (Apr 10, 2019)
0.5.3 (Feb 12, 2019)
0.5.2 (Jan 19, 2019)
0.5.1 (Jan 18, 2019)
0.5.0 (Jan 17, 2019)
0.4.11 (Jan 12, 2019)
0.4.10 (Nov 7, 2018)
0.4.9 (Nov 6, 2018)
0.4.8 (Nov 6, 2018)
0.4.7 (Nov 2, 2018)
0.4.6 (Oct 29, 2018)
0.4.5 (Oct 27, 2018)
0.4.4 (Oct 17, 2018)
0.4.3 (Oct 10, 2018)
0.4.2 (Oct 9, 2018)
0.4.1 (Oct 4, 2018), this version
0.4.0 (Oct 2, 2018)
0.3.9 (Oct 1, 2018)
0.3.8 (Sep 28, 2018)
0.3.7 (Sep 27, 2018)
0.3.6 (Sep 27, 2018)
0.3.5 (Sep 26, 2018)
0.3.4 (Sep 18, 2018)
0.3.3 (Sep 18, 2018)
0.3.2 (Aug 29, 2018)
0.3.1 (Aug 29, 2018)
0.3.0 (Aug 27, 2018)
0.2.3 (Aug 24, 2018)
0.2.2 (Aug 23, 2018)
0.2.1 (Aug 21, 2018)
0.2.0 (Aug 2, 2018)
0.1.9 (Jul 18, 2018)
0.1.8 (Jun 26, 2018)
0.1.7 (Jun 26, 2018)
0.1.6 (Jun 25, 2018)
0.1.5 (Jun 21, 2018)

