
An extremely fast FEC filing parser written in C


FastFEC

A C program to stream and parse Federal Election Commission (FEC) filings, writing output to CSV.

Installation

Download the latest release and place it on your PATH. If you have Homebrew on macOS or Linux, you can instead install via:

brew install fastfec

You can also build a binary yourself by following the development instructions below.

Usage

Once FastFEC has been installed, you can run the program by calling fastfec in your terminal:

Usage: fastfec [flags] <id or file> [output directory=output] [override id]
  • [flags]: optional flags, which must come before the other arguments; see below
  • <id or file> is either
    • a file, in which case the filing is read from disk at the specified local path
    • a numeric ID (only works with --print-url), in which case the possible URLs for the filing on the FEC docquery website are printed
  • [output directory] is the folder in which CSV files will be written. By default, it is output/.
  • [override id] is an ID to use as the filing ID. If not specified, the ID is taken from the first parameter: it is the numeric component found at the end of the path.

The CLI reads the specified filing from disk and writes one output CSV per form type into the output directory. The output files are written to:

  • {output directory}/{filing id}/{form type}.csv
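To make the path layout concrete, here is a minimal Python sketch that predicts where a CSV would land. It is not FastFEC's own code: the helper names are hypothetical, "F3" is only a placeholder form type, and the ID rule is an assumption that reads "numeric component at the end of the path" as the trailing digits of the file name with its extension dropped.

```python
import re
from pathlib import Path
from typing import Optional


def infer_filing_id(source: str) -> Optional[str]:
    """Return the trailing run of digits in the file name stem, or None.

    Assumption: this mirrors the "numeric component at the end of the
    path" rule; FastFEC's exact matching may differ.
    """
    stem = Path(source).stem  # "filings/13360.fec" -> "13360"
    match = re.search(r"(\d+)$", stem)
    return match.group(1) if match else None


def expected_csv_path(
    source: str,
    form_type: str,
    output_dir: str = "output",
    override_id: Optional[str] = None,
) -> Path:
    """Build {output directory}/{filing id}/{form type}.csv."""
    filing_id = override_id or infer_filing_id(source)
    if filing_id is None:
        raise ValueError("no numeric ID in path; pass override_id")
    return Path(output_dir) / filing_id / f"{form_type}.csv"


print(expected_csv_path("13360.fec", "F3", "fastfec_output"))
```

Passing override_id corresponds to the optional third positional argument, which is useful when the file name carries no numeric ID.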

You can also pipe in the output of another command:

[some command] | fastfec [flags] <id> [output directory=output]

Flags

The CLI supports the following flags:

  • --include-filing-id / -i: add a filing_id column, containing the filing ID, at the beginning of every generated file. This can be useful for bulk uploading CSVs into a database
  • --silent / -s: suppress all non-error output messages
  • --warn / -w: show warning messages (e.g. for rows with unexpected numbers of fields or field types that don't match exactly)
  • --no-stdin / -x: disable receiving piped input from other programs (stdin)
  • --print-url / -p: print URLs from docquery.fec.gov (cannot be specified with other flags)

The short forms of flags can be combined, e.g. -is includes filing IDs and suppresses non-error output.
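When calling the CLI from a script, the ordering and exclusivity rules above can be checked before the process is launched. The following Python sketch is an illustration, not part of FastFEC: build_fastfec_command and flag_letters are hypothetical helpers, and the only rules they encode are the ones stated in this section (flags come first, --print-url stands alone, the override ID is the third positional argument).

```python
from typing import List, Optional, Sequence

# Long flag -> short letter, as listed in the flag list above.
FLAG_LETTERS = {
    "--include-filing-id": "i",
    "--silent": "s",
    "--warn": "w",
    "--no-stdin": "x",
    "--print-url": "p",
}


def flag_letters(flags: Sequence[str]) -> set:
    """Reduce long flags and combined short flags (e.g. "-is") to letters."""
    letters = set()
    for flag in flags:
        if flag in FLAG_LETTERS:
            letters.add(FLAG_LETTERS[flag])
        elif flag.startswith("-") and not flag.startswith("--") and len(flag) > 1:
            unknown = set(flag[1:]) - set(FLAG_LETTERS.values())
            if unknown:
                raise ValueError(f"unknown short flag(s) in {flag!r}")
            letters.update(flag[1:])
        else:
            raise ValueError(f"unknown flag: {flag!r}")
    return letters


def build_fastfec_command(
    source: str,
    output_dir: Optional[str] = None,
    override_id: Optional[str] = None,
    flags: Sequence[str] = (),
) -> List[str]:
    """Assemble an argv list: flags first, then the positional arguments."""
    letters = flag_letters(flags)
    if "p" in letters and len(letters) > 1:
        raise ValueError("--print-url cannot be combined with other flags")
    if override_id is not None and output_dir is None:
        # The override ID is the third positional argument.
        raise ValueError("override_id requires output_dir to be given")
    command = ["fastfec", *flags, source]
    if output_dir is not None:
        command.append(output_dir)
    if override_id is not None:
        command.append(override_id)
    return command
```

The resulting list can be handed to subprocess.run(command, check=True) once fastfec is on the PATH.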

Examples

Parsing a local filing

fastfec -s 13360.fec fastfec_output/

  • This will run FastFEC in silent mode, parse the local filing 13360.fec, and store the output in CSV files at fastfec_output/13360/.
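Once the CSVs are on disk, they can be loaded into a database. The sketch below uses Python's standard csv and sqlite3 modules and makes several assumptions that are not part of FastFEC: one table per CSV file named after the form type, every column stored as TEXT, UTF-8 encoded files, and rows whose field count differs from the header skipped rather than repaired. load_csv_dir and quote_ident are hypothetical helper names.

```python
import csv
import sqlite3
from pathlib import Path


def quote_ident(name: str) -> str:
    """Quote a table or column name for SQLite, escaping embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def load_csv_dir(conn: sqlite3.Connection, filing_dir: str) -> dict:
    """Load every *.csv in filing_dir into its own table; return row counts."""
    counts = {}
    for csv_path in sorted(Path(filing_dir).glob("*.csv")):
        with csv_path.open(newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            header = next(reader, None)
            if not header:
                continue  # empty file: nothing to load
            table = quote_ident(csv_path.stem)
            columns = ", ".join(f"{quote_ident(name)} TEXT" for name in header)
            conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({columns})")
            placeholders = ", ".join("?" for _ in header)
            # Keep only rows whose field count matches the header.
            rows = [row for row in reader if len(row) == len(header)]
            conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
            counts[csv_path.stem] = len(rows)
    conn.commit()
    return counts
```

For the example above this would be called as load_csv_dir(sqlite3.connect("filings.db"), "fastfec_output/13360"). Running fastfec with -i as well adds a filing_id column to each file, which keeps rows distinguishable when several filings are loaded into the same tables.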

Downloading and parsing a filing

First, print the possible URLs for the filing:

fastfec -p 13360

If you have curl installed, you can then run this command:

curl https://docquery.fec.gov/dcdev/posted/13360.fec | fastfec 13360
  • This will download the filing with ID 13360 from the FEC's servers and stream/parse it, storing the output in CSV files at output/13360/

If you don't have curl installed, you can instead download the filing from the URL (https://docquery.fec.gov/dcdev/posted/13360.fec), save the file, and run the following, which is equivalent to the above:

fastfec 13360.fec
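The curl pipeline can also be reproduced from a script without saving the file first. The following Python sketch is one possible approach under stated assumptions: pipe_stream and download_and_parse are hypothetical helpers, the download uses the standard urllib module, and fastfec is assumed to be on the PATH and to accept the piped form shown earlier (fastfec <id> with the filing on stdin).

```python
import shutil
import subprocess
import urllib.request
from typing import BinaryIO, Sequence


def pipe_stream(source: BinaryIO, command: Sequence[str],
                chunk_size: int = 64 * 1024) -> int:
    """Copy source into the command's stdin in chunks; return its exit code."""
    process = subprocess.Popen(command, stdin=subprocess.PIPE)
    try:
        shutil.copyfileobj(source, process.stdin, chunk_size)
    except BrokenPipeError:
        pass  # the child exited early; its exit code reports why
    finally:
        try:
            process.stdin.close()  # signals end of input to the child
        except BrokenPipeError:
            pass
    return process.wait()


def download_and_parse(url: str, filing_id: str) -> int:
    """Stream a filing from url into `fastfec <id>` (not run in this sketch)."""
    with urllib.request.urlopen(url) as response:
        return pipe_stream(response, ["fastfec", filing_id])
```

Because the data is copied chunk by chunk, the filing is never held in memory as a whole, which matches the streaming behavior of the curl pipeline.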

Benchmarks

The following benchmark was run on an M1 MacBook Air:

Filing | Size | Time | Memory usage | CPU usage
1464847.fec | 8.4 GB | 1m 42s | 1.7 MB | 98%
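As a rough worked figure, the benchmark row implies a sustained parsing throughput. The calculation below assumes the reported size means 8.4 × 10^9 bytes (decimal gigabytes); with binary units the result would be slightly higher.

```python
# Figures taken from the benchmark row: 8.4 gigabytes in 1m 42s.
size_bytes = 8.4e9              # assumption: decimal gigabytes
elapsed_seconds = 1 * 60 + 42   # 102 seconds

throughput_mb_per_s = size_bytes / elapsed_seconds / 1e6
print(f"{throughput_mb_per_s:.1f} MB/s")  # prints "82.4 MB/s"
```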

Local development

Build system

Zig is used to build and compile the project. Download and install the latest version of Zig (>=0.11.0) by following the instructions on the website (you can verify it's working by typing zig in the terminal and seeing help commands).

Dependencies

FastFEC has no external C dependencies. PCRE is bundled with the library to ensure compatibility with Zig's build system and cross-platform compilation.

Building

From the root directory of the repo, run:

zig build
  • The above command will output a binary at zig-out/bin/fastfec and a shared library file in the zig-out/lib/ directory.
  • If you only want to build the library, pass -Dlib-only=true as a build option after zig build.
  • You can also compile for other operating systems via the -Dtarget option, e.g. -Dtarget=x86_64-windows (see here for additional targets).

Testing

Currently, there are C tests for specific parsing, buffer, write, and CLI functionality, as well as Python integration tests.

  • Running the C tests: zig build test
  • Running the Python tests:
    cd python
    pip install -r requirements-dev.txt
    tox -e py
    

See the GitHub test workflow for more info

Scripts

python scripts/generate_mappings.py: A Python script to auto-generate C header files containing column header and type mappings



