ETL pipeline for CDFI Fund and DOL public datasets: TLR, CLR, Awards, and LCA (Labor Condition Application) data, with ILR and NMTC support planned

cdfi-data 🏦

ETL pipeline for US Treasury CDFI Fund public datasets, plus US Department of Labor LCA disclosure data.

Download, clean, and analyze Transaction Level Report (TLR), Consumer Loan Report (CLR), and Awards data from the US Department of Treasury's CDFI Fund in a few lines of Python.


Why cdfi-data?

The CDFI Fund releases massive public datasets covering millions of loans and investments in low-income communities. But the raw files are messy, inconsistently formatted, and require significant cleaning before analysis. cdfi-data standardizes the entire pipeline.


Installation

pip install cdfidata

Quickstart

from cdfidata import TLRLoader, CLRLoader, AwardsLoader

# Load TLR transaction data (downloads & caches automatically)
tlr = TLRLoader()
df = tlr.load(year=2022)

# Filter to Illinois
il = tlr.filter_state("IL")

# Filter by loan type and amount
small_biz = tlr.filter_loan_type("Business")
large = tlr.filter_amount(min_amount=500_000)

# Summary stats
tlr.summary()

# Export
tlr.to_csv("cdfi_transactions.csv")
tlr.to_sqlite("cdfi.db", table="tlr")
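Once the data is exported, the SQLite file can be queried with Python's standard sqlite3 module. The sketch below is self-contained: it builds a small in-memory table in place of cdfi.db, and the column names (state, loan_type, amount) and the loans_by_state helper are assumptions made for illustration, not the package's documented schema or API.

```python
import sqlite3

# Stand-in for sqlite3.connect("cdfi.db"); an in-memory table keeps the sketch runnable.
conn = sqlite3.connect(":memory:")

# Hypothetical columns: the real TLR export may name these differently.
conn.execute("CREATE TABLE tlr (state TEXT, loan_type TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO tlr VALUES (?, ?, ?)",
    [
        ("IL", "Business", 750_000.0),
        ("IL", "Business", 120_000.0),
        ("IL", "Home", 210_000.0),
        ("OH", "Business", 600_000.0),
    ],
)


def loans_by_state(connection, min_amount=0.0):
    """Return (state, loan_count, total_amount) rows for loans at or above min_amount."""
    query = """
        SELECT state, COUNT(*) AS loan_count, SUM(amount) AS total_amount
        FROM tlr
        WHERE amount >= ?
        GROUP BY state
        ORDER BY state
    """
    return connection.execute(query, (min_amount,)).fetchall()


large_loans = loans_by_state(conn, min_amount=500_000)
for state, loan_count, total_amount in large_loans:
    print(f"{state}: {loan_count} loans, ${total_amount:,.0f}")
```

To run the same query against a real export, swap ":memory:" for "cdfi.db", drop the CREATE and INSERT statements, and adjust the column names to match the exported table.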

Sample Data (No Download Required)

from cdfidata import TLRLoader, CLRLoader, AwardsLoader

tlr = TLRLoader()
tlr_df = tlr.load_sample(n=1000)

clr = CLRLoader()
clr_df = clr.load_sample(n=1000)

awards = AwardsLoader()
awards_df = awards.load_sample(n=500)

Datasets Supported

| Dataset | Source | Description |
|---|---|---|
| TLR (Transaction Level Report) | CDFI Fund | 1M+ individual CDFI loans, 61 variables |
| CLR (Consumer Loan Report) | CDFI Fund | 3.2M consumer loans aggregated to census tract |
| Awards Database | CDFI Fund | All CDFI Fund program awardees across all years |
| LCA (Labor Condition Application) | DOL OFLC | H-1B visa filing disclosure data, quarterly |

Coming soon: ILR (Institution Level Report), NMTC Allocatee data
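Because CLR data is aggregated to the census tract, joining it to state or county tables usually means splitting the tract identifier. A standard census tract GEOID has 11 digits: a 2-digit state FIPS code, a 3-digit county FIPS code, and a 6-digit tract code. The helper below is a minimal sketch of that split; split_tract_geoid is a hypothetical name, not part of cdfidata, and it assumes the identifier arrives as a string or integer that may have lost its leading zero.

```python
def split_tract_geoid(geoid):
    """Split an 11-digit census tract GEOID into state, county, and tract parts."""
    # Spreadsheet tools often strip leading zeros (e.g. Alabama's state FIPS "01"),
    # so accept ints and short strings and pad back to 11 digits.
    text = str(geoid).strip().zfill(11)
    if len(text) != 11 or not text.isdigit():
        raise ValueError(f"not a valid tract GEOID: {geoid!r}")
    return {
        "state_fips": text[:2],
        "county_fips": text[2:5],
        "tract_code": text[5:],
    }


# 17031 is the state + county prefix for Cook County, Illinois.
parts = split_tract_geoid("17031839100")
print(parts)
```

Keeping the parts as strings preserves leading zeros, which matters when the result is used as a join key.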


Data Sources

CDFI Fund datasets (TLR, CLR, Awards) come from the US Department of Treasury CDFI Fund: https://www.cdfifund.gov/research-data

LCA datasets come from the US Department of Labor, Office of Foreign Labor Certification (OFLC): https://www.dol.gov/agencies/eta/foreign-labor/performance

All data is released under open government data principles.
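The LCA disclosure data is published quarterly. Assuming those releases follow the US federal fiscal calendar (October 1 through September 30), a filing date can be mapped to its fiscal year and quarter as sketched below. The function name federal_fiscal_quarter is hypothetical and not part of cdfidata.

```python
from datetime import date


def federal_fiscal_quarter(day):
    """Return (fiscal_year, quarter) for a date under the US federal fiscal calendar."""
    # The federal fiscal year starts on October 1 and is named for the
    # calendar year in which it ends, so October 2023 falls in FY2024 Q1.
    if day.month >= 10:
        return day.year + 1, 1
    # January-March is Q2, April-June is Q3, July-September is Q4.
    return day.year, (day.month - 1) // 3 + 2


fiscal_year, quarter = federal_fiscal_quarter(date(2023, 10, 15))
print(f"FY{fiscal_year} Q{quarter}")
```

This is useful for grouping filings by release period; check the file naming on the OFLC performance page before relying on it to locate a specific download.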


Running Tests

PYTHONPATH=. pytest tests/ -v

30 tests across all modules.


Who This Is For

  • Impact investors analyzing CDFI loan portfolios
  • Academic researchers studying community development finance
  • Policy analysts evaluating CDFI Fund program outcomes
  • CDFIs benchmarking their own performance against peers
  • Anyone who needs clean, analysis-ready CDFI Fund data

License

MIT © 2026 Jaypatel1511
