A reproducible workflow for the FAA aircraft registry
Project description
Hangarbay
Stop struggling with FAA registration data. Get clean, queryable aircraft registry tables in seconds.
Hangarbay downloads, normalizes and indexes the complete FAA aircraft registry so you can skip straight to analysis. No more hunting for files to download, or joining fragmented tables, or decoding cryptic lookup relationships. One command gets you 307,000+ aircraft registrations, owners and specifications ready to query with SQL or pandas.
This is a tool built for researchers, data journalists and aviation analysts who need fast, reliable and repeatable workflows.
Quick start
Requirements: Python 3.9+
Python API (notebooks & scripts)
# Install
pip install hangarbay
# One-time setup
import hangarbay as hb
hb.load_data() # Downloads & processes FAA data to ~/.hangarbay/data/
# Look up aircraft
df = hb.search("N221LA")
# Find fleets
fleet = hb.fleet("United Airlines")
fleet = hb.fleet("LAPD|Los Angeles Police", state="CA")
# Custom SQL
df = hb.query("""
SELECT maker, COUNT(*) as count
FROM aircraft_decoded
WHERE year_mfr > 2020
GROUP BY maker
ORDER BY count DESC
""")
# Check data age
info = hb.status()
print(f"Data is {info['age_days']} days old")
Command-line interface
# Install
pip install hangarbay
# Run the full pipeline (< 1 minute, depending on your connection)
hangar update # Download, normalize and publish (in one step)
# Or run steps individually
hangar fetch # Download FAA data
hangar normalize # Parse to typed Parquet tables
hangar publish # Build DuckDB + SQLite FTS indexes
# Check data status and age
hangar status
# Start querying!
hangar search N757AF
hangar sql "SELECT COUNT(*) FROM aircraft"
What this does
Hangarbay downloads FAA aircraft registration data, normalizes it into typed tables and provides fast SQL querying:
- 307K+ aircraft registrations with owner and address information
- 93K+ make/model references (Cessna, Piper, Boeing, etc.)
- 4,700+ engine specifications (horsepower, type, manufacturer)
- DuckDB for analytical SQL queries (sub-second on 300K+ rows)
- SQLite FTS5 for full-text search by owner name or address
- Parquet files for efficient columnar storage
- Data lineage tracking with SHA256 checksums and version metadata
- Shared data directory at
~/.hangarbay/data/- works from any project or notebook
Features
N-Number lookup
Look up any aircraft registration with decoded status codes and human-readable output:
hangar search N221LA
Fleet search
Find all aircraft owned by a person or company:
# Search by owner name (case-insensitive)
hangar fleet "United Airlines"
hangar fleet "LAPD"
hangar fleet "NetJets"
# OR logic with pipe separator (great for finding multiple airlines)
hangar fleet "LAPD|Los Angeles Police" # Either pattern matches
hangar fleet "Delta|American|United" # Any of the three
# Filter by state
hangar fleet "Delta" --state GA
# Export to CSV
hangar fleet "Boeing" --export boeing_fleet.csv
# Limit results
hangar fleet "Cessna" --limit 20
Aircraft Registration: N221LA
Owner: LAPD AIR SUPPORT DIVISION
Location: LOS ANGELES, CA
Make & Model: AIRBUS HELICOPTERS INC AS350B3
Year Manufactured: 2014
Serial Number: 7900
Registration Status: Valid
Certificate Type: Standard Airworthiness - Normal
Status Date: May 19, 2023
Expiration: Jun 30, 2028
SQL queries
Execute analytical queries with pretty output:
# Total aircraft
hangar sql "SELECT COUNT(*) FROM aircraft"
# Use decoded views for readable output
hangar sql "SELECT * FROM aircraft_decoded WHERE year_mfr > 2020 LIMIT 10"
hangar sql "SELECT * FROM owners_clean WHERE owner_name LIKE '%boeing%'" -i
# Top manufacturers
hangar sql "SELECT maker, COUNT(*) as count FROM aircraft
JOIN aircraft_make_model USING(mfr_mdl_code)
WHERE maker != '' GROUP BY 1 ORDER BY 2 DESC LIMIT 10"
# Lookup status codes
hangar sql "SELECT * FROM status_codes"
# Top states by registrations
hangar sql "SELECT state_std, COUNT(*) as count FROM owners
WHERE state_std != '' GROUP BY 1 ORDER BY 2 DESC LIMIT 10"
Output Formats:
# Pretty table (default)
hangar sql "SELECT * FROM status_codes LIMIT 3"
# JSON output (for APIs/scripts)
hangar sql "SELECT * FROM status_codes LIMIT 3" --output-format json
# CSV output (for Excel/spreadsheets)
hangar sql "SELECT * FROM status_codes LIMIT 3" --output-format csv
Available tables
Core Tables:
- aircraft - Current registration facts (N-number, serial, make/model, year, status)
- registrations - Canonical registration state (type, status, dates)
- owners - Owner records with raw and standardized addresses
- owners_summary - Aggregated ownership (co-owners, counts)
- aircraft_make_model - Make/model reference (Cessna 172, Boeing 737, etc.)
- engines - Engine specifications (manufacturer, horsepower, type)
Decoded Views (Recommended):
- aircraft_decoded - Aircraft with decoded status codes and joined make/model
- owners_clean - Simplified owner table (standardized fields only, decoded owner type)
Reference Tables:
- status_codes - Registration status code lookups
- airworthiness_classes - Airworthiness certificate class lookups
- owner_types - Owner type code lookups
Data quality
- Type safety: PyArrow schemas enforce correct data types (dates, integers, strings)
- Address standardization: Uppercase, state/ZIP normalization, whitespace cleanup
- Deterministic IDs: xxhash64 ensures consistent owner IDs across pipeline runs
- File verification: SHA256 checksums validate all downloads
- Version tracking: Manifest records schema versions and row counts for each snapshot
Architecture
FAA Data (MASTER, ACFTREF, ENGINE)
↓ fetch (with SHA256 verification)
Raw Text Files (275 MB)
↓ normalize (PyArrow type casting)
Typed Parquet Tables (36 MB)
↓ publish (indexing)
DuckDB (106 MB) + SQLite FTS (55 MB)
↓ query
CLI / Python API
Development
To contribute or modify the code:
# Clone and install for development
git clone https://github.com/stiles/hangarbay.git
cd hangarbay
pip install -e ".[dev]"
# Run tests
make test
# Update data
make update # or: hangar update
# Check data status
make status # or: hangar status
# Full pipeline (individual steps)
make all # fetch, normalize, publish, verify
# Individual steps
make fetch
make normalize
make publish
# Clean intermediate files
make clean
License
MIT
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file hangarbay-0.5.0.tar.gz.
File metadata
- Download URL: hangarbay-0.5.0.tar.gz
- Upload date:
- Size: 30.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.18
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
aa8499ed49d578982fadd5cdbc945a309669978bbdcaac84c2c68c2147723a20
|
|
| MD5 |
542396a59abe4871bd420e0274e4d7c2
|
|
| BLAKE2b-256 |
5f1870681e44e187f876ae863689613ee5774019e21e6929acebcdd1d08c50d5
|
File details
Details for the file hangarbay-0.5.0-py3-none-any.whl.
File metadata
- Download URL: hangarbay-0.5.0-py3-none-any.whl
- Upload date:
- Size: 30.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.18
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
79a8f9ea45ce5ff3ca07b584cf3332dd664f432b1d55ee95cba40a878f75072f
|
|
| MD5 |
8a97c99d0543c6c0aafde040a4026875
|
|
| BLAKE2b-256 |
e8df42ca3e90ada2abbdc39e335b3693c99441a0235cdcbcd8a74ce23acd7e2d
|